1.
J Chem Inf Model ; 2024 Jul 15.
Article in English | MEDLINE | ID: mdl-39007724

ABSTRACT

Geometric deep learning models, which incorporate the relevant molecular symmetries within the neural network architecture, have considerably improved the accuracy and data efficiency of predictions of molecular properties. Building on this success, we introduce 3DReact, a geometric deep learning model to predict reaction properties from three-dimensional structures of reactants and products. We demonstrate that the invariant version of the model is sufficient for existing reaction data sets. We illustrate its competitive performance on the prediction of activation barriers on the GDB7-22-TS, Cyclo-23-TS, and Proparg-21-TS data sets in different atom-mapping regimes. We show that, compared to existing models for reaction property prediction, 3DReact offers a flexible framework that exploits atom-mapping information, if available, as well as geometries of reactants and products (in an invariant or equivariant fashion). Accordingly, it performs systematically well across different data sets, atom-mapping regimes, as well as both interpolation and extrapolation tasks.

2.
Digit Discov ; 3(5): 932-943, 2024 May 15.
Article in English | MEDLINE | ID: mdl-38756222

ABSTRACT

In recent years, there has been a surge of interest in predicting computed activation barriers, to enable the acceleration of the automated exploration of reaction networks. Consequently, various predictive approaches have emerged, ranging from graph-based models to methods based on the three-dimensional structure of reactants and products. In tandem, many representations have been developed to predict experimental targets, which may hold promise for barrier prediction as well. Here, we bring together all of these efforts and benchmark various methods (Morgan fingerprints, the DRFP, the CGR representation-based Chemprop, SLATM_d, B²R²_l, EquiReact and the language model BERT + RXNFP) for the prediction of computed activation barriers on three diverse datasets.

3.
J Chem Theory Comput ; 20(3): 1108-1117, 2024 Feb 13.
Article in English | MEDLINE | ID: mdl-38227222

ABSTRACT

Recently, we introduced a class of molecular representations for kernel-based regression methods, the spectrum of approximated Hamiltonian matrices (SPAHM), that takes advantage of lightweight one-electron Hamiltonians traditionally used as a self-consistent field initial guess. The original SPAHM variant is built from occupied-orbital energies (i.e., eigenvalues) and naturally contains all of the information about nuclear charges, atomic positions, and symmetry requirements. Its advantages were demonstrated on data sets featuring a wide variation of charge and spin, for which traditional structure-based representations commonly fail. SPAHM(a,b), as introduced here, expand the eigenvalue SPAHM into local and transferable representations. They rely upon one-electron density matrices to build fingerprints from atomic and bond density overlap contributions inspired by preceding state-of-the-art representations. The performance and efficiency of SPAHM(a,b) are assessed on predictions for data sets of prototypical organic molecules (QM7) of different charges and azoheteroarene dyes in an excited state. Overall, both SPAHM(a) and SPAHM(b) outperform state-of-the-art representations on difficult prediction tasks such as the atomic properties of charged open-shell species and of π-conjugated systems.

4.
Digit Discov ; 1(3): 286-294, 2022 Jun 13.
Article in English | MEDLINE | ID: mdl-35769206

ABSTRACT

Physics-inspired molecular representations are the cornerstone of similarity-based learning applied to solve chemical problems. Despite their conceptual and mathematical diversity, this class of descriptors shares a common underlying philosophy: they all rely on the molecular information that determines the form of the electronic Schrödinger equation. Existing representations take the most varied forms, from non-linear functions of atom types and positions to atom densities and potential, up to complex quantum chemical objects directly injected into the ML architecture. In this work, we present the spectrum of approximated Hamiltonian matrices (SPAHM) as an alternative pathway to construct quantum machine learning representations through leveraging the foundation of the electronic Schrödinger equation itself: the electronic Hamiltonian. As the Hamiltonian encodes all quantum chemical information at once, SPAHM representations not only distinguish different molecules and conformations, but also different spin, charge, and electronic states. As a proof of concept, we focus here on efficient SPAHM representations built from the eigenvalues of a hierarchy of well-established and readily-evaluated "guess" Hamiltonians. These SPAHM representations are particularly compact and efficient for kernel evaluation and their complexity is independent of the number of different atom types in the database.
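
The eigenvalue-based construction described in this abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the matrix built by `toy_guess_hamiltonian` is a made-up stand-in, not one of the "guess" Hamiltonians used by the authors, and the function names, the `n_occ` and `size` parameters, and the water-like geometry are hypothetical. The sketch only shows the general idea of taking the lowest (occupied) eigenvalues of a one-electron matrix as a fixed-length vector whose size does not depend on the number of atom types.

```python
import numpy as np


def toy_guess_hamiltonian(charges, coords):
    """Build a toy one-electron matrix with one basis function per atom.

    NOT a real quantum-chemistry guess: the diagonal is a hydrogen-like
    energy -Z^2/2 and the off-diagonal coupling decays with distance.
    """
    z = np.asarray(charges, dtype=float)
    r = np.asarray(coords, dtype=float)
    # Pairwise interatomic distances (depend only on relative geometry).
    dist = np.linalg.norm(r[:, None, :] - r[None, :, :], axis=-1)
    h = -np.sqrt(np.outer(z, z)) * np.exp(-dist)
    np.fill_diagonal(h, -0.5 * z**2)
    return h


def eigenvalue_representation(charges, coords, n_occ, size):
    """Return the n_occ lowest eigenvalues, sorted and zero-padded to `size`."""
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    eigvals = np.linalg.eigvalsh(toy_guess_hamiltonian(charges, coords))
    rep = np.zeros(size)
    n = min(n_occ, len(eigvals), size)
    rep[:n] = eigvals[:n]
    return rep


# Hypothetical water-like geometry (arbitrary units), for illustration only.
charges = [8, 1, 1]
coords = np.array([[0.0, 0.0, 0.0], [0.0, 0.76, 0.59], [0.0, -0.76, 0.59]])
rep = eigenvalue_representation(charges, coords, n_occ=2, size=5)
```

Because the eigenvalues of a symmetric matrix are unchanged by reordering the atoms, and the toy matrix depends only on interatomic distances, the resulting vector is invariant to atom permutation and to rigid rotation. Changing `n_occ` (for example, for a different charge state) changes the vector, which mirrors the sensitivity to charge and spin mentioned in the abstract.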

5.
J Chem Phys ; 155(2): 024107, 2021 Jul 14.
Article in English | MEDLINE | ID: mdl-34266253

ABSTRACT

Machine learning (ML) algorithms have undergone an explosive development impacting every aspect of computational chemistry. To obtain reliable predictions, one needs to maintain a proper balance between the black-box nature of ML frameworks and the physics of the target properties. One of the most appealing quantum-chemical properties for regression models is the electron density, and some of us recently proposed a transferable and scalable model based on the decomposition of the density onto an atom-centered basis set. The decomposition, as well as the training of the model, is at its core a minimization of some loss function, which can be arbitrarily chosen and may lead to results of different quality. Well-studied in the context of density fitting (DF), the impact of the metric on the performance of ML models has not been analyzed yet. In this work, we compare predictions obtained using the overlap and the Coulomb-repulsion metrics for both decomposition and training. As expected, the Coulomb metric used as both the DF and ML loss functions leads to the best results for the electrostatic potential and dipole moments. The origin of this difference lies in the fact that the model is not constrained to predict densities that integrate to the exact number of electrons N. Since an a posteriori correction for the number of electrons decreases the errors, we proposed a modification of the model, where N is included directly into the kernel function, which allowed lowering of the errors on the test and out-of-sample sets.
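
The a posteriori correction for the number of electrons mentioned in this abstract can be sketched as follows. This is one common way to impose such a constraint and is an assumption, not necessarily the scheme used in the paper: the density is taken to be expanded as ρ(r) = Σ_i c_i φ_i(r), `basis_integrals` holds the hypothetical integrals q_i = ∫ φ_i(r) dr, and `metric` is the symmetric positive-definite matrix (overlap or Coulomb) in which the change of coefficients is measured. All names and numbers are illustrative.

```python
import numpy as np


def correct_electron_count(coeffs, basis_integrals, metric, n_electrons):
    """Shift expansion coefficients so the density integrates to n_electrons.

    Minimizes (c' - c)^T M (c' - c) subject to q^T c' = N, whose
    closed-form solution is c' = c + M^{-1} q (N - q^T c) / (q^T M^{-1} q).
    """
    c = np.asarray(coeffs, dtype=float)
    q = np.asarray(basis_integrals, dtype=float)
    # Solve M x = q instead of forming the explicit inverse of the metric.
    m_inv_q = np.linalg.solve(np.asarray(metric, dtype=float), q)
    deficit = n_electrons - q @ c
    return c + m_inv_q * deficit / (q @ m_inv_q)


# Made-up two-function example: the raw coefficients integrate to 9.75 electrons.
metric = np.array([[1.0, 0.3], [0.3, 1.0]])
q = np.array([1.0, 0.5])
c_raw = np.array([9.0, 1.5])
c_fixed = correct_electron_count(c_raw, q, metric, n_electrons=10.0)
```

The choice of `metric` decides how the missing charge is distributed over the basis functions, which is one place where the overlap and Coulomb metrics compared in the abstract would give different corrected densities.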

6.
J Phys Chem Lett ; 12(25): 5957-5962, 2021 Jul 01.
Article in English | MEDLINE | ID: mdl-34157226

ABSTRACT

The ab initio determination of electronic excited state (ES) properties is the cornerstone of theoretical photochemistry. Yet, traditional ES methods become impractical when applied to fairly large molecules, or when used on thousands of systems. Machine learning (ML) techniques have demonstrated their accuracy at retrieving ES properties of large molecular databases at a reduced computational cost. For these applications, nonlinear algorithms tend to be specialized in targeting individual properties. Learning fundamental quantum objects potentially represents a more efficient, yet complex, alternative as a variety of molecular properties could be extracted through postprocessing. Herein, we report a general framework able to learn three fundamental objects: the hole and particle densities, as well as the transition density. We demonstrate the advantages of targeting those outputs and apply our predictions to obtain properties, including the state character and the exciton topological descriptors, for the two bands (nπ* and ππ*) of 3427 azoheteroarene photoswitches.


Subject(s)
Azo Compounds/chemistry; Coloring Agents/chemistry; Machine Learning; Quantum Theory; Models, Molecular; Molecular Conformation

7.
J Chem Phys ; 153(20): 204111, 2020 Nov 28.
Article in English | MEDLINE | ID: mdl-33261488

ABSTRACT

The on-top pair density Π(r) is a local quantum-chemical property that reflects the probability that two electrons of any spin occupy the same position in space. Being the simplest quantity related to the two-particle density matrix, the on-top pair density is a powerful indicator of electron correlation effects, and as such, it has been extensively used to combine density functional theory and multireference wavefunction theory. The widespread application of Π(r) is currently hindered by the need for post-Hartree-Fock or multireference computations for its accurate evaluation. In this work, we propose the construction of a machine learning model capable of predicting the complete active space self-consistent field (CASSCF)-quality on-top pair density of a molecule only from its structure and composition. Our model, trained on the GDB11-AD-3165 database, is able to predict with minimal error the on-top pair density of organic molecules, completely bypassing the need for ab initio computations. The accuracy of the regression is demonstrated using the on-top ratio as a visual metric of electron correlation effects and bond-breaking in real space. In addition, we report the construction of a specialized basis set, built to fit the on-top pair density in a single atom-centered expansion. This basis, the cornerstone of the regression, could potentially also be used in the spirit of the resolution-of-the-identity approximation for the electron density.

8.
Chimia (Aarau) ; 74(4): 232-236, 2020 Apr 29.
Article in English | MEDLINE | ID: mdl-32331538

ABSTRACT

Machine-learning in quantum chemistry is currently booming, with reported applications spanning all molecular properties from simple atomization energies to complex mathematical objects such as the many-body wavefunction. Due to its central role in density functional theory, the electron density is a particularly compelling target for non-linear regression. Nevertheless, the scalability and the transferability of the existing machine-learning models of ρ(r) are limited by its complex rotational symmetries. Recently, in collaboration with Ceriotti and coworkers, we combined an efficient electron density decomposition scheme with a local regression framework based on symmetry-adapted Gaussian process regression able to accurately describe the covariance of the electron density spherical tensor components. The learning exercise is performed on local environments, allowing high transferability and linear-scaling of the prediction with respect to the number of atoms. Here, we review the main characteristics of the model and show its predictive power in a series of applications. The scalability and transferability of the trained model are demonstrated through the prediction of the electron density of Ubiquitin.
