Results 1 - 20 of 28
1.
Nat Chem ; 16(5): 727-734, 2024 May.
Article in English | MEDLINE | ID: mdl-38454071

ABSTRACT

Atomistic simulation has a broad range of applications from drug design to materials discovery. Machine learning interatomic potentials (MLIPs) have become an efficient alternative to computationally expensive ab initio simulations. For this reason, chemistry and materials science would greatly benefit from a general reactive MLIP, that is, an MLIP that is applicable to a broad range of reactive chemistry without the need for refitting. Here we develop a general reactive MLIP (ANI-1xnr) through automated sampling of condensed-phase reactions. ANI-1xnr is then applied to study five distinct systems: carbon solid-phase nucleation, graphene ring formation from acetylene, biofuel additives, combustion of methane and the spontaneous formation of glycine from early earth small molecules. In all studies, ANI-1xnr closely matches experiment (when available) and/or previous studies using traditional model chemistry methods. As such, ANI-1xnr proves to be a highly general reactive MLIP for C, H, N and O elements in the condensed phase, enabling high-throughput in silico reactive chemistry experimentation.

2.
J Chem Theory Comput ; 20(3): 1274-1281, 2024 Feb 13.
Article in English | MEDLINE | ID: mdl-38307009

ABSTRACT

Methodologies for training machine learning potentials (MLPs) with quantum-mechanical simulation data have recently seen tremendous progress. Experimental data have a very different character than simulated data, and most MLP training procedures cannot be easily adapted to incorporate both types of data into the training process. We investigate a training procedure based on iterative Boltzmann inversion that produces a pair potential correction to an existing MLP using equilibrium radial distribution function data. By applying these corrections to an MLP for pure aluminum based on density functional theory, we observe that the resulting model largely addresses previous overstructuring in the melt phase. Interestingly, the corrected MLP also exhibits improved performance in predicting experimental diffusion constants, which are not included in the training procedure. The presented method does not require autodifferentiating through a molecular dynamics solver and does not make assumptions about the MLP architecture. Our results suggest a practical framework for incorporating experimental data into machine learning models to improve the accuracy of molecular dynamics simulations.
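The correction step described in this abstract can be illustrated with a minimal sketch. It assumes the textbook iterative Boltzmann inversion update, ΔU(r) = k_B T ln[g_sim(r) / g_target(r)], applied bin by bin to a tabulated pair correction. The function name, the damping factor, and the toy RDF values are hypothetical and are not taken from the paper.

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant in eV/K


def ibi_update(correction, g_sim, g_target, temperature, damping=1.0, eps=1e-8):
    """Return one iterative-Boltzmann-inversion update of a tabulated pair correction.

    correction, g_sim and g_target are lists on the same radial grid.
    Bins where either RDF is ~0 (inside the repulsive core) are left unchanged.
    """
    kT = K_B * temperature
    updated = []
    for u, gs, gt in zip(correction, g_sim, g_target):
        if gs > eps and gt > eps:
            # Overstructured bins (g_sim > g_target) receive a positive, repulsive shift.
            u = u + damping * kT * math.log(gs / gt)
        updated.append(u)
    return updated


# Toy data (illustrative only): the simulated melt has too sharp a first peak.
g_target = [0.0, 0.5, 2.5, 1.0]
g_sim = [0.0, 0.5, 3.0, 1.0]
correction = ibi_update([0.0] * 4, g_sim, g_target, temperature=1000.0)
```

In practice the update would be repeated, rerunning the simulation with the corrected potential each time, until the simulated and experimental RDFs agree within tolerance. A damping factor below 1 is a common way to stabilize the iteration.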

3.
J Chem Phys ; 159(11)2023 Sep 21.
Article in English | MEDLINE | ID: mdl-37712780

ABSTRACT

Catalyzed by enormous success in the industrial sector, many research programs have been exploring data-driven, machine learning approaches. Performance can be poor when the model is extrapolated to new regions of chemical space, e.g., new bonding types or new many-body interactions. Another important limitation is the spatial locality assumption in model architecture, and this limitation cannot be overcome with larger or more diverse datasets. The outlined challenges are primarily associated with the lack of electronic structure information in surrogate models such as interatomic potentials. Given the fast development of machine learning and computational chemistry methods, we expect some limitations of surrogate models to be addressed in the near future; nevertheless, the spatial locality assumption will likely remain a limiting factor for their transferability. Here, we suggest focusing on an equally important effort: the design of physics-informed models that leverage domain knowledge and employ machine learning only as a corrective tool. In the context of materials science, we will focus on semi-empirical quantum mechanics, using machine learning to predict corrections to the reduced-order Hamiltonian model parameters. The resulting models are broadly applicable, retain the speed of semiempirical chemistry, and frequently achieve accuracy on par with much more expensive ab initio calculations. These early results indicate that future work, in which machine learning and quantum chemistry methods are developed jointly, may provide the best of all worlds for chemistry applications that demand both high accuracy and high numerical efficiency.

4.
Phys Chem Chem Phys ; 25(32): 21173-21182, 2023 Aug 16.
Article in English | MEDLINE | ID: mdl-37490276

ABSTRACT

The global energy optimization problem is an acute and important problem in chemistry. It is crucial to know the geometry of the lowest energy isomer (global minimum, GM) of a given compound for the evaluation of its chemical and physical properties. This problem is especially relevant for atomic clusters. Due to the exponential growth of the number of local minimum geometries with the increase of the number of atoms in the cluster, it is important to find a computationally efficient and reliable method to navigate the energy landscape and locate the true global minimum structure. Newly developed neural network (NN) atomistic potentials offer a numerically efficient and relatively accurate approach for molecular structure optimization. An important question that needs to be answered is "Can NN potentials, trained on a given set, represent the potential energy surface (PES) of a neighboring domain?". In this work, we tested the applicability of ANI-1ccx and ANI-nr NN atomistic potentials for the global minimum optimization of carbon clusters Cn (n = 3-10). We showed that with the introduction of the cluster connectivity restriction and subsequent DFT or ab initio calculations, ANI-1ccx and ANI-nr can be considered as robust PES pre-samplers that can capture the GM structure even for large clusters such as C20.

5.
J Chem Theory Comput ; 19(11): 3209-3222, 2023 Jun 13.
Article in English | MEDLINE | ID: mdl-37163680

ABSTRACT

Extended Lagrangian Born-Oppenheimer molecular dynamics (XL-BOMD) in its most recent shadow potential energy version has been implemented in the semiempirical PyTorch-based software PySeQM. The implementation includes finite electronic temperatures, canonical density matrix perturbation theory, and an adaptive Krylov subspace approximation for the integration of the electronic equations of motion within the XL-BOMD approach (KSA-XL-BOMD). The PyTorch implementation leverages the use of GPU and machine learning hardware accelerators for the simulations. The new XL-BOMD formulation allows studying more challenging chemical systems with charge instabilities and low electronic energy gaps. The current public release of PySeQM continues our development of modular architecture for large-scale simulations employing semi-empirical quantum-mechanical treatment. In a molecular dynamics simulation of 840 carbon atoms, one integration time step executes in 4 s on a single Nvidia RTX A6000 GPU.

6.
J Chem Phys ; 158(18)2023 May 14.
Article in English | MEDLINE | ID: mdl-37158328

ABSTRACT

Atomistic machine learning focuses on the creation of models that obey fundamental symmetries of atomistic configurations, such as permutation, translation, and rotation invariances. In many of these schemes, translation and rotation invariance are achieved by building on scalar invariants, e.g., distances between atom pairs. There is growing interest in molecular representations that work internally with higher rank rotational tensors, e.g., vector displacements between atoms, and tensor products thereof. Here, we present a framework for extending the Hierarchically Interacting Particle Neural Network (HIP-NN) with Tensor Sensitivity information (HIP-NN-TS) from each local atomic environment. Crucially, the method employs a weight tying strategy that allows direct incorporation of many-body information while adding very few model parameters. We show that HIP-NN-TS is more accurate than HIP-NN, with negligible increase in parameter count, for several datasets and network sizes. As the dataset becomes more complex, tensor sensitivities provide greater improvements to model accuracy. In particular, HIP-NN-TS achieves a record mean absolute error of 0.927 kcal/mol for conformational energy variation on the challenging COMP6 benchmark, which includes a broad set of organic molecules. We also compare the computational performance of HIP-NN-TS to HIP-NN and other models in the literature.

7.
J Phys Chem A ; 127(17): 3768-3778, 2023 May 04.
Article in English | MEDLINE | ID: mdl-37078657

ABSTRACT

Highly energetic electron-hole pairs (hot carriers) formed from plasmon decay in metallic nanostructures promise sustainable pathways for energy-harvesting devices. However, efficient collection before thermalization remains an obstacle for realization of their full energy generating potential. Addressing this challenge requires detailed understanding of physical processes from plasmon excitation in the metal to their collection in a molecule or a semiconductor, where atomistic theoretical investigation may be particularly beneficial. Unfortunately, first-principles theoretical modeling of these processes is extremely costly, preventing a detailed analysis over a large number of potential nanostructures and limiting the analysis to systems with a few hundred atoms. Recent advances in machine learned interatomic potentials suggest that dynamics can be accelerated with surrogate models which replace the full solution of the Schrödinger Equation. Here, we modify an existing neural network, Hierarchically Interacting Particle Neural Network (HIP-NN), to predict plasmon dynamics in Ag nanoparticles. The model takes as few as three time steps of the reference real-time time-dependent density functional theory (rt-TDDFT) calculated charges as history and predicts trajectories for 5 fs in good agreement with the reference simulation. Further, we show that a multistep training approach in which the loss function includes errors from future time-step predictions can stabilize the model predictions for the entire simulated trajectory (∼25 fs). This extends the model's capability to accurately predict plasmon dynamics in large nanoparticles of up to 561 atoms, not present in the training data set. More importantly, with machine learning models on GPUs, we gain a speed-up factor of ∼10³ as compared with the rt-TDDFT calculations when predicting important physical quantities such as dynamic dipole moments in Ag55 and a factor of ∼10⁴ for extended nanoparticles that are 10 times larger. This underscores the promise of future machine learning accelerated electron/nuclear dynamics simulations for understanding fundamental properties of plasmon-driven hot carrier devices.

8.
Nat Comput Sci ; 3(3): 230-239, 2023 Mar.
Article in English | MEDLINE | ID: mdl-38177878

ABSTRACT

Machine learning (ML) models, if trained to data sets of high-fidelity quantum simulations, produce accurate and efficient interatomic potentials. Active learning (AL) is a powerful tool to iteratively generate diverse data sets. In this approach, the ML model provides an uncertainty estimate along with its prediction for each new atomic configuration. If the uncertainty estimate passes a certain threshold, then the configuration is included in the data set. Here we develop a strategy to more rapidly discover configurations that meaningfully augment the training data set. The approach, uncertainty-driven dynamics for active learning (UDD-AL), modifies the potential energy surface used in molecular dynamics simulations to favor regions of configuration space for which there is large model uncertainty. The performance of UDD-AL is demonstrated for two AL tasks: sampling the conformational space of glycine and sampling the promotion of proton transfer in acetylacetone. The method is shown to efficiently explore the chemically relevant configuration space, which may be inaccessible using regular dynamical sampling at target temperature conditions.


Subject(s)
Fabaceae, Uncertainty, Glycine, Machine Learning, Molecular Dynamics Simulation

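The selection rule at the core of the active learning loop in this abstract can be sketched as follows. The abstract does not say how the uncertainty is computed; this sketch assumes, purely for illustration, that it is the standard deviation of energy predictions across a small ensemble of models. The function names, the threshold, and the numbers are hypothetical.

```python
from statistics import pstdev


def ensemble_uncertainty(predictions):
    """Population standard deviation of the ensemble's energy predictions."""
    return pstdev(predictions)


def select_for_labeling(candidates, threshold):
    """Return the names of configurations whose uncertainty exceeds the threshold.

    candidates maps a configuration name to the list of per-model predictions.
    """
    return [name for name, preds in candidates.items()
            if ensemble_uncertainty(preds) > threshold]


# Toy ensemble predictions (illustrative numbers, arbitrary energy units).
candidates = {
    "near_equilibrium": [-1.00, -1.01, -0.99, -1.00],
    "stretched_bond": [-0.40, -0.10, -0.75, -0.30],
}
selected = select_for_labeling(candidates, threshold=0.05)
```

Configurations returned by `select_for_labeling` would be sent for a reference quantum calculation and added to the training set. The uncertainty-driven dynamics described in the abstract go a step further by biasing the sampling itself toward such configurations, which this sketch does not attempt.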
9.
Proc Natl Acad Sci U S A ; 119(27): e2120333119, 2022 Jul 05.
Article in English | MEDLINE | ID: mdl-35776544

ABSTRACT

Conventional machine-learning (ML) models in computational chemistry learn to directly predict molecular properties using quantum chemistry only for reference data. While these heuristic ML methods show quantum-level accuracy with speeds several orders of magnitude faster than traditional quantum chemistry methods, they suffer from poor extensibility and transferability; i.e., their accuracy degrades on large or new chemical systems. Incorporating quantum chemistry frameworks into the ML models directly solves this problem. Here we take the structure of semiempirical quantum mechanics (SEQM) methods to construct dynamically responsive Hamiltonians. SEQM methods use empirical parameters fitted to experimental properties to construct reduced-order Hamiltonians, facilitating much faster calculations than ab initio methods but with compromised accuracy. By replacing these static parameters with machine-learned dynamic values inferred from the local environment, we greatly improve the accuracy of the SEQM methods. Trained on molecular energies and atomic forces, these dynamically generated Hamiltonian parameters show a strong correlation with atomic hybridization and bonding. Trained with only about 60,000 small organic molecular conformers, the resulting model retains interpretability, extensibility, and transferability when testing on much larger chemical systems and predicting various molecular properties. Overall, this work demonstrates the virtues of incorporating physics-based descriptions with ML to develop models that are simultaneously accurate, transferable, and interpretable.

10.
Nat Rev Chem ; 6(9): 653-672, 2022 Sep.
Article in English | MEDLINE | ID: mdl-37117713

ABSTRACT

Machine learning (ML) is becoming a method of choice for modelling complex chemical processes and materials. ML provides a surrogate model trained on a reference dataset that can be used to establish a relationship between a molecular structure and its chemical properties. This Review highlights developments in the use of ML to evaluate chemical properties such as partial atomic charges, dipole moments, spin and electron densities, and chemical bonding, as well as to obtain a reduced quantum-mechanical description. We overview several modern neural network architectures, their predictive capabilities, generality and transferability, and illustrate their applicability to various chemical properties. We emphasize that learned molecular representations resemble quantum-mechanical analogues, demonstrating the ability of the models to capture the underlying physics. We also discuss how ML models can describe non-local quantum effects. Finally, we conclude by compiling a list of available ML toolboxes, summarizing the unresolved challenges and presenting an outlook for future development. The observed trends demonstrate that this field is evolving towards physics-based models augmented by ML, which is accompanied by the development of new methods and the rapid growth of user-friendly ML frameworks for chemistry.

11.
Nat Commun ; 12(1): 4870, 2021 Aug 11.
Article in English | MEDLINE | ID: mdl-34381051

ABSTRACT

Interatomic potentials derived with Machine Learning algorithms such as Deep-Neural Networks (DNNs) achieve the accuracy of high-fidelity quantum mechanical (QM) methods in areas traditionally dominated by empirical force fields and allow performing massive simulations. Most DNN potentials were parametrized for neutral molecules or closed-shell ions due to architectural limitations. In this work, we propose an improved machine learning framework for simulating open-shell anions and cations. We introduce the AIMNet-NSE (Neural Spin Equilibration) architecture, which can predict molecular energies for an arbitrary combination of molecular charge and spin multiplicity with errors of about 2-3 kcal/mol and spin-charges with errors of ~0.01e for small and medium-sized organic molecules, compared to the reference QM simulations. The AIMNet-NSE model makes it possible to fully bypass QM calculations and derive the ionization potential, electron affinity, and conceptual Density Functional Theory quantities like electronegativity, hardness, and condensed Fukui functions. We show that these descriptors, along with learned atomic representations, could be used to model chemical reactivity through an example of regioselectivity in electrophilic aromatic substitution reactions.

12.
Chem Sci ; 12(30): 10207-10217, 2021 Aug 04.
Article in English | MEDLINE | ID: mdl-34447529

ABSTRACT

Phosphorescence is commonly utilized for applications including light-emitting diodes and photovoltaics. Machine learning (ML) approaches trained on ab initio datasets of singlet-triplet energy gaps may expedite the discovery of phosphorescent compounds with the desired emission energies. However, we show that standard ML approaches for modeling potential energy surfaces inaccurately predict singlet-triplet energy gaps due to the failure to account for spatial localities of spin transitions. To solve this, we introduce localization layers in a neural network model that weight atomic contributions to the energy gap, thereby allowing the model to isolate the most determinative chemical environments. Trained on the singlet-triplet energy gaps of organic molecules, we apply our method to an out-of-sample test set of large phosphorescent compounds and demonstrate the substantial improvement that localization layers have on predicting their phosphorescence energies. Remarkably, the inferred localization weights have a strong relationship with the ab initio spin density of the singlet-triplet transition, and thus infer localities of the molecule that determine the spin transition, despite the fact that no direct electronic information was provided during training. The use of localization layers is expected to improve the modeling of many localized, non-extensive phenomena and could be implemented in any atom-centered neural network model.
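The idea of weighting atomic contributions to the energy gap can be sketched in a few lines. This is an assumption-laden illustration, not the paper's architecture: it supposes that each atom carries a predicted gap contribution and a learned score, and that the scores are normalized with a softmax. The names and numbers are hypothetical.

```python
import math


def localization_weights(scores):
    """Softmax over per-atom scores, so the weights are positive and sum to 1."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def localized_gap(atomic_gaps, scores):
    """Weighted sum of per-atom gap contributions."""
    weights = localization_weights(scores)
    return sum(w * g for w, g in zip(weights, atomic_gaps))


# Toy values: atom 1 carries the transition, so its score dominates.
atomic_gaps = [1.0, 2.5, 1.2]  # hypothetical per-atom predictions, eV
scores = [0.0, 4.0, 0.0]       # hypothetical learned localization scores
weights = localization_weights(scores)
gap = localized_gap(atomic_gaps, scores)
```

Unlike a plain sum over atoms, a normalized weighted sum does not grow with molecule size, which is one way to represent the non-extensive character the abstract attributes to the gap.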

13.
J Chem Phys ; 154(24): 244108, 2021 Jun 28.
Article in English | MEDLINE | ID: mdl-34241371

ABSTRACT

The Hückel Hamiltonian is an incredibly simple tight-binding model known for its ability to capture qualitative physics phenomena arising from electron interactions in molecules and materials. Part of its simplicity arises from using only two types of empirically fit physics-motivated parameters: the first describes the orbital energies on each atom and the second describes electronic interactions and bonding between atoms. By replacing these empirical parameters with machine-learned dynamic values, we vastly increase the accuracy of the extended Hückel model. The dynamic values are generated with a deep neural network, which is trained to reproduce orbital energies and densities derived from density functional theory. The resulting model retains interpretability, while the deep neural network parameterization is smooth and accurate and reproduces insightful features of the original empirical parameterization. Overall, this work shows the promise of utilizing machine learning to formulate simple, accurate, and dynamically parameterized physics models.
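The two parameter types named in this abstract can be made concrete with a minimal sketch of the simple (not extended) Hückel model. It assumes a single on-site energy alpha and a single nearest-neighbor coupling beta, both static, for a four-site chain. The helper names are hypothetical; the closed-form energies for a linear chain are the standard textbook result.

```python
import math


def huckel_matrix(n_atoms, bonds, alpha, beta):
    """Build the Hückel Hamiltonian: alpha on the diagonal, beta for each bonded pair."""
    h = [[0.0] * n_atoms for _ in range(n_atoms)]
    for i in range(n_atoms):
        h[i][i] = alpha
    for i, j in bonds:
        h[i][j] = beta
        h[j][i] = beta
    return h


def chain_levels(n_atoms, alpha, beta):
    """Closed-form orbital energies of a linear chain of n_atoms sites."""
    return [alpha + 2.0 * beta * math.cos(k * math.pi / (n_atoms + 1))
            for k in range(1, n_atoms + 1)]


def residual(h, energy, vector):
    """Largest component of H v - E v; near zero for a true eigenpair."""
    n = len(h)
    return max(abs(sum(h[i][j] * vector[j] for j in range(n)) - energy * vector[i])
               for i in range(n))


# Butadiene-like chain of four sites, in units where alpha = 0 and beta = -1.
n, alpha, beta = 4, 0.0, -1.0
H = huckel_matrix(n, [(0, 1), (1, 2), (2, 3)], alpha, beta)
levels = chain_levels(n, alpha, beta)
# Analytic eigenvector of the lowest level: components sin((i + 1) * pi / (n + 1)).
lowest = [math.sin((i + 1) * math.pi / (n + 1)) for i in range(n)]
```

In the approach the abstract describes, fixed values such as `alpha` and `beta` would instead be produced per atom and per pair by a neural network, while the matrix construction and diagonalization stay the same.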

14.
J Phys Chem Lett ; 12(26): 6227-6243, 2021 Jul 08.
Article in English | MEDLINE | ID: mdl-34196559

ABSTRACT

Machine learning (ML) is quickly becoming a premier tool for modeling chemical processes and materials. ML-based force fields, trained on large data sets of high-quality electronic structure calculations, are particularly attractive due to their unique combination of computational efficiency and physical accuracy. This Perspective summarizes some recent advances in the development of neural network-based interatomic potentials. Designing high-quality training data sets is crucial to overall model accuracy. One strategy is active learning, in which new data are automatically collected for atomic configurations that produce large ML uncertainties. Another strategy is to use the highest levels of quantum theory possible. Transfer learning allows training to a data set of mixed fidelity. A model initially trained to a large data set of density functional theory calculations can be significantly improved by retraining to a relatively small data set of expensive coupled cluster theory calculations. These advances are exemplified by applications to molecules and materials.

15.
Nat Commun ; 12(1): 1257, 2021 Feb 23.
Article in English | MEDLINE | ID: mdl-33623036

ABSTRACT

Machine learning, trained on quantum mechanics (QM) calculations, is a powerful tool for modeling potential energy surfaces. A critical factor is the quality and diversity of the training dataset. Here we present a highly automated approach to dataset construction and demonstrate the method by building a potential for elemental aluminum (ANI-Al). In our active learning scheme, the ML potential under development is used to drive non-equilibrium molecular dynamics simulations with time-varying applied temperatures. Whenever a configuration is reached for which the ML uncertainty is large, new QM data is collected. The ML model is periodically retrained on all available QM data. The final ANI-Al potential makes very accurate predictions of radial distribution function in melt, liquid-solid coexistence curve, and crystal properties such as defect energies and barriers. We perform a 1.3M atom shock simulation and show that ANI-Al force predictions shine in their agreement with new reference DFT calculations.

16.
Nat Nanotechnol ; 16(1): 63-68, 2021 Jan.
Article in English | MEDLINE | ID: mdl-33199882

ABSTRACT

Conical intersections (CoIns) of multidimensional potential energy surfaces are ubiquitous in nature and control pathways and yields of many photo-initiated intramolecular processes. Such topologies can be potentially involved in the energy transport in aggregated molecules or polymers but are yet to be uncovered. Here, using ultrafast two-dimensional electronic spectroscopy (2DES), we reveal the existence of intermolecular CoIns in molecular aggregates relevant for photovoltaics. Ultrafast, sub-10-fs 2DES tracks the coherent motion of a vibrational wave packet on an optically bright state and its abrupt transition into a dark state via a CoIn after only 40 fs. Non-adiabatic dynamics simulations identify an intermolecular CoIn as the source of these unusual dynamics. Our results indicate that intermolecular CoIns may effectively steer energy pathways in functional nanostructures for optoelectronics.

17.
J Chem Theory Comput ; 16(9): 5771-5783, 2020 Sep 08.
Article in English | MEDLINE | ID: mdl-32635739

ABSTRACT

We present a versatile new code released for open community use, the nonadiabatic excited state molecular dynamics (NEXMD) package. This software aims to simulate nonadiabatic excited state molecular dynamics using several semiempirical Hamiltonian models. To model such dynamics of a molecular system, the NEXMD uses the fewest-switches surface hopping algorithm, where the probability of transition from one state to another depends on the strength of the derivative nonadiabatic coupling. In addition, there are a number of algorithmic improvements such as empirical decoherence corrections and tracking trivial crossings of electronic states. While the primary intent behind the NEXMD was to simulate nonadiabatic molecular dynamics, the code can also perform geometry optimizations, adiabatic excited state dynamics, and single-point calculations all in vacuum or in a simulated solvent. In this report, first, we lay out the basic theoretical framework underlying the code. Then we present the code's structure and workflow. To demonstrate the functionality of NEXMD in detail, we analyze the photoexcited dynamics of a polyphenylene ethynylene dendrimer (PPE, C30H18) in vacuum and in a continuum solvent. Furthermore, the PPE molecule example serves to highlight the utility of the getexcited.py helper script to form a streamlined workflow. This script, provided with the package, can both set up NEXMD calculations and analyze the results, including, but not limited to, collecting populations, generating an average optical spectrum, and restarting unfinished calculations.
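The hopping rule mentioned in this abstract can be illustrated with a minimal two-state sketch. It assumes the standard fewest-switches expression in the adiabatic basis, g(k→j) = max(0, -2 Δt Re[c_j* c_k (d_jk · v)] / |c_k|²), and is not drawn from the NEXMD source code. The function names and numbers are hypothetical, and velocity rescaling, frustrated hops, and decoherence corrections are left out.

```python
import random


def hop_probability(coeffs, current, target, nac_dot_velocity, dt):
    """Fewest-switches probability of hopping from state `current` to `target`.

    coeffs: complex electronic amplitudes c_k.
    nac_dot_velocity: real scalar d_{target,current} . v, the derivative
    coupling projected on the nuclear velocity.
    """
    population = abs(coeffs[current]) ** 2
    if population == 0.0:
        return 0.0
    flux = -2.0 * (coeffs[target].conjugate() * coeffs[current]).real * nac_dot_velocity
    return min(1.0, max(0.0, flux * dt / population))


def attempt_hop(coeffs, current, target, nac_dot_velocity, dt, rng=random):
    """Return the active state after one stochastic hop attempt to a single target."""
    g = hop_probability(coeffs, current, target, nac_dot_velocity, dt)
    return target if rng.random() < g else current


# Toy two-state example (atomic units, illustrative numbers only).
c = [complex(0.5 ** 0.5, 0.0), complex(0.5 ** 0.5, 0.0)]
g_up = hop_probability(c, current=0, target=1, nac_dot_velocity=-0.1, dt=0.5)
# The derivative coupling is antisymmetric, so the reverse projection flips sign.
g_down = hop_probability(c, current=1, target=0, nac_dot_velocity=0.1, dt=0.5)
```

Because the probability scales with the coupling projected on the velocity, hops concentrate where the derivative coupling is strong. With more than two states, a production code would compare one random number against the cumulative probabilities over all target states rather than testing each target separately.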

18.
Sci Data ; 7(1): 134, 2020 May 01.
Article in English | MEDLINE | ID: mdl-32358545

ABSTRACT

Maximum diversification of data is a central theme in building generalized and accurate machine learning (ML) models. In chemistry, ML has been used to develop models for predicting molecular properties, for example quantum mechanics (QM) calculated potential energy surfaces and atomic charge models. The ANI-1x and ANI-1ccx ML-based general-purpose potentials for organic molecules were developed through active learning, an automated data diversification process. Here, we describe the ANI-1x and ANI-1ccx data sets. To demonstrate data diversity, we visualize it with a dimensionality reduction scheme, and contrast against existing data sets. The ANI-1x data set contains multiple QM properties from 5 M density functional theory calculations, while the ANI-1ccx data set contains 500 k data points obtained with an accurate CCSD(T)/CBS extrapolation. Approximately 14 million CPU core-hours were expended to generate this data. Multiple QM calculated properties for the chemical elements C, H, N, and O are provided: energies, atomic forces, multipole moments, atomic charges, etc. We provide this data to the community to aid research and development of ML models for chemistry.

19.
Chem Rev ; 120(4): 2215-2287, 2020 Feb 26.
Article in English | MEDLINE | ID: mdl-32040312

ABSTRACT

Optically active molecular materials, such as organic conjugated polymers and biological systems, are characterized by strong coupling between electronic and vibrational degrees of freedom. Typically, simulations must go beyond the Born-Oppenheimer approximation to account for non-adiabatic coupling between excited states. Indeed, non-adiabatic dynamics is commonly associated with exciton dynamics and photophysics involving charge and energy transfer, as well as exciton dissociation and charge recombination. Understanding the photoinduced dynamics in such materials is vital to providing an accurate description of exciton formation, evolution, and decay. This interdisciplinary field has matured significantly over the past decades. Formulation of new theoretical frameworks, development of more efficient and accurate computational algorithms, and evolution of high-performance computer hardware have extended these simulations to very large molecular systems with hundreds of atoms, including numerous studies of organic semiconductors and biomolecules. In this Review, we will describe recent theoretical advances including treatment of electronic decoherence in surface-hopping methods, the role of solvent effects, trivial unavoided crossings, analysis of data based on transition densities, and efficient computational implementations of these numerical methods. We also emphasize newly developed semiclassical approaches, based on the Gaussian approximation, which retain phase and width information to account for significant decoherence and interference effects while maintaining the high efficiency of surface-hopping approaches. The above developments have been employed to successfully describe photophysics in a variety of molecular materials.

20.
J Chem Phys ; 151(8): 084313, 2019 Aug 28.
Article in English | MEDLINE | ID: mdl-31470719

ABSTRACT

Laser-induced fluorescence excitation and dispersed fluorescence spectra of a model flexible bichromophore, 1,1-diphenylethane (DPE), have been recorded under jet-cooled conditions in the gas phase in the region near the first pair of near-degenerate excited states (S1 and S2). The S1 and S2 origin transitions have been identified at 37 397 and 37 510 cm⁻¹, a splitting of 113 cm⁻¹. This splitting is four times smaller than the excitonic splitting calculated by ab initio methods at the EOM-CCSD/cc-pVDZ level of theory (410 cm⁻¹), which necessarily relies on the Born-Oppenheimer approximation. Dispersed fluorescence spectra provide a state-to-state picture of the vibronic coupling. These results are compared with the results of a multimode vibronic coupling model capable of treating chromophores in asymmetric environments. This model was used to predict the splitting between S1 and S2 origins close to the experiment, reduced from its pure excitonic value by Franck-Condon quenching. Quantitative accuracy is achieved by the model, lending insight into the state-to-state mixing that occurs between individual S1 and S2 vibronic levels. The S2 origin is determined to be mixed with S1(v) levels by two mechanisms common to internal conversion in almost any setting; namely, (i) mixing involving near-degenerate levels with large vibrational quantum number changes that are not governed by Δv = 1 Herzberg-Teller (HT) selection rules, and (ii) mixing with levels with larger energy gaps that do follow these selection rules. In DPE, the asymmetric ring flapping vibrational mode R̄ dominates the HT coupling.
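The numbers quoted in this abstract can be checked with a short calculation. The origin positions and the computed excitonic splitting come from the abstract. The last step is an assumption added for illustration: in the weak-coupling limit the quenching factor is often written as exp(-S), with S an effective total Huang-Rhys factor, and the abstract does not state that this form was used.

```python
import math

s1_origin = 37397.0          # cm^-1, from the abstract
s2_origin = 37510.0          # cm^-1, from the abstract
excitonic_splitting = 410.0  # cm^-1, EOM-CCSD/cc-pVDZ value from the abstract

observed_splitting = s2_origin - s1_origin
quenching_factor = observed_splitting / excitonic_splitting
ratio = excitonic_splitting / observed_splitting

# Assumption (not from the abstract): weak-coupling limit, quenching = exp(-S_total).
effective_huang_rhys = -math.log(quenching_factor)
```

The observed splitting is 113 cm⁻¹, so the computed excitonic value is larger by a factor of about 3.6, which the abstract rounds to four. Under the stated assumption, this corresponds to an effective Huang-Rhys factor of roughly 1.3.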
