Results 1 - 6 of 6
1.
Sci Adv ; 10(9): eadi6462, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38427733

ABSTRACT

The structure and dynamics of a molecular system are governed by its potential energy surface (PES), which represents the total energy as a function of the nuclear coordinates. Obtaining accurate potential energy surfaces is limited by the exponential scaling of Hilbert space, restricting quantitative predictions of experimental observables from first principles to small molecules with just a few electrons. Here, we present an explicitly physics-informed approach for improving and assessing the quality of families of PESs by modifying them through linear coordinate transformations based on experimental data. We demonstrate this "morphing" of the PES for the He-H2+ complex using recent comprehensive Feshbach resonance (FR) measurements for reference PESs at three different levels of quantum chemistry. In all cases, the positions and intensities of peaks in the energy distributions are improved. We find these observables to be mainly sensitive to the long-range part of the PES.
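
As a rough illustration of what a linear coordinate transformation of a PES can look like, the following minimal Python sketch scales the coordinate of a toy one-dimensional Morse potential so that its minimum moves to a target distance. The Morse form, all parameter values, the target distance `r_target`, and the helper names `morse`, `morph`, and `argmin_on_grid` are assumptions made for this example; it does not reproduce the multidimensional He-H2+ surfaces or the fitting to Feshbach resonance data described in the abstract.

```python
import math

def morse(r, d_e=1.0, a=1.5, r_e=1.0):
    """Toy 1D Morse potential standing in for a reference PES (arbitrary units)."""
    return d_e * (1.0 - math.exp(-a * (r - r_e))) ** 2 - d_e

def morph(pes, alpha, beta=0.0):
    """Return a new PES evaluated at the linearly transformed coordinate alpha*r + beta."""
    return lambda r: pes(alpha * r + beta)

def argmin_on_grid(f, lo=0.5, hi=3.0, n=2501):
    """Locate the minimum of f on a uniform grid (coarse, but dependency-free)."""
    grid = [lo + i * (hi - lo) / (n - 1) for i in range(n)]
    return min(grid, key=f)

# Hypothetical target equilibrium distance (not an experimental value).
r_target = 1.1

# For V(alpha * r), the minimum moves from r_e to r_e / alpha (here r_e = 1.0).
alpha = 1.0 / r_target
morphed = morph(morse, alpha)

r_min = argmin_on_grid(morphed)
print(f"minimum of morphed PES near r = {r_min:.3f}")
```

In this toy case the scaling factor follows directly from the target distance; with real data, the transformation parameters would instead be optimized against the measured observables.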

2.
J Chem Phys ; 159(2), 2023 Jul 14.
Article in English | MEDLINE | ID: mdl-37435940

ABSTRACT

Full-dimensional potential energy surfaces (PESs) based on machine learning (ML) techniques provide a means for accurate and efficient molecular simulations in the gas and condensed phases for various experimental observables ranging from spectroscopy to reaction dynamics. Here, the MLpot extension with PhysNet as the ML-based model for a PES is introduced into the newly developed pyCHARMM application programming interface. To illustrate the conception, validation, refining, and use of a typical workflow, para-chloro-phenol is considered as an example. The main focus is on how to approach a concrete problem from a practical perspective, and applications to spectroscopic observables and the free energy for the -OH torsion in solution are discussed in detail. For the computed IR spectra in the fingerprint region, the computations for para-chloro-phenol in water are in good qualitative agreement with experiments carried out in CCl4. Moreover, relative intensities are largely consistent with experimental findings. The barrier for rotation of the -OH group increases from ∼3.5 kcal/mol in the gas phase to ∼4.1 kcal/mol from simulations in water due to favorable H-bonding interactions of the -OH group with surrounding water molecules.

3.
Digit Discov ; 2(1): 28-58, 2023 Feb 13.
Article in English | MEDLINE | ID: mdl-36798879

ABSTRACT

Artificial Neural Networks (NN) are already heavily involved in methods and applications for frequent tasks in the field of computational chemistry such as representation of potential energy surfaces (PES) and spectroscopic predictions. This perspective provides an overview of the foundations of neural network-based full-dimensional potential energy surfaces, their architectures, underlying concepts, their representation and applications to chemical systems. Methods for data generation and training procedures for PES construction are discussed and means for error assessment and refinement through transfer learning are presented. A selection of recent results illustrates the latest improvements regarding accuracy of PES representations and system size limitations in dynamics simulations, but also NN applications enabling direct prediction of physical results without dynamics simulations. The aim is to provide an overview of the current state-of-the-art NN approaches in computational chemistry and also to point out the current challenges in enhancing reliability and applicability of NN methods on a larger scale.

4.
Chem Sci ; 13(44): 13068-13084, 2022 Nov 16.
Article in English | MEDLINE | ID: mdl-36425481

ABSTRACT

The value of uncertainty quantification on predictions for trained neural networks (NNs) on quantum chemical reference data is quantitatively explored. For this, the architecture of the PhysNet NN was suitably modified and the resulting model (PhysNet-DER) was evaluated with different metrics to quantify its calibration, the quality of its predictions, and whether prediction error and the predicted uncertainty can be correlated. Training on the QM9 database and evaluating data in the test set within and outside the distribution indicate that error and uncertainty are not linearly related. However, the observed variance provides insight into the quality of the data used for training. Additionally, the influence of the chemical space covered by the training data set was studied by using a biased database. The results clarify that noise and redundancy complicate property prediction for molecules even in cases for which changes - such as double bond migration in two otherwise identical molecules - are small. The model was also applied to a real database of tautomerization reactions. Analysis of the distance between members in feature space in combination with other parameters shows that redundant information in the training dataset can lead to large variances and small errors whereas the presence of similar but unspecific information returns large errors but small variances. This was, e.g., observed for nitro-containing aliphatic chains for which predictions were difficult although the training set contained several examples of nitro groups bound to aromatic molecules. The finding underlines the importance of the composition of the training data and provides chemical insight into how this affects the prediction capabilities of an ML model. Finally, the presented method can be used for information-based improvement of chemical databases for target applications through active learning optimization.
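
One simple way to check whether prediction error and predicted uncertainty are linearly related is a Pearson correlation coefficient between the two. The sketch below is only an illustration of that check: the numbers are invented, the names `abs_error` and `pred_sigma` are hypothetical, and it is not the set of metrics used to evaluate PhysNet-DER.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

# Invented values: absolute prediction errors and predicted standard
# deviations for five hypothetical test molecules (same energy unit).
abs_error = [0.1, 0.4, 0.2, 0.9, 0.3]
pred_sigma = [0.2, 0.3, 0.25, 0.5, 0.6]

r = pearson(abs_error, pred_sigma)
print(f"Pearson r between |error| and predicted sigma: {r:.2f}")
```

A coefficient close to 1 would indicate a linear relation; values well below 1 are consistent with the kind of weak linear relationship the abstract reports.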

5.
J Am Chem Soc ; 144(31): 14170-14180, 2022 Aug 10.
Article in English | MEDLINE | ID: mdl-35895323

ABSTRACT

The spectroscopy and structural dynamics of a deep eutectic mixture (KSCN/acetamide) with varying water content are investigated from 2D IR (with the C-N stretch vibration of the SCN- anions as the reporter) and THz spectroscopy. Molecular dynamics simulations correctly describe the nontrivial dependence of both spectroscopic signatures on water content. For the 2D IR spectra, the MD simulations relate the steep increase in the cross-relaxation rate at high water content to the parallel alignment of packed SCN- anions. Conversely, the nonlinear increase of the THz absorption with increasing water content is mainly attributed to the formation of larger water clusters. The results demonstrate that a combination of structure-sensitive spectroscopies and molecular dynamics simulations provides molecular-level insights into the emergence of heterogeneity of such mixtures by modulating their composition.


Subject(s)
Deep Eutectic Solvents; Water; Molecular Dynamics Simulation; Solvents/chemistry; Spectrophotometry, Infrared; Vibration; Water/chemistry

6.
J Chem Theory Comput ; 17(8): 4769-4785, 2021 Aug 10.
Article in English | MEDLINE | ID: mdl-34288675

ABSTRACT

An essential aspect for adequate predictions of chemical properties by machine learning models is the database used for training them. However, studies that analyze how the content and structure of the databases used for training impact the prediction quality are scarce. In this work, we analyze and quantify the relationships learned by a machine learning model (Neural Network) trained on five different reference databases (QM9, PC9, ANI-1E, ANI-1, and ANI-1x) to predict tautomerization energies from molecules in Tautobase. For this, the influence of characteristics such as the number of heavy atoms in a molecule, number of atoms of a given element, bond composition, or initial geometry on the quality of the predictions is considered. The results indicate that training on a chemically diverse database is crucial for obtaining good results and also that conformational sampling can partly compensate for limited coverage of chemical diversity. The overall best-performing reference database (ANI-1x) performs on average 1 kcal/mol better than PC9, which, however, contains about 2 orders of magnitude fewer reference structures. On the other hand, PC9 is chemically more diverse by a factor of ∼5 as quantified by the number of atom-in-molecule-based fragments (amons) it contains compared with the ANI family of databases. A quantitative measure for deficiencies is the Kullback-Leibler divergence between reference and target distributions. It is explicitly demonstrated that when certain types of bonds need to be covered in the target database (Tautobase) but are undersampled in the reference databases, the resulting predictions are poor. Examples of this include the poor performance of all databases analyzed in predicting C(sp2)-C(sp2) double bonds close to heteroatoms and azoles containing N-N and N-O bonds. Analysis of the results with a Tree MAP algorithm provides deeper understanding of specific deficiencies in predicting tautomerization energies by the reference datasets due to inadequate coverage of chemical space. This information can be used to either improve existing databases or generate new databases of sufficient diversity for a range of machine learning (ML) applications in chemistry.
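
The Kullback-Leibler divergence mentioned above can be sketched for discrete distributions such as bond-type histograms. The Python example below is a minimal illustration under stated assumptions: the bond counts are invented, the function name `kl_divergence` and the small floor `eps` for categories missing from the reference are choices made here, and the direction (target as P, reference as Q) is an assumption rather than something the abstract specifies.

```python
import math

def kl_divergence(p_counts, q_counts, eps=1e-12):
    """D_KL(P || Q) in nats for two histograms given as {category: count} dicts."""
    p_total = sum(p_counts.values())
    q_total = sum(q_counts.values())
    divergence = 0.0
    for key in set(p_counts) | set(q_counts):
        p = p_counts.get(key, 0) / p_total
        # Floor q so categories absent from Q give a large but finite penalty
        # (an assumption of this sketch; Q is then not exactly normalized).
        q = max(q_counts.get(key, 0) / q_total, eps)
        if p > 0.0:
            divergence += p * math.log(p / q)
    return divergence

# Invented bond-type counts for a target set and a reference (training) set.
target = {"C=C": 30, "N-N": 20, "N-O": 10, "C-C": 40}
reference = {"C=C": 30, "N-N": 1, "C-C": 69}

d_target_ref = kl_divergence(target, reference)
print(f"D_KL(target || reference) = {d_target_ref:.3f} nats")
```

With the target as P, bond types that are common in the target but rare or absent in the reference (here N-N and N-O) dominate the divergence, which matches the undersampling problem the abstract describes.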
