Results 1 - 20 of 237
1.
J Comput Chem ; 2024 May 02.
Article in English | MEDLINE | ID: mdl-38695412

ABSTRACT

The impact of targeted replacement of individual terms in empirical force fields is quantitatively assessed for pure water, dichloromethane (CH2Cl2), and solvated K+ and Cl- ions. For the electrostatic interactions, point charges (PCs) and machine learning (ML)-based minimally distributed charges (MDCM) fitted to the molecular electrostatic potential are evaluated together with electrostatics based on the Coulomb integral. The impact of explicitly including second-order terms is investigated by adding a fragment molecular orbital (FMO)-derived polarization energy to an existing force field, in this case CHARMM. It is demonstrated that anisotropic electrostatics reduce the RMSE for water (by 1.4 kcal/mol), CH2Cl2 (by 0.8 kcal/mol) and for solvated Cl- clusters (by 0.4 kcal/mol). An additional polarization term can be neglected for CH2Cl2 but further improves the models for pure water (by ∼1.0 kcal/mol) and hydrated Cl- (by 0.4 kcal/mol), and is key for solvated K+, reducing the RMSE by 2.3 kcal/mol. A 12-6 Lennard-Jones functional form performs satisfactorily with PC and MDCM electrostatics, but is not appropriate for descriptions that account for the electrostatic penetration energy. The importance of many-body contributions is assessed by comparing a strictly 2-body approach with self-consistent reference data. Two-body interactions suffice for CH2Cl2 whereas water and solvated K+ and Cl- ions require explicit many-body corrections. Finally, a many-body-corrected dimer potential energy surface exceeds the accuracy attained using a conventional empirical force field, potentially reaching that of an FMO calculation.
The present work systematically quantifies which terms improve the performance of an existing force field and what reference data to use for parametrizing these terms in a tractable fashion for ML fitting of pure and heterogeneous systems.
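
As a minimal sketch of the simplest model class this abstract refers to, the Python snippet below combines a 12-6 Lennard-Jones term with point-charge Coulomb electrostatics for one atom pair and computes an RMSE against reference energies. The function names, the parameter convention (well depth `epsilon` located at separation `r_min`) and the conversion factor are illustrative assumptions; they are not taken from the paper or from the CHARMM source code.

```python
import math

# Approximate Coulomb conversion factor in kcal*Angstrom/(mol*e^2).
# Assumption: individual simulation programs use slightly different values.
COULOMB_KCAL = 332.0637


def lj_12_6(r, epsilon, r_min):
    """12-6 Lennard-Jones energy with well depth epsilon at separation r_min."""
    x = (r_min / r) ** 6
    return epsilon * (x * x - 2.0 * x)


def coulomb(r, qi, qj):
    """Point-charge Coulomb energy for charges qi, qj (in e) at distance r (in Angstrom)."""
    return COULOMB_KCAL * qi * qj / r


def pair_energy(r, qi, qj, epsilon, r_min):
    """Total two-body energy: isotropic electrostatics plus 12-6 van der Waals term."""
    return coulomb(r, qi, qj) + lj_12_6(r, epsilon, r_min)


def rmse(model, reference):
    """Root mean squared error between model and reference energies."""
    squared = [(m - ref) ** 2 for m, ref in zip(model, reference)]
    return math.sqrt(sum(squared) / len(squared))
```

In this picture, replacing `coulomb` by an anisotropic charge model or adding a polarization term changes only one contribution at a time, and `rmse` against reference energies then measures how much that single replacement helps.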

2.
Phys Chem Chem Phys ; 26(16): 12698-12708, 2024 Apr 24.
Article in English | MEDLINE | ID: mdl-38602285

ABSTRACT

The reaction dynamics of H2COO to form HCOOH and dioxirane as first steps for OH-elimination is quantitatively investigated. Using a machine-learned potential energy surface (PES) at the CASPT2/aug-cc-pVTZ level of theory, vibrational excitation along the CH-normal mode νCH with energies up to 40.0 kcal mol⁻¹ (∼5νCH) leads almost exclusively to HCOOH, which further decomposes into OH + HCO. Although the barrier to form dioxirane is only 21.4 kcal mol⁻¹, the reaction probability to form dioxirane is two orders of magnitude lower if the CH-stretch mode is excited. Following the dioxirane-formation pathway is facile, however, if the COO-bend vibration is excited together with energies equivalent to ∼2νCH or ∼3νCOO. For OH-formation in the atmosphere the pathway through HCOOH is probably most relevant because the alternative pathways (through dioxirane or formic acid) involve several intermediates that can de-excite through collisions, relax via internal vibrational relaxation (IVR), or pass through loose and vulnerable transition states (formic acid). This work demonstrates how, by selectively exciting particular vibrational modes, it is possible to dial into desired reaction channels with a high degree of specificity.

3.
Sci Adv ; 10(9): eadi6462, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38427733

ABSTRACT

The structure and dynamics of a molecular system are governed by its potential energy surface (PES), representing the total energy as a function of the nuclear coordinates. Obtaining accurate potential energy surfaces is limited by the exponential scaling of Hilbert space, restricting quantitative predictions of experimental observables from first principles to small molecules with just a few electrons. Here, we present an explicitly physics-informed approach for improving and assessing the quality of families of PESs by modifying them through linear coordinate transformations based on experimental data. We demonstrate this "morphing" of the PES for the He-H2+ complex using recent comprehensive Feshbach resonance (FR) measurements for reference PESs at three different levels of quantum chemistry. In all cases, the positions and intensities of peaks in the energy distributions are improved. We find these observables to be mainly sensitive to the long-range part of the PES.
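
The idea of modifying a PES through a linear coordinate transformation can be sketched in one dimension as follows. The form V'(r) = epsilon * V(alpha * r), the Morse reference curve and all parameter values are assumptions chosen for illustration only; the abstract does not give the actual transformation or the fitting procedure against the Feshbach resonance data.

```python
import math


def morse(r, d_e=1.0, a=1.0, r_e=2.0):
    """Hypothetical 1D reference PES: Morse curve with minimum -d_e at r_e."""
    return d_e * ((1.0 - math.exp(-a * (r - r_e))) ** 2 - 1.0)


def morph(pes, alpha, epsilon):
    """Return the morphed PES V'(r) = epsilon * V(alpha * r).

    alpha rescales the coordinate (moves the minimum to r_e / alpha),
    epsilon rescales the energy (changes the well depth).
    """
    def morphed(r):
        return epsilon * pes(alpha * r)
    return morphed


# Illustrative morphing parameters; in practice they would be adjusted
# until computed observables match the experimental reference data.
morphed = morph(morse, alpha=0.8, epsilon=1.1)
```

With these values the minimum moves from r = 2.0 to r = 2.5 and the well depth grows from 1.0 to 1.1, while the overall shape of the reference curve is retained.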

4.
J Phys Chem Lett ; 15(12): 3419-3424, 2024 Mar 28.
Article in English | MEDLINE | ID: mdl-38506827

ABSTRACT

The role of numerical accuracy in training and evaluating neural network-based potential energy surfaces is examined for different experimental observables. For observables that require third- and fourth-order derivatives of the potential energy with respect to Cartesian coordinates, single-precision arithmetic, as typically used in ML-based approaches, is insufficient and leads to roughness of the underlying PES, as is explicitly demonstrated. Increasing the numerical accuracy to double precision gives a smooth PES with higher-order derivatives that are numerically stable and yield meaningful anharmonic frequencies and tunneling splittings, as is demonstrated for H2CO and malonaldehyde. For molecular dynamics simulations, which only require first-order derivatives, single-precision arithmetic appears to be sufficient, though.
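
The sensitivity of higher-order derivatives to numerical precision can be illustrated with a toy calculation. The sketch below is not the procedure used in the paper: it takes a hypothetical Morse-type energy function, rounds its output to single precision, and compares finite-difference fourth derivatives from the rounded and unrounded energies. The potential, its parameters and the step size are assumptions made for the example.

```python
import math
import struct


def to_single(x):
    """Round a double-precision float to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def morse(r):
    """Hypothetical energy curve with minimum -100 at r = 1 (arbitrary units)."""
    return 100.0 * ((1.0 - math.exp(-(r - 1.0))) ** 2 - 1.0)


def morse_single(r):
    """Same curve, but with the energy stored in single precision."""
    return to_single(morse(r))


def fourth_derivative(f, r, h):
    """Central five-point finite-difference estimate of the fourth derivative."""
    stencil = f(r - 2 * h) - 4 * f(r - h) + 6 * f(r) - 4 * f(r + h) + f(r + 2 * h)
    return stencil / h ** 4


d4_double = fourth_derivative(morse, 1.0, 0.01)
d4_single = fourth_derivative(morse_single, 1.0, 0.01)
```

The analytic fourth derivative at the minimum is 1400. Single-precision energies of magnitude near 100 are spaced 2^-17 apart, so with h = 0.01 the stencil built from them can only return multiples of roughly 763 and cannot resolve the analytic value, whereas the double-precision estimate is limited mainly by the truncation error of the stencil. First derivatives divide by h rather than h^4 and are far less affected, in line with the abstract's remark about molecular dynamics.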

5.
J Phys Chem Lett ; 15(1): 90-96, 2024 Jan 11.
Article in English | MEDLINE | ID: mdl-38147042

ABSTRACT

The decomposition dynamics of vibrationally excited syn-CH3CHOO to form vinoxy + hydroxyl (CH2CHO + OH) radicals or to recombine to form glycolaldehyde (CH2OHCHO) are characterized using statistically significant numbers of molecular dynamics simulations using a full-dimensional neural-network-based potential energy surface at the CASPT2 level of theory. The computed final OH-translational and rotational state distributions agree well with experiments and probe the still unknown O-O bond strength DeOO for which best values from 22 to 25 kcal/mol are found. OH-elimination rates are consistent with experiments and do not vary appreciably with DeOO due to the non-equilibrium nature of the process. In addition to the OH-elimination pathway, OH roaming is observed following O-O scission, which leads to glycolaldehyde formation on the picosecond time scale. Together with recent work involving the methyl-ethyl-substituted Criegee intermediate, we conclude that OH roaming is a general pathway to be included in molecular-level modeling of atmospheric processes. This work demonstrates that atomistic simulations with machine-learned energy functions provide a viable route for exploring the chemistry and reaction dynamics of atmospheric reactions.

6.
J Phys Chem A ; 127(42): 8834-8848, 2023 Oct 26.
Article in English | MEDLINE | ID: mdl-37843300

ABSTRACT

The dynamics of hyperthermal N(4S) + O2 collisions were investigated both experimentally and theoretically. Crossed molecular beams experiments were performed at an average center-of-mass (c.m.) collision energy of ⟨Ecoll⟩ = 77.5 kcal mol-1, with velocity- and angle-resolved product detection by a rotatable mass spectrometer detector. Nonreactive (N + O2) and reactive (NO + O) product channels were identified. In the c.m. reference frame, the nonreactively scattered N atoms and reactively scattered NO molecules were both directed into the forward direction with respect to the initial direction of the reagent N atoms. On average, more than 90% of the available energy (⟨Eavl⟩ = 77.5 kcal mol-1) was retained in translation of the nonreactive products (N + O2), whereas a much smaller fraction of the available energy for the reactive pathway (⟨Eavl⟩ = 109.5 kcal mol-1) went into translation of the NO + O products, and the distribution of translational energies for this channel was broad, indicating extensive internal excitation in the nascent NO molecules. The experimentally derived c.m. translational energy and angular distributions of the reactive products suggested at least two dynamical pathways to the formation of NO + O. Quasiclassical trajectory (QCT) calculations were performed with a collision energy of Ecoll = 77 kcal mol-1 using two sets of potential energy surfaces, denoted as PES-I and PES-II, and these theoretical results were compared to each other and to the experimental results. PES-I is a reproducing kernel Hilbert space (RKHS) representation of multireference configurational interaction (MRCI) energies, while PES-II is a many-body permutation invariant polynomial (MB-PIP) fit of complete active space second order perturbation (CASPT2) points. 
The theoretical investigations were both consistent with the experimental suggestion of two dynamical pathways to produce NO + O, where reactive collisions may proceed on the doublet (1 ²A′) and quartet (1 ⁴A′) surfaces. When analyzed with this theoretical insight, the experimental c.m. translational energy and angular distributions were in reasonably good agreement with those predicted by the QCT calculations, although minor differences were observed, which are discussed. Theoretical translational energy and angular distributions for the nonreactive N + O2 products matched the experimental translational energy and angular distributions almost quantitatively. Finally, relative yields for the nonreactive and reactive scattering channels were determined from the experiment and from both theoretical methods, and all results are in reasonable agreement.

7.
Phys Chem Chem Phys ; 25(33): 22089-22102, 2023 Aug 23.
Article in English | MEDLINE | ID: mdl-37610422

ABSTRACT

Vibrational spectroscopy in supersonic jet expansions is a powerful tool to assess molecular aggregates in close to ideal conditions for the benchmarking of quantum chemical approaches. The low temperatures achieved as well as the absence of environment effects allow for a direct comparison between computed and experimental spectra. This provides potential benchmarking data which can be revisited to hone different computational techniques, and it allows for the critical analysis of procedures under the setting of a blind challenge. In the latter case, the final result is unknown to modellers, providing an unbiased testing opportunity for quantum chemical models. In this work, we present the spectroscopic and computational results for the first HyDRA blind challenge. The latter deals with the prediction of water donor stretching vibrations in monohydrates of organic molecules. This edition features a test set of 10 systems. Experimental water donor OH vibrational wavenumbers for the vacuum-isolated monohydrates of formaldehyde, tetrahydrofuran, pyridine, tetrahydrothiophene, trifluoroethanol, methyl lactate, dimethylimidazolidinone, cyclooctanone, trifluoroacetophenone and 1-phenylcyclohexane-cis-1,2-diol are provided. The results of the challenge show promising predictive properties in both purely quantum mechanical approaches as well as regression and other machine learning strategies.

8.
J Chem Phys ; 159(2)2023 Jul 14.
Article in English | MEDLINE | ID: mdl-37435940

ABSTRACT

Full-dimensional potential energy surfaces (PESs) based on machine learning (ML) techniques provide a means for accurate and efficient molecular simulations in the gas and condensed phase for various experimental observables ranging from spectroscopy to reaction dynamics. Here, the MLpot extension with PhysNet as the ML-based model for a PES is introduced into the newly developed pyCHARMM application programming interface. To illustrate the conception, validation, refining, and use of a typical workflow, para-chloro-phenol is considered as an example. The main focus is on how to approach a concrete problem from a practical perspective and applications to spectroscopic observables and the free energy for the -OH torsion in solution are discussed in detail. For the computed IR spectra in the fingerprint region, the computations for para-chloro-phenol in water are in good qualitative agreement with experiment carried out in CCl4. Moreover, relative intensities are largely consistent with experimental findings. The barrier for rotation of the -OH group increases from ∼3.5 kcal/mol in the gas phase to ∼4.1 kcal/mol from simulations in water due to favorable H-bonding interactions of the -OH group with surrounding water molecules.

9.
J Chem Phys ; 158(21)2023 Jun 07.
Article in English | MEDLINE | ID: mdl-37260004

ABSTRACT

The rise of machine learning has greatly influenced the field of computational chemistry and atomistic molecular dynamics simulations in particular. One of its most exciting prospects is the development of accurate, full-dimensional potential energy surfaces (PESs) for molecules and clusters, which, however, often require thousands to tens of thousands of ab initio data points restricting the community to medium sized molecules and/or lower levels of theory (e.g., density functional theory). Transfer learning, which improves a global PES from a lower to a higher level of theory, offers a data efficient alternative requiring only a fraction of the high-level data (on the order of 100 are found to be sufficient for malonaldehyde). This work demonstrates that even with Hartree-Fock theory and a double-zeta basis set as the lower level model, transfer learning yields coupled-cluster single double triple [CCSD(T)]-level quality for H-transfer barrier energies, harmonic frequencies, and H-transfer tunneling splittings. Most importantly, finite-temperature molecular dynamics simulations on the sub-µs time scale in the gas phase are possible and the infrared spectra determined from the transfer-learned PESs are in good agreement with the experiment. It is concluded that routine, long-time atomistic simulations on PESs fulfilling CCSD(T)-standards become possible.

10.
Phys Chem Chem Phys ; 25(20): 13933-13945, 2023 May 24.
Article in English | MEDLINE | ID: mdl-37190820

ABSTRACT

Recent advances in experimental methodology enabled studies of the quantum-state- and conformational dependence of chemical reactions under precisely controlled conditions in the gas phase. Here, we generated samples of selected gauche and s-trans 2,3-dibromobutadiene (DBB) by electrostatic deflection in a molecular beam and studied their reaction with Coulomb crystals of laser-cooled Ca+ ions in an ion trap. The rate coefficients for the total reaction were found to strongly depend on both the conformation of DBB and the electronic state of Ca+. In the (4p)2P1/2 and (3d)2D3/2 excited states of Ca+, the reaction is capture-limited and faster for the gauche conformer due to long-range ion-dipole interactions. In the (4s)2S1/2 ground state of Ca+, the reaction rate for s-trans DBB still conforms with the capture limit, while that for gauche DBB is strongly suppressed. The experimental observations were analysed with the help of adiabatic capture theory, ab initio calculations and reactive molecular dynamics simulations on a machine-learned full-dimensional potential energy surface of the system. The theory yields near-quantitative agreement for s-trans-DBB, but overestimates the reactivity of the gauche-conformer compared to the experiment. The present study points to the important role of molecular geometry even in strongly reactive exothermic systems and illustrates striking differences in the reactivity of individual conformers in gas-phase ion-molecule reactions.

11.
Phys Chem Chem Phys ; 25(20): 13854-13863, 2023 May 24.
Article in English | MEDLINE | ID: mdl-37165792

ABSTRACT

The reaction N(⁴S) + NO(X ²Π) → O(³P) + N2(X ¹Σg⁺) plays a pivotal role in the conversion of atomic to molecular nitrogen in dense interstellar clouds and in the atmosphere. Here we report a joint experimental and computational investigation of the N + NO reaction with the aim of providing improved constraints on its low temperature reactivity. Thermal rates were measured over the 50 to 296 K range in a continuous supersonic flow reactor coupled with pulsed laser photolysis and laser induced fluorescence for the production and detection of N(⁴S) atoms, respectively. With decreasing temperature, the experimentally measured reaction rate was found to increase monotonically up to a value of (6.6 ± 1.3) × 10⁻¹¹ cm³ s⁻¹ at 50 K. To confirm this finding, quasi-classical trajectory simulations were carried out on a previously validated, full-dimensional potential energy surface (PES). However, around 50 K the computed rates decreased, which required re-evaluation of the reactive PES in the long-range part due to a small spurious barrier with a height of ∼40 K in the entrance channel. By exploring different correction schemes the measured thermal rates can be adequately reproduced, displaying a clear negative temperature dependence over the entire temperature range. The possible astrochemical implications of an increased reaction rate at low temperature are also discussed.

12.
Science ; 380(6640): 77-81, 2023 04 07.
Article in English | MEDLINE | ID: mdl-37023184

ABSTRACT

Feshbach resonances are fundamental to interparticle interactions and become particularly important in cold collisions with atoms, ions, and molecules. In this work, we present the detection of Feshbach resonances in a benchmark system for strongly interacting and highly anisotropic collisions: molecular hydrogen ions colliding with noble gas atoms. The collisions are launched by cold Penning ionization, which exclusively populates Feshbach resonances that span both short- and long-range parts of the interaction potential. We resolved all final molecular channels in a tomographic manner using ion-electron coincidence detection. We demonstrate the nonstatistical nature of the final-state distribution. By performing quantum scattering calculations on ab initio potential energy surfaces, we show that the isolation of the Feshbach resonance pathways reveals their distinctive fingerprints in the collision outcome.

13.
J Chem Phys ; 158(12): 125103, 2023 Mar 28.
Article in English | MEDLINE | ID: mdl-37003761

ABSTRACT

The transport of ligands, such as NO or O2, through internal cavities is essential for the function of globular proteins, including hemoglobin, myoglobin (Mb), neuroglobin, truncated hemoglobins, or cytoglobin. For Mb, several internal cavities (Xe1 through Xe4) were observed experimentally and they were linked to ligand storage. The present work determines barriers for xenon diffusion and relative stabilization energies for the ligand in the initial and final pocket, linking a transition depending on the occupancy state of the remaining pockets from both biased and unbiased molecular dynamics simulations. It is found that the energetics of a particular ligand migration pathway may depend on the direction in which the transition is followed and the occupancy state of the other cavities. Furthermore, the barrier height for a particular transition can depend in a non-additive fashion on the occupancy of either cavity A or B or simultaneous population of both cavities, A and B. Multiple repeats for the Xe1 → Xe2 transition reveal that the activation barrier is a distribution of barrier heights rather than one single value, which is confirmed by a distribution of transition times for the same transition from unbiased simulations. Dynamic cross correlation maps demonstrate that correlated motions occur between adjacent residues or through space, residue Phe138 is found to be a gate for the Xe1 → Xe2 transition, and the volumes of the internal cavities vary along the diffusion pathway, indicating that there is dynamic communication between the ligand and the protein. These findings suggest that Mb is an allosteric protein.


Subject(s)
Myoglobin , Xenon , Myoglobin/chemistry , Ligands , Hemoglobins/chemistry , Molecular Dynamics Simulation , Carbon Monoxide/chemistry , Protein Conformation , Binding Sites
14.
J Chem Phys ; 158(14): 144302, 2023 Apr 14.
Article in English | MEDLINE | ID: mdl-37061478

ABSTRACT

The transition between the gas-, supercritical-, and liquid-phase behavior is a fascinating topic, which still lacks molecular-level understanding. Recent ultrafast two-dimensional infrared spectroscopy experiments suggested that the vibrational spectroscopy of N2O embedded in xenon and SF6 as solvents provides an avenue to characterize the transitions between different phases as the concentration (or density) of the solvent increases. The present work demonstrates that classical molecular dynamics (MD) simulations together with accurate interaction potentials allows us to (semi-)quantitatively describe the transition in rotational vibrational infrared spectra from the P-/R-branch line shape for the stretch vibrations of N2O at low solvent densities to the Q-branch-like line shapes at high densities. The results are interpreted within the classical theory of rigid-body rotation in more/less constraining environments at high/low solvent densities or based on phenomenological models for the orientational relaxation of rotational motion. It is concluded that classical MD simulations provide a powerful approach to characterize and interpret the ultrafast motion of solutes in low to high density solvents at a molecular level.

15.
Digit Discov ; 2(1): 28-58, 2023 Feb 13.
Article in English | MEDLINE | ID: mdl-36798879

ABSTRACT

Artificial Neural Networks (NN) are already heavily involved in methods and applications for frequent tasks in the field of computational chemistry such as representation of potential energy surfaces (PES) and spectroscopic predictions. This perspective provides an overview of the foundations of neural network-based full-dimensional potential energy surfaces, their architectures, underlying concepts, their representation and applications to chemical systems. Methods for data generation and training procedures for PES construction are discussed and means for error assessment and refinement through transfer learning are presented. A selection of recent results illustrates the latest improvements regarding accuracy of PES representations and system size limitations in dynamics simulations, but also NN application enabling direct prediction of physical results without dynamics simulations. The aim is to provide an overview for the current state-of-the-art NN approaches in computational chemistry and also to point out the current challenges in enhancing reliability and applicability of NN methods on a larger scale.

16.
J Phys Chem B ; 127(7): 1526-1539, 2023 02 23.
Article in English | MEDLINE | ID: mdl-36757772

ABSTRACT

S-nitrosylation, the covalent addition of NO to the thiol side chain of cysteine, is an important post-translational modification (PTM) that can affect the function of proteins. As such, PTMs extend and diversify protein function and thus characterizing consequences of PTM at a molecular level is of great interest. Although PTMs can be detected through various direct/indirect methods, they lack the capability to investigate the modifications with molecular detail. In the present work local and global structural dynamics, their correlation, the hydration structure, and the infrared spectroscopy for WT and S-nitrosylated Kirsten rat sarcoma virus (K-RAS) and hemoglobin (Hb) are characterized from molecular dynamics simulations. It is found that attaching NO to Cys118 in K-RAS rigidifies the protein in the Switch-I region, which has functional implications, whereas for Hb, nitrosylation at Cys93 at the β1 chain increases the flexibility of secondary structural motifs for Hb in its T0 and R4 conformational substates. Solvent water access decreased by 40% after nitrosylation in K-RAS, similar to Hb for which, however, local hydration of the R4SNO state is yet lower than for T0SNO. Finally, S-nitrosylation leads to detectable peaks for the NO stretch frequency, but the congested IR spectral region will make experimental detection of these bands difficult. Overall, S-nitrosylation in these two proteins is found to influence hydration, protein flexibility, and conformational dynamics which are all eventually involved in protein regulation and function at a molecular level.


Subject(s)
Hemoglobins , Proto-Oncogene Proteins p21(ras) , Proto-Oncogene Proteins p21(ras)/metabolism , Hemoglobins/chemistry , Sulfhydryl Compounds , Cysteine/chemistry , Nitric Oxide/metabolism
17.
J Chem Phys ; 158(2): 025101, 2023 Jan 14.
Article in English | MEDLINE | ID: mdl-36641390

ABSTRACT

The local hydration around tetrameric hemoglobin (Hb) in its T0 and R4 conformational substates is analyzed based on molecular dynamics simulations. Analysis of the local hydrophobicity (LH) for all residues at the α1β2 and α2β1 interfaces, responsible for the quaternary T → R transition, which is encoded in the Monod-Wyman-Changeux model, as well as comparison with earlier computations of the solvent accessible surface area, makes clear that the two quantities measure different aspects of hydration. Local hydrophobicity quantifies the presence and structure of water molecules at the interface, whereas "buried surface" reports on the available space for solvent. For simulations with Hb frozen in its T0 and R4 states, the correlation coefficient between LH and buried surface is 0.36 and 0.44, respectively, but it increases considerably if the 95% confidence interval is used. The LH with Hb frozen and flexible changes little for most residues at the interfaces but is significantly altered for a few select ones: Thr41α, Tyr42α, Tyr140α, Trp37β, Glu101β (for T0) and Thr38α, Tyr42α, Tyr140α (for R4). The number of water molecules at the interface is found to increase by ∼25% for T0 → R4, which is consistent with earlier measurements. Since hydration is found to be essential to protein function, it is clear that hydration also plays an essential role in allostery.


Subject(s)
Hemoglobins , Water , Water/chemistry , Hemoglobins/chemistry , Solvents , Hydrophobic and Hydrophilic Interactions , Chemical Phenomena
18.
J Chem Theory Comput ; 18(12): 7544-7554, 2022 Dec 13.
Article in English | MEDLINE | ID: mdl-36346403

ABSTRACT

Accounting for geometry-induced changes in the electronic distribution in molecular simulation is important for capturing effects such as charge flow, charge anisotropy, and polarization. Multipolar force fields have demonstrated their ability to correctly represent chemically significant features such as anisotropy and sigma holes. It has also been shown that off-center point charges offer a compact alternative with similar accuracy. Here, it is demonstrated that allowing relocation of charges within a minimally distributed charge model (MDCM) with respect to their reference atoms is a viable route to capture changes in the molecular charge distribution depending on geometry, i.e., intramolecular polarization. The approach, referred to as "flexible MDCM" (fMDCM), is validated on a number of small molecules and provides accuracies in the electrostatic potential (ESP) of 0.5 kcal/mol on average compared with reference data from electronic structure calculations, whereas MDCM and point charges have root mean squared errors of a factor of 2 to 5 higher. In addition, MD simulations in the NVE ensemble using fMDCM for a box of flexible water molecules with periodic boundary conditions show a width of 0.1 kcal/mol for the fluctuation around the mean at 300 K on the 10 ns time scale. For water, the equilibrium valence angle in the gas phase is found to increase by 2° for simulations in the condensed phase which is consistent with experiment. The accuracy in capturing the geometry dependence of the ESP together with the long-time stability in energy conserving simulations makes fMDCM a promising tool to introduce advanced electrostatics into atomistic simulations.


Subject(s)
Molecular Dynamics Simulation , Water , Static Electricity , Water/chemistry , Anisotropy
19.
Chem Sci ; 13(44): 13068-13084, 2022 Nov 16.
Article in English | MEDLINE | ID: mdl-36425481

ABSTRACT

The value of uncertainty quantification on predictions for trained neural networks (NNs) on quantum chemical reference data is quantitatively explored. For this, the architecture of the PhysNet NN was suitably modified and the resulting model (PhysNet-DER) was evaluated with different metrics to quantify its calibration, the quality of its predictions, and whether prediction error and the predicted uncertainty can be correlated. Training on the QM9 database and evaluating data in the test set within and outside the distribution indicate that error and uncertainty are not linearly related. However, the observed variance provides insight into the quality of the data used for training. Additionally, the influence of the chemical space covered by the training data set was studied by using a biased database. The results clarify that noise and redundancy complicate property prediction for molecules even in cases for which changes - such as double bond migration in two otherwise identical molecules - are small. The model was also applied to a real database of tautomerization reactions. Analysis of the distance between members in feature space in combination with other parameters shows that redundant information in the training dataset can lead to large variances and small errors whereas the presence of similar but unspecific information returns large errors but small variances. This was, e.g., observed for nitro-containing aliphatic chains for which predictions were difficult although the training set contained several examples for nitro groups bound to aromatic molecules. The finding underlines the importance of the composition of the training data and provides chemical insight into how this affects the prediction capabilities of a ML model. Finally, the presented method can be used for information-based improvement of chemical databases for target applications through active learning optimization.

20.
Phys Chem Chem Phys ; 24(42): 26046-26060, 2022 Nov 02.
Article in English | MEDLINE | ID: mdl-36268728

ABSTRACT

Halogenated groups are relevant in pharmaceutical applications and potentially useful spectroscopic probes for infrared spectroscopy. In this work, the structural dynamics and infrared spectroscopy of para-fluorophenol (F-PhOH) and phenol (PhOH) is investigated in the gas phase and in water using a combination of experiment and molecular dynamics (MD) simulations. The gas phase and solvent dynamics around F-PhOH and PhOH is characterized from atomistic simulations using empirical energy functions with point charges or multipoles for the electrostatics, Machine Learning (ML) based parametrizations and with full ab initio (QM) and mixed Quantum Mechanical/Molecular Mechanics (QM/MM) simulations with a particular focus on the CF- and OH-stretch region. The CF-stretch band is heavily mixed with other modes whereas the OH-stretch in solution displays a characteristic high-frequency peak around 3600 cm-1 most likely associated with the -OH group of PhOH and F-PhOH together with a characteristic progression below 3000 cm-1 due to coupling with water modes which is also reproduced by several of the simulations. Solvent and radial distribution functions indicate that the CF-site is largely hydrophobic except for simulations using point charges which renders them unsuited for correctly describing hydration and dynamics around fluorinated sites. The hydrophobic character of the CF-group is particularly relevant for applications in pharmaceutical chemistry with a focus on local hydration and interaction with the surrounding protein.


Subject(s)
Phenols , Quantum Theory , Spectrophotometry, Infrared/methods , Water/chemistry , Solvents/chemistry , Phenol/chemistry