An open-source framework for fast-yet-accurate calculation of quantum mechanical features.
Caldeweyher, Eike; Bauer, Christoph; Tehrani, Ali Soltani.
  • Caldeweyher E; Data Science and Modelling, Pharmaceutical Sciences, R & D, AstraZeneca, Gothenburg, Sweden. eike.caldeweyher@astrazeneca.com.
  • Bauer C; Data Science and Modelling, Pharmaceutical Sciences, R & D, AstraZeneca, Gothenburg, Sweden. eike.caldeweyher@astrazeneca.com.
  • Tehrani AS; Data Science and Modelling, Pharmaceutical Sciences, R & D, AstraZeneca, Gothenburg, Sweden. eike.caldeweyher@astrazeneca.com.
Phys Chem Chem Phys ; 24(17): 10599-10610, 2022 May 04.
Article in English | MEDLINE | ID: covidwho-1805671
ABSTRACT
We present the open-source framework kallisto, which enables the efficient and robust calculation of quantum mechanical features for atoms and molecules. For a benchmark set of 49 experimental molecular polarizabilities, the predictive power of the presented method is competitive with second-order perturbation theory in a converged atomic-orbital basis set at a fraction of the computational cost. The calculation of isotropic molecular polarizabilities is robust for a data set of more than 80 000 molecules. Furthermore, we present a generally applicable van der Waals radius model that is rooted in atomic static polarizabilities. Efficiency tests show that such radii can be calculated even for small- to medium-sized proteins, where the largest system (the SARS-CoV-2 spike protein) has 42 539 atoms. Following the work of Domingo-Almenara et al. [Domingo-Almenara et al., Nat. Commun., 2019, 10, 5811], we present computational predictions of retention times for different chromatographic methods and describe how physicochemical features improve the predictive power of machine-learning models that otherwise rely only on two-dimensional features such as molecular fingerprints. Additionally, we developed an internal benchmark set of experimental supercritical fluid chromatography retention times. For those methods, improvements of up to 10.6% are obtained when combining molecular fingerprints with physicochemical descriptors. Shapley additive explanation values furthermore show that the physical nature of the applied features can be retained within the final machine-learning models. We generally recommend the kallisto framework as a robust, low-cost, and physically motivated featurizer for upcoming state-of-the-art machine-learning studies.
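The idea of combining molecular fingerprints with physicochemical descriptors can be illustrated with a minimal Python sketch. It is not the kallisto API and not the authors' pipeline: the function names, the toy fingerprints, and the descriptor values are all hypothetical, and the error-based definition of "improvement" is an assumption, since the abstract does not state which metric the 10.6% refers to.

```python
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each column of a list of equal-length descriptor rows."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # Guard against constant columns, which have zero spread.
    stdevs = [pstdev(col) or 1.0 for col in columns]
    return [
        [(value - m) / s for value, m, s in zip(row, means, stdevs)]
        for row in rows
    ]


def combine_features(fingerprints, descriptors):
    """Append standardized descriptors to each fingerprint bit vector."""
    scaled = standardize(descriptors)
    return [list(bits) + extra for bits, extra in zip(fingerprints, scaled)]


def relative_improvement(baseline_error, combined_error):
    """Percentage reduction in error relative to the baseline model.

    Assumption: improvement is measured as a relative error reduction.
    """
    return 100.0 * (baseline_error - combined_error) / baseline_error


# Hypothetical toy data: 4-bit fingerprints and two made-up descriptors
# per molecule (for example a polarizability and a van der Waals radius).
fingerprints = [[1, 0, 0, 1], [0, 1, 1, 0], [1, 1, 0, 0]]
descriptors = [[10.0, 1.5], [20.0, 1.7], [30.0, 1.9]]

# Each row now holds 4 fingerprint bits followed by 2 scaled descriptors.
features = combine_features(fingerprints, descriptors)
```

The combined rows could then be passed to any regression model in place of the fingerprint-only rows, and `relative_improvement` compares the two resulting errors. Standardizing the continuous descriptors keeps them on a scale comparable to the 0/1 fingerprint bits, which matters for distance-based or linear models.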
Full text: Available | Collection: International databases | Database: MEDLINE | Main subject: COVID-19 | Type of study: Prognostic study | Limits: Humans | Language: English | Journal: Phys Chem Chem Phys | Journal subject: Biophysics / Chemistry | Year: 2022 | Document Type: Article | Affiliation country: D2cp01165d

