Search | VHL Regional Portal

1.

A data science roadmap for open science organizations engaged in early-stage drug discovery.

Edfeldt, Kristina; Edwards, Aled M; Engkvist, Ola; Günther, Judith; Hartley, Matthew; Hulcoop, David G; Leach, Andrew R; Marsden, Brian D; Menge, Amelie; Misquitta, Leonie; Müller, Susanne; Owen, Dafydd R; Schütt, Kristof T; Skelton, Nicholas; Steffen, Andreas; Tropsha, Alexander; Vernet, Erik; Wang, Yanli; Wellnitz, James; Willson, Timothy M; Clevert, Djork-Arné; Haibe-Kains, Benjamin; Schiavone, Lovisa Holmberg; Schapira, Matthieu.

Nat Commun ; 15(1): 5640, 2024 Jul 05.

Article in English | MEDLINE | ID: mdl-38965235

ABSTRACT

The Structural Genomics Consortium is an international open science research organization with a focus on accelerating early-stage drug discovery, namely hit discovery and optimization. Like many others, we believe that artificial intelligence (AI) is poised to be a main accelerator in the field. The question is then how best to benefit from recent advances in AI and how to generate, format and disseminate data to enable future breakthroughs in AI-guided drug discovery. We present here the recommendations of a working group composed of experts from both the public and private sectors. Robust data management requires precise ontologies and standardized vocabulary, while a centralized database architecture across laboratories facilitates data integration into high-value datasets. Lab automation and opening electronic lab notebooks to data mining push the boundaries of data sharing and data modeling. Important considerations for building robust machine-learning models include transparent and reproducible data processing, choosing the most relevant data representation, defining the right training and test sets, and estimating prediction uncertainty. Beyond data sharing, cloud-based computing can be harnessed to build and disseminate machine-learning models. Important vectors of acceleration for hit and chemical probe discovery will be (1) the real-time integration of experimental data generation and modeling workflows within design-make-test-analyze (DMTA) cycles, openly and at scale, and (2) the adoption of a mindset where data scientists and experimentalists work as a unified team, and where data science is incorporated into the experimental design.

Subject(s)

Data Science, Drug Discovery, Machine Learning, Drug Discovery/methods, Data Science/methods, Humans, Artificial Intelligence, Information Dissemination/methods, Data Mining/methods, Cloud Computing, Databases, Factual

2.

Automatic identification of chemical moieties.

Lederer, Jonas; Gastegger, Michael; Schütt, Kristof T; Kampffmeyer, Michael; Müller, Klaus-Robert; Unke, Oliver T.

Phys Chem Chem Phys ; 25(38): 26370-26379, 2023 Oct 04.

Article in English | MEDLINE | ID: mdl-37750554

ABSTRACT

In recent years, the prediction of quantum mechanical observables with machine learning methods has become increasingly popular. Message-passing neural networks (MPNNs) solve this task by constructing atomic representations, from which the properties of interest are predicted. Here, we introduce a method to automatically identify chemical moieties (molecular building blocks) from such representations, enabling a variety of applications beyond property prediction, which otherwise rely on expert knowledge. The required representation can either be provided by a pretrained MPNN, or be learned from scratch using only structural information. Beyond the data-driven design of molecular fingerprints, the versatility of our approach is demonstrated by enabling the selection of representative entries in chemical databases, the automatic construction of coarse-grained force fields, as well as the identification of reaction coordinates.

3.

SchNetPack 2.0: A neural network toolbox for atomistic machine learning.

Schütt, Kristof T; Hessmann, Stefaan S P; Gebauer, Niklas W A; Lederer, Jonas; Gastegger, Michael.

J Chem Phys ; 158(14): 144801, 2023 Apr 14.

Article in English | MEDLINE | ID: mdl-37061495

ABSTRACT

SchNetPack is a versatile neural network toolbox that addresses both the requirements of method development and the application of atomistic machine learning. Version 2.0 comes with an improved data pipeline, modules for equivariant neural networks, and a PyTorch implementation of molecular dynamics. An optional integration with PyTorch Lightning and the Hydra configuration framework powers a flexible command-line interface. This makes SchNetPack 2.0 easily extendable with custom code and ready for complex training tasks, such as the generation of 3D molecular structures.

4.

Inverse design of 3d molecular structures with conditional generative neural networks.

Gebauer, Niklas W A; Gastegger, Michael; Hessmann, Stefaan S P; Müller, Klaus-Robert; Schütt, Kristof T.

Nat Commun ; 13(1): 973, 2022 Feb 21.

Article in English | MEDLINE | ID: mdl-35190542

ABSTRACT

The rational design of molecules with desired properties is a long-standing challenge in chemistry. Generative neural networks have emerged as a powerful approach to sample novel molecules from a learned distribution. Here, we propose a conditional generative neural network for 3d molecular structures with specified chemical and structural properties. This approach is agnostic to chemical bonding and enables targeted sampling of novel molecules from conditional distributions, even in domains where reference calculations are sparse. We demonstrate the utility of our method for inverse design by generating molecules with specified motifs or composition, discovering particularly stable molecules, and jointly targeting multiple electronic properties beyond the training regime.

5.

Higher-Order Explanations of Graph Neural Networks via Relevant Walks.

Schnake, Thomas; Eberle, Oliver; Lederer, Jonas; Nakajima, Shinichi; Schütt, Kristof T; Müller, Klaus-Robert; Montavon, Grégoire.

IEEE Trans Pattern Anal Mach Intell ; 44(11): 7581-7596, 2022 Nov.

Article in English | MEDLINE | ID: mdl-34559639

ABSTRACT

Graph Neural Networks (GNNs) are a popular approach for predicting graph-structured data. As GNNs tightly entangle the input graph into the neural network structure, common explainable AI approaches are not applicable. To a large extent, GNNs have so far remained black boxes for the user. In this paper, we show that GNNs can in fact be naturally explained using higher-order expansions, i.e., by identifying groups of edges that jointly contribute to the prediction. Practically, we find that such explanations can be extracted using a nested attribution scheme, where existing techniques such as layer-wise relevance propagation (LRP) can be applied at each step. The output is a collection of walks into the input graph that are relevant for the prediction. Our novel explanation method, which we denote by GNN-LRP, is applicable to a broad range of graph neural networks and lets us extract practically relevant insights on sentiment analysis of text data, structure-property relationships in quantum chemistry, and image classification.

Subject(s)

Algorithms, Neural Networks, Computer

6.

SpookyNet: Learning force fields with electronic degrees of freedom and nonlocal effects.

Unke, Oliver T; Chmiela, Stefan; Gastegger, Michael; Schütt, Kristof T; Sauceda, Huziel E; Müller, Klaus-Robert.

Nat Commun ; 12(1): 7273, 2021 Dec 14.

Article in English | MEDLINE | ID: mdl-34907176

ABSTRACT

Machine-learned force fields combine the accuracy of ab initio methods with the efficiency of conventional force fields. However, current machine-learned force fields typically ignore electronic degrees of freedom, such as the total charge or spin state, and assume chemical locality, which is problematic when molecules have inconsistent electronic states, or when nonlocal effects play a significant role. This work introduces SpookyNet, a deep neural network for constructing machine-learned force fields with explicit treatment of electronic degrees of freedom and nonlocality, modeled via self-attention in a transformer architecture. Chemically meaningful inductive biases and analytical corrections built into the network architecture allow it to properly model physical limits. SpookyNet improves upon the current state-of-the-art (or achieves similar performance) on popular quantum chemistry data sets. Notably, it is able to generalize across chemical and conformational space and can leverage the learned chemical insights, e.g. by predicting unknown spin states, thus helping to close a further important remaining gap for today's machine learning models in quantum chemistry.

7.

Machine learning of solvent effects on molecular spectra and reactions.

Gastegger, Michael; Schütt, Kristof T; Müller, Klaus-Robert.

Chem Sci ; 12(34): 11473-11483, 2021 Sep 01.

Article in English | MEDLINE | ID: mdl-34567501

ABSTRACT

Fast and accurate simulation of complex chemical systems in environments such as solutions is a long-standing challenge in theoretical chemistry. In recent years, machine learning has extended the boundaries of quantum chemistry by providing highly accurate and efficient surrogate models of electronic structure theory, which previously have been out of reach for conventional approaches. Those models have long been restricted to closed molecular systems without accounting for environmental influences, such as external electric and magnetic fields or solvent effects. Here, we introduce the deep neural network FieldSchNet for modeling the interaction of molecules with arbitrary external fields. FieldSchNet offers access to a wealth of molecular response properties, enabling it to simulate a wide range of molecular spectra, such as infrared, Raman and nuclear magnetic resonance. Beyond that, it is able to describe implicit and explicit molecular environments, operating as a polarizable continuum model for solvation or in a quantum mechanics/molecular mechanics setup. We employ FieldSchNet to study the influence of solvent effects on molecular spectra and a Claisen rearrangement reaction. Based on these results, we use FieldSchNet to design an external environment capable of lowering the activation barrier of the rearrangement reaction significantly, demonstrating promising avenues for inverse chemical design.

8.

Perspective on integrating machine learning into computational chemistry and materials science.

Westermayr, Julia; Gastegger, Michael; Schütt, Kristof T; Maurer, Reinhard J.

J Chem Phys ; 154(23): 230903, 2021 Jun 21.

Article in English | MEDLINE | ID: mdl-34241249

ABSTRACT

Machine learning (ML) methods are being used in almost every conceivable area of electronic structure theory and molecular simulation. In particular, ML has become firmly established in the construction of high-dimensional interatomic potentials. Not a day goes by without another proof of principle being published on how ML methods can represent and predict quantum mechanical properties, be they observable, such as molecular polarizabilities, or not, such as atomic charges. As ML is becoming pervasive in electronic structure theory and molecular simulation, we provide an overview of how atomistic computational modeling is being transformed by the incorporation of ML approaches. From the perspective of the practitioner in the field, we assess how common workflows to predict structure, dynamics, and spectroscopy are affected by ML. Finally, we discuss how a tighter and lasting integration of ML methods with computational chemistry and materials science can be achieved and what it will mean for research practice, software development, and postgraduate training.

9.

Machine Learning Force Fields.

Unke, Oliver T; Chmiela, Stefan; Sauceda, Huziel E; Gastegger, Michael; Poltavsky, Igor; Schütt, Kristof T; Tkatchenko, Alexandre; Müller, Klaus-Robert.

Chem Rev ; 121(16): 10142-10186, 2021 Aug 25.

Article in English | MEDLINE | ID: mdl-33705118

ABSTRACT

In recent years, the use of machine learning (ML) in computational chemistry has enabled numerous advances previously out of reach due to the computational complexity of traditional electronic-structure methods. One of the most promising applications is the construction of ML-based force fields (FFs), with the aim to narrow the gap between the accuracy of ab initio methods and the efficiency of classical FFs. The key idea is to learn the statistical relation between chemical structure and potential energy without relying on a preconceived notion of fixed chemical bonds or knowledge about the relevant interactions. Such universal ML approximations are in principle only limited by the quality and quantity of the reference data used to train them. This review gives an overview of applications of ML-FFs and the chemical insights that can be obtained from them. The core concepts underlying ML-FFs are described in detail, and a step-by-step guide for constructing and testing them from scratch is given. The text concludes with a discussion of the challenges that remain to be overcome by the next generation of ML-FFs.

10.

Autonomous robotic nanofabrication with reinforcement learning.

Leinen, Philipp; Esders, Malte; Schütt, Kristof T; Wagner, Christian; Müller, Klaus-Robert; Tautz, F Stefan.

Sci Adv ; 6(36), 2020 Sep.

Article in English | MEDLINE | ID: mdl-32917594

ABSTRACT

The ability to handle single molecules as effectively as macroscopic building blocks would enable the construction of complex supramolecular structures inaccessible to self-assembly. The fundamental challenges obstructing this goal are the uncontrolled variability and poor observability of atomic-scale conformations. Here, we present a strategy to work around both obstacles and demonstrate autonomous robotic nanofabrication by manipulating single molecules. Our approach uses reinforcement learning (RL), which finds solution strategies even in the face of large uncertainty and sparse feedback. We demonstrate the potential of our RL approach by removing molecules autonomously with a scanning probe microscope from a supramolecular structure. Our RL agent reaches excellent performance, enabling us to automate a task that previously had to be performed by a human. We anticipate that our work opens the way toward autonomous agents for the robotic construction of functional supramolecular structures with speed, precision, and perseverance beyond our current capabilities.

11.

Machine learning of accurate energy-conserving molecular force fields.

Chmiela, Stefan; Tkatchenko, Alexandre; Sauceda, Huziel E; Poltavsky, Igor; Schütt, Kristof T; Müller, Klaus-Robert.

Sci Adv ; 3(5): e1603015, 2017 May.

Article in English | MEDLINE | ID: mdl-28508076

ABSTRACT

Using conservation of energy, a fundamental property of closed classical and quantum mechanical systems, we develop an efficient gradient-domain machine learning (GDML) approach to construct accurate molecular force fields using a restricted number of samples from ab initio molecular dynamics (AIMD) trajectories. The GDML implementation is able to reproduce global potential energy surfaces of intermediate-sized molecules with an accuracy of 0.3 kcal mol⁻¹ for energies and 1 kcal mol⁻¹ Å⁻¹ for atomic forces using only 1000 conformational geometries for training. We demonstrate this accuracy for AIMD trajectories of molecules, including benzene, toluene, naphthalene, ethanol, uracil, and aspirin. The challenge of constructing conservative force fields is accomplished in our work by learning in a Hilbert space of vector-valued functions that obey the law of energy conservation. The GDML approach enables quantitative molecular dynamics simulations for molecules at a fraction of the cost of explicit AIMD calculations, thereby allowing the construction of efficient force fields with the accuracy and transferability of high-level ab initio methods.

12.

Quantum-chemical insights from deep tensor neural networks.

Schütt, Kristof T; Arbabzadah, Farhad; Chmiela, Stefan; Müller, Klaus R; Tkatchenko, Alexandre.

Nat Commun ; 8: 13890, 2017 Jan 09.

Article in English | MEDLINE | ID: mdl-28067221

ABSTRACT

Learning from data has led to paradigm shifts in a multitude of disciplines, including web, text and image search, speech recognition, as well as bioinformatics. Can machine learning enable similar breakthroughs in understanding quantum many-body systems? Here we develop an efficient deep learning approach that enables spatially and chemically resolved insights into quantum-mechanical observables of molecular systems. We unify concepts from many-body Hamiltonians with purpose-designed deep tensor neural networks, which leads to size-extensive and uniformly accurate (1 kcal mol⁻¹) predictions in compositional and configurational chemical space for molecules of intermediate size. As an example of chemical relevance, the model reveals a classification of aromatic rings with respect to their stability. Further applications of our model for predicting atomic energies and local chemical potentials in molecules, reliable isomer energies, and molecules with peculiar electronic structure demonstrate the potential of machine learning for revealing insights into complex quantum-chemical systems.
