1.
Protein Sci ; 32(1): e4516, 2023 Jan.
Article in English | MEDLINE | ID: mdl-36403089

ABSTRACT

The ability to design customized proteins to perform specific tasks is of great interest. We are particularly interested in the design of sensitive and specific small molecule ligand-binding proteins for biotechnological or biomedical applications. Computational methods can narrow down the immense combinatorial space to find the best solution and thus provide starting points for experimental procedures. However, success rates strongly depend on accurate modeling and energetic evaluation. Not only intra- but also intermolecular interactions have to be considered. To address this problem, we developed PocketOptimizer, a modular computational protein design pipeline that predicts mutations in the binding pockets of proteins to increase affinity for a specific ligand. Its modularity enables users to compare different combinations of force fields, rotamer libraries, and scoring functions. Here, we present a much-improved version: PocketOptimizer 2.0. We implemented a cleaner user interface, an extended architecture with more supported tools, such as force fields and scoring functions, a backbone-dependent rotamer library, as well as various improvements in the underlying algorithms. Version 2.0 was tested against a benchmark of design cases and assessed in comparison to the first version. Our results show how newly implemented features such as the new rotamer library can lead to improved prediction accuracy. Therefore, we believe that PocketOptimizer 2.0, with its many new and improved functionalities, provides a robust and versatile environment for the design of small molecule-binding pockets in proteins. It is widely applicable and extendible due to its modular framework. PocketOptimizer 2.0 can be downloaded at https://github.com/Hoecker-Lab/pocketoptimizer.


Subject(s)
Algorithms; Proteins; Ligands; Proteins/chemistry; Computer-Aided Design; Computers
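
The combinatorial search described in the abstract above can be illustrated with a toy sketch. It is not PocketOptimizer's API or data: the positions, residue names, and energy values are hypothetical, and a brute-force enumeration stands in for whatever solver the pipeline actually uses. It only shows why pairwise terms matter: the residues that score best individually need not form the best combination.

```python
from itertools import product

# Hypothetical toy energies in arbitrary units (lower is better).
# self_energy[position][residue] stands in for a residue-ligand score.
self_energy = {
    "pos1": {"ALA": -0.5, "LEU": -1.2, "PHE": -1.0},
    "pos2": {"SER": -0.8, "THR": -0.6, "ASP": -1.1},
}
# pair_energy[(residue at pos1, residue at pos2)] stands in for a
# residue-residue score; unlisted pairs contribute zero.
pair_energy = {("LEU", "ASP"): 0.9, ("PHE", "ASP"): 0.3}


def total_energy(assignment):
    """Sum the self and pairwise terms for one (pos1, pos2) assignment."""
    res1, res2 = assignment
    return (
        self_energy["pos1"][res1]
        + self_energy["pos2"][res2]
        + pair_energy.get((res1, res2), 0.0)
    )


def best_design():
    """Enumerate every combination and return the lowest-energy one."""
    candidates = product(self_energy["pos1"], self_energy["pos2"])
    return min(candidates, key=total_energy)


best = best_design()
print(best, round(total_energy(best), 3))
```

LEU and ASP have the best individual scores here, but their assumed pairwise penalty makes the LEU/SER combination the overall minimum. Brute force is only feasible for a handful of positions; real design problems need a dedicated optimizer.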
2.
Patterns (N Y) ; 3(10): 100588, 2022 Oct 14.
Article in English | MEDLINE | ID: mdl-36277819

ABSTRACT

Artificial intelligence (AI) and machine learning (ML) are expanding in popularity for broad applications to challenging tasks in chemistry and materials science. Examples include the prediction of properties, the discovery of new reaction pathways, or the design of new molecules. The machine needs to read and write fluently in a chemical language for each of these tasks. Strings are a common tool to represent molecular graphs, and the most popular molecular string representation, SMILES, has powered cheminformatics since the late 1980s. However, in the context of AI and ML in chemistry, SMILES has several shortcomings; most pertinently, most combinations of symbols lead to invalid results with no valid chemical interpretation. To overcome this issue, a new language for molecules was introduced in 2020 that guarantees 100% robustness: SELF-referencing embedded strings (SELFIES). SELFIES has since simplified and enabled numerous new applications in chemistry. In this perspective, we look to the future and discuss molecular string representations, along with their respective opportunities and challenges. We propose 16 concrete future projects for robust molecular representations. These involve the extension toward new chemical domains, exciting questions at the interface of AI and robust languages, and interpretability for both humans and machines. We hope that these proposals will inspire several follow-up works exploiting the full potential of molecular string representations for the future of AI in chemistry and materials science.
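
The claim that most symbol combinations are invalid SMILES can be made concrete with a rough sketch. The checker below is an assumption-laden toy, not a SMILES parser: it tests only a few necessary conditions (the string starts with an atom, parentheses balance, branches are non-empty, ring-closure digits pair up) and ignores valence, aromaticity, and bond symbols. The alphabet and sample size are arbitrary choices.

```python
import random


def passes_basic_smiles_checks(s):
    """Toy check of necessary (not sufficient) SMILES syntax conditions."""
    if not s or not s[0].isalpha():
        return False  # a SMILES string must start with an atom
    depth = 0
    open_rings = set()
    prev = ""
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            if depth == 0 or prev == "(":
                return False  # unmatched close or empty branch
            depth -= 1
        elif ch.isdigit():
            if prev == "(":
                return False  # ring digit must follow an atom
            open_rings ^= {ch}  # first use opens the ring, second closes it
        prev = ch
    return depth == 0 and not open_rings


# Sample random strings from a small symbol alphabet and count survivors.
rng = random.Random(0)
alphabet = "CNO()12"
samples = ["".join(rng.choice(alphabet) for _ in range(10)) for _ in range(1000)]
valid_fraction = sum(passes_basic_smiles_checks(s) for s in samples) / len(samples)
print(f"fraction passing the toy checks: {valid_fraction:.3f}")
```

Even these lenient checks reject most random strings, and a full parser with valence rules would reject more. A representation in which every symbol sequence decodes to a molecule avoids this failure mode by construction.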

3.
J Chem Phys ; 157(2): 024303, 2022 Jul 14.
Article in English | MEDLINE | ID: mdl-35840379

ABSTRACT

Equilibrium structures determine material properties and biochemical functions. Here we propose to machine learn phase space averages, conventionally obtained by ab initio or force-field-based molecular dynamics (MD) or Monte Carlo (MC) simulations. In analogy to ab initio MD, our ab initio machine learning (AIML) model does not require bond topologies and, therefore, enables a general machine learning pathway to obtain ensemble properties throughout the chemical compound space. We demonstrate AIML for predicting Boltzmann averaged structures after training on hundreds of MD trajectories. The AIML output is subsequently used to train machine learning models of free energies of solvation using experimental data and to reach competitive prediction errors (mean absolute error ∼0.8 kcal/mol) for out-of-sample molecules, within milliseconds. As such, AIML effectively bypasses the need for MD or MC-based phase space sampling, enabling exploration campaigns of Boltzmann averages throughout the chemical compound space at a much accelerated pace. We contextualize our findings by comparison to state-of-the-art methods resulting in a Pareto plot for the free energy of solvation predictions in terms of accuracy and time.


Subject(s)
Machine Learning; Molecular Dynamics Simulation; Monte Carlo Method
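
The Boltzmann average that the abstract above refers to can be sketched for a discrete set of conformers. This is a minimal illustration under stated assumptions, not the AIML model: the function name, the example values, and the choice of kcal/mol units are ours. Each conformer with energy E_i receives a weight proportional to exp(-E_i / RT).

```python
import math

# Molar gas constant in kcal/(mol*K), so energies are taken in kcal/mol.
R_KCAL = 0.0019872041


def boltzmann_average(values, energies, temperature=298.15):
    """Average `values` over conformers weighted by exp(-E / RT)."""
    beta = 1.0 / (R_KCAL * temperature)
    e_min = min(energies)  # shift energies for numerical stability
    weights = [math.exp(-beta * (e - e_min)) for e in energies]
    partition = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / partition


# Hypothetical property values for three conformers and their energies.
values = [1.0, 2.0, 4.0]
energies = [0.0, 0.5, 2.0]
print(boltzmann_average(values, energies))
```

For frames taken from an equilibrated MD trajectory the weighting is already built into the sampling, so the ensemble average reduces to a plain mean over frames.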
4.
Nat Commun ; 12(1): 4468, 2021 Jul 22.
Article in English | MEDLINE | ID: mdl-34294693

ABSTRACT

The computational prediction of atomistic structure is a long-standing problem in physics, chemistry, materials, and biology. Conventionally, force fields or ab initio methods determine structure through energy minimization, which is either approximate or computationally demanding. This accuracy/cost trade-off prohibits the generation of synthetic big data sets accounting for chemical space with atomistic detail. Exploiting implicit correlations among relaxed structures in training data sets, our machine learning model Graph-To-Structure (G2S) generalizes across compound space in order to infer interatomic distances for out-of-sample compounds, effectively enabling the direct reconstruction of coordinates, and thereby bypassing the conventional energy optimization task. The numerical evidence collected includes 3D coordinate predictions for organic molecules, transition states, and crystalline solids. G2S improves systematically with training set size, reaching mean absolute interatomic distance prediction errors of less than 0.2 Å for fewer than eight thousand training structures, on par with or better than conventional structure generators. Applicability tests of G2S include successful predictions for systems which typically require manual intervention, improved initial guesses for subsequent conventional ab initio-based relaxation, and input generation for subsequent use of structure-based quantum machine learning models.
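
Reconstructing coordinates from interatomic distances, the step the abstract describes after distance prediction, can be sketched for four atoms with elementary geometry. This is an illustrative stand-in, not the reconstruction procedure used by G2S: the function names and the example geometry are assumptions, and exact distances are used where a model would supply predicted ones.

```python
import math


def distance_matrix(coords):
    """Pairwise Euclidean distances between 3D points."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def coords_from_distances(d):
    """Place four atoms from a symmetric 4x4 distance matrix.

    Atom 0 goes to the origin, atom 1 on the x-axis, atom 2 in the xy-plane.
    Assumes atoms 0, 1, 2 are not collinear. The sign of z for atom 3 is a
    free choice, because distances cannot distinguish mirror images.
    """
    x1 = d[0][1]
    x2 = (d[0][1] ** 2 + d[0][2] ** 2 - d[1][2] ** 2) / (2 * d[0][1])
    y2 = math.sqrt(max(d[0][2] ** 2 - x2 ** 2, 0.0))
    x3 = (d[0][1] ** 2 + d[0][3] ** 2 - d[1][3] ** 2) / (2 * d[0][1])
    y3 = (d[0][2] ** 2 + d[0][3] ** 2 - d[2][3] ** 2 - 2 * x2 * x3) / (2 * y2)
    z3 = math.sqrt(max(d[0][3] ** 2 - x3 ** 2 - y3 ** 2, 0.0))
    return [(0.0, 0.0, 0.0), (x1, 0.0, 0.0), (x2, y2, 0.0), (x3, y3, z3)]


# Hypothetical reference geometry in an arbitrary frame.
reference = [(1.0, 2.0, 0.5), (2.2, 2.1, 0.4), (0.8, 3.3, 1.0), (1.5, 2.5, 2.0)]
rebuilt = coords_from_distances(distance_matrix(reference))

# The rebuilt frame differs, but every interatomic distance is reproduced.
max_error = max(
    abs(a - b)
    for row_a, row_b in zip(distance_matrix(reference), distance_matrix(rebuilt))
    for a, b in zip(row_a, row_b)
)
print(max_error)
```

With predicted rather than exact distances the matrix is generally not perfectly embeddable in 3D, so larger systems call for a least-squares method such as multidimensional scaling instead of this sequential placement.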

5.
J Chem Phys ; 153(19): 194101, 2020 Nov 21.
Article in English | MEDLINE | ID: mdl-33218238

ABSTRACT

Coarse graining enables the investigation of molecular dynamics for larger systems and at longer timescales than is possible at an atomic resolution. However, a coarse graining model must be formulated such that the conclusions we draw from it are consistent with the conclusions we would draw from a model at a finer level of detail. It has been proved that a force matching scheme defines a thermodynamically consistent coarse-grained model for an atomistic system in the variational limit. Wang et al. [ACS Cent. Sci. 5, 755 (2019)] demonstrated that the existence of such a variational limit enables the use of a supervised machine learning framework to generate a coarse-grained force field, which can then be used for simulation in the coarse-grained space. Their framework, however, requires the manual input of molecular features to machine learn the force field. In the present contribution, we build upon the advance of Wang et al. and introduce a hybrid architecture for the machine learning of coarse-grained force fields that learn their own features via a subnetwork that leverages continuous filter convolutions on a graph neural network architecture. We demonstrate that this framework succeeds at reproducing the thermodynamics for small biomolecular systems. Since the learned molecular representations are inherently transferable, the architecture presented here sets the stage for the development of machine-learned, coarse-grained force fields that are transferable across molecular systems.
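
Force matching, as described in the abstract above, fits a coarse-grained force field by minimizing the squared difference between its forces and reference forces mapped from the atomistic system. The sketch below is a deliberately small stand-in: a one-parameter harmonic model replaces the graph neural network, and the positions, forces, and function names are hypothetical.

```python
def force_matching_loss(k, positions, forces, x0):
    """Mean squared error between model forces -k (x - x0) and references."""
    residuals = [(-k * (x - x0) - f) ** 2 for x, f in zip(positions, forces)]
    return sum(residuals) / len(residuals)


def fit_spring_constant(positions, forces, x0):
    """Closed-form least-squares k for the model F(x) = -k (x - x0)."""
    numerator = sum(-(x - x0) * f for x, f in zip(positions, forces))
    denominator = sum((x - x0) ** 2 for x in positions)
    return numerator / denominator


# Hypothetical coarse-grained coordinate samples and mapped reference forces,
# generated here from a spring with k = 5 and rest length 1.
x0 = 1.0
positions = [0.9, 1.0, 1.2, 1.1]
forces = [-5.0 * (x - x0) for x in positions]

k_fit = fit_spring_constant(positions, forces, x0)
print(k_fit, force_matching_loss(k_fit, positions, forces, x0))
```

A neural network force field minimizes the same kind of loss by gradient descent over many parameters; the closed form exists here only because the model is linear in k.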

6.
J Mol Biol ; 432(13): 3898-3914, 2020 Jun 12.
Article in English | MEDLINE | ID: mdl-32330481

ABSTRACT

Natural evolution has generated an impressively diverse protein universe via duplication and recombination from a set of protein fragments that served as building blocks. The application of these concepts to the design of new proteins using subdomain-sized fragments from different folds has proven to be experimentally successful. To better understand how evolution has shaped our protein universe, we performed an all-against-all comparison of protein domains representing all naturally existing folds and identified conserved homologous protein fragments. Overall, we found more than 1000 protein fragments of various lengths among different folds through similarity network analysis. These fragments are present in very different protein environments and represent versatile building blocks for protein design. These data are available in our web server called F(old P)uzzle (fuzzle.uni-bayreuth.de), which allows users to individually filter the dataset and create customized networks for folds of interest. We believe that our results serve as an invaluable resource for structural and evolutionary biologists and as raw material for the design of custom-made proteins.


Subject(s)
Evolution, Molecular; Protein Folding; Proteins/chemistry; Computational Biology; Internet; Models, Molecular; Protein Domains/genetics; Protein Engineering/trends; Proteins/genetics; Proteins/ultrastructure; Sequence Homology, Amino Acid; Software
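
The similarity network analysis mentioned in the abstract above can be sketched as follows: score every pair of domains, keep the pairs above a cutoff as edges, and read off the connected groups. The domain names, scores, and threshold are hypothetical, and this is not the Fuzzle pipeline or its data.

```python
from collections import defaultdict

# Hypothetical pairwise similarity scores between domains (0 to 1).
similarity = {
    ("domA", "domB"): 0.90,
    ("domB", "domC"): 0.80,
    ("domC", "domD"): 0.20,
    ("domD", "domE"): 0.85,
}


def build_network(scores, threshold):
    """Return an adjacency map with an edge for each pair at or above threshold."""
    graph = defaultdict(set)
    for (a, b), score in scores.items():
        graph[a]  # touch both keys so isolated domains stay in the network
        graph[b]
        if score >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def connected_components(graph):
    """Group nodes that are linked directly or through intermediates."""
    seen = set()
    components = []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        components.append(sorted(component))
    return components


network = build_network(similarity, threshold=0.5)
clusters = connected_components(network)
print(clusters)
```

Raising or lowering the threshold splits or merges groups, which is the kind of filtering the abstract says the web server exposes for folds of interest.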