Results 1 - 20 of 69
1.
ArXiv ; 2024 Jun 21.
Article in English | MEDLINE | ID: mdl-38947925

ABSTRACT

The weighted ensemble (WE) method stands out as a widely used segment-based sampling technique renowned for its rigorous treatment of kinetics. The WE framework typically involves initially mapping the configuration space onto a low-dimensional collective variable (CV) space and then partitioning it into bins. The efficacy of WE simulations heavily depends on the selection of CVs and binning schemes. The recently proposed State Predictive Information Bottleneck (SPIB) method has emerged as a promising tool for automatically constructing CVs from data and guiding enhanced sampling in an iterative manner. In this work, we advance this data-driven pipeline by incorporating prior expert knowledge. Our hybrid approach combines SPIB-learned CVs to enhance sampling in explored regions with expert-based CVs to guide exploration in regions of interest, synergizing the strengths of both methods. Through benchmarking on alanine dipeptide and chignolin systems, we demonstrate that our hybrid approach effectively guides WE simulations to sample states of interest and reduces run-to-run variances. Moreover, our integration of the SPIB model also enhances the analysis and interpretation of WE simulation data by effectively identifying metastable states and pathways, and offering direct visualization of dynamics.
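
The binning step described in this abstract can be illustrated with a minimal sketch of one WE resampling pass over a one-dimensional CV. This is not the authors' implementation: the function names (`resample_bin`, `we_resample`), the split-the-heaviest and merge-the-two-lightest rules, and the fixed target count per bin are assumptions chosen for illustration. The sketch only shows that splitting and merging within bins conserves the total statistical weight.

```python
import bisect
import random


def resample_bin(walkers, target, rng):
    """Split or merge (coordinate, weight) walkers in one bin until `target` remain.

    Assumes target >= 1. The summed weight of the bin is conserved.
    """
    walkers = list(walkers)
    # Split: replace the heaviest walker by two copies carrying half its weight each.
    while len(walkers) < target:
        walkers.sort(key=lambda w: w[1])
        x, w = walkers.pop()
        walkers += [(x, w / 2), (x, w / 2)]
    # Merge: combine the two lightest walkers; the survivor's coordinate is
    # chosen with probability proportional to its weight.
    while len(walkers) > target:
        walkers.sort(key=lambda w: w[1])
        (x1, w1), (x2, w2) = walkers[0], walkers[1]
        keep = x1 if rng.random() < w1 / (w1 + w2) else x2
        walkers = [(keep, w1 + w2)] + walkers[2:]
    return walkers


def we_resample(walkers, bin_edges, target_per_bin, rng):
    """Assign walkers to bins along a 1D CV and resample each occupied bin."""
    bins = {}
    for x, w in walkers:
        idx = bisect.bisect_right(bin_edges, x)  # bin index along the CV
        bins.setdefault(idx, []).append((x, w))
    resampled = []
    for idx in sorted(bins):
        resampled.extend(resample_bin(bins[idx], target_per_bin, rng))
    return resampled
```

In a full WE simulation this pass would alternate with short dynamics segments for every walker; the choice of CV and bin edges is exactly what the abstract identifies as critical.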

2.
ArXiv ; 2024 Jun 10.
Article in English | MEDLINE | ID: mdl-38947932

ABSTRACT

Markov state models (MSMs) have proven valuable in studying dynamics of protein conformational changes via statistical analysis of molecular dynamics (MD) simulations. In MSMs, the complex configuration space is coarse-grained into conformational states, with dynamics modeled by a series of Markovian transitions among these states at discrete lag times. Constructing the Markovian model at a specific lag time necessitates defining states that circumvent significant internal energy barriers, enabling internal dynamics relaxation within the lag time. This process effectively coarse-grains time and space, integrating out rapid motions within metastable states. Thus, MSMs possess a multi-resolution nature, where the granularity of states can be adjusted according to the time-resolution, offering flexibility in capturing system dynamics. This work introduces a continuous embedding approach for molecular conformations using the state predictive information bottleneck (SPIB), a framework that unifies dimensionality reduction and state space partitioning via a continuous, machine learned basis set. Without explicit optimization of the VAMP-based scores, SPIB demonstrates state-of-the-art performance in identifying slow dynamical processes and constructing predictive multi-resolution Markovian models. Through applications to well-validated mini-proteins, SPIB showcases unique advantages compared to competing methods. It autonomously and self-consistently adjusts the number of metastable states based on specified minimal time resolution, eliminating the need for manual tuning. While maintaining efficacy in dynamical properties, SPIB excels in accurately distinguishing metastable states and capturing numerous well-populated macrostates. This contrasts with existing VAMP-based methods, which often emphasize slow dynamics at the expense of incorporating numerous sparsely populated states. 
Furthermore, SPIB's ability to learn a low-dimensional continuous embedding of the underlying MSMs enhances the interpretation of dynamic pathways. With these benefits, we propose SPIB as an easy-to-implement methodology for end-to-end MSM construction.

3.
J Chem Theory Comput ; 20(14): 6341-6349, 2024 Jul 23.
Article in English | MEDLINE | ID: mdl-38991145

ABSTRACT

Understanding drug residence times in target proteins is key to improving drug efficacy and understanding target recognition in biochemistry. While drug residence time is just as important as binding affinity, atomic-level understanding of drug residence times through molecular dynamics (MD) simulations has been difficult primarily due to the extremely long time scales. Recent advances in rare event sampling have allowed us to reach these time scales, yet predicting protein-ligand residence times remains a significant challenge. Here we present a semi-automated protocol to calculate the ligand residence times across 12 orders of magnitude of time scales. In our proposed framework, we integrate a deep learning-based method, the state predictive information bottleneck (SPIB), to learn an approximate reaction coordinate (RC) and use it to guide the enhanced sampling method metadynamics. We demonstrate the performance of our algorithm by applying it to six different protein-ligand complexes with available benchmark residence times, including the dissociation of the widely studied anticancer drug Imatinib (Gleevec) from both wild-type Abl kinase and drug-resistant mutants. We show how our protocol can recover quantitatively accurate residence times, potentially opening avenues for deeper insights into drug development possibilities and ligand recognition mechanisms.


Subject(s)
Molecular Dynamics Simulation; Proteins; Ligands; Proteins/chemistry; Proteins/metabolism; Imatinib Mesylate/chemistry; Algorithms; Protein Binding
4.
J Chem Theory Comput ; 20(12): 5352-5367, 2024 Jun 25.
Article in English | MEDLINE | ID: mdl-38859575

ABSTRACT

Markov state models (MSMs) have proven valuable in studying the dynamics of protein conformational changes via statistical analysis of molecular dynamics simulations. In MSMs, the complex configuration space is coarse-grained into conformational states, with dynamics modeled by a series of Markovian transitions among these states at discrete lag times. Constructing the Markovian model at a specific lag time necessitates defining states that circumvent significant internal energy barriers, enabling internal dynamics relaxation within the lag time. This process effectively coarse-grains time and space, integrating out rapid motions within metastable states. Thus, MSMs possess a multiresolution nature, where the granularity of states can be adjusted according to the time-resolution, offering flexibility in capturing system dynamics. This work introduces a continuous embedding approach for molecular conformations using the state predictive information bottleneck (SPIB), a framework that unifies dimensionality reduction and state space partitioning via a continuous, machine learned basis set. Without explicit optimization of the VAMP-based scores, SPIB demonstrates state-of-the-art performance in identifying slow dynamical processes and constructing predictive multiresolution Markovian models. Through applications to well-validated mini-proteins, SPIB showcases unique advantages compared to competing methods. It autonomously and self-consistently adjusts the number of metastable states based on a specified minimal time resolution, eliminating the need for manual tuning. While maintaining efficacy in dynamical properties, SPIB excels in accurately distinguishing metastable states and capturing numerous well-populated macrostates. This contrasts with existing VAMP-based methods, which often emphasize slow dynamics at the expense of incorporating numerous sparsely populated states. 
Furthermore, SPIB's ability to learn a low-dimensional continuous embedding of the underlying MSMs enhances the interpretation of dynamic pathways. With these benefits, we propose SPIB as an easy-to-implement methodology for end-to-end MSM construction.
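
The generic MSM construction that this abstract builds on (discrete states, transitions counted at a fixed lag time) can be sketched as follows. This is a plain count-based estimator, not SPIB: the function names and the use of a simple row-normalized count matrix are assumptions for illustration, and the state assignment `dtraj` is taken as given rather than learned.

```python
import numpy as np


def estimate_transition_matrix(dtraj, n_states, lag):
    """Row-normalized transition counts at lag time `lag` (in frames)."""
    dtraj = np.asarray(dtraj)
    counts = np.zeros((n_states, n_states))
    # Count every observed pair (state at t, state at t + lag).
    np.add.at(counts, (dtraj[:-lag], dtraj[lag:]), 1)
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows for states with no outgoing counts are left as zeros.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)


def implied_timescales(T, lag):
    """Implied timescales t_i = -lag / ln|lambda_i| for the non-stationary eigenvalues."""
    eigvals = np.sort(np.real(np.linalg.eigvals(T)))[::-1]
    return -lag / np.log(np.abs(eigvals[1:]))
```

Repeating the estimate for several lag times and checking that the implied timescales level off is a common way to judge whether a given state definition is approximately Markovian at that time resolution.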


Subject(s)
Markov Chains; Molecular Dynamics Simulation; Proteins/chemistry; Protein Conformation
5.
Proc Natl Acad Sci U S A ; 121(23): e2408742121, 2024 Jun 04.
Article in English | MEDLINE | ID: mdl-38809708
6.
ArXiv ; 2024 Jul 04.
Article in English | MEDLINE | ID: mdl-38659642

ABSTRACT

Small molecule drug design hinges on obtaining co-crystallized ligand-protein structures. Despite AlphaFold2's strides in protein native structure prediction, its focus on apo structures overlooks ligands and associated holo structures. Moreover, designing selective drugs often benefits from the targeting of diverse metastable conformations. Therefore, direct application of AlphaFold2 models in virtual screening and drug discovery remains tentative. Here, we demonstrate an AlphaFold2 based framework combined with all-atom enhanced sampling molecular dynamics and induced fit docking, named AF2RAVE-Glide, to conduct computational model based small molecule binding of metastable protein kinase conformations, initiated from protein sequences. We demonstrate the AF2RAVE-Glide workflow on three different protein kinases and their type I and II inhibitors, with special emphasis on binding of known type II kinase inhibitors which target the metastable classical DFG-out state. These states are not easy to sample from AlphaFold2. Here we demonstrate how with AF2RAVE these metastable conformations can be sampled for different kinases with high enough accuracy to enable subsequent docking of known type II kinase inhibitors with more than 50% success rates across docking calculations. We believe the protocol should be deployable for other kinases and more proteins generally.

7.
bioRxiv ; 2024 Apr 20.
Article in English | MEDLINE | ID: mdl-38659748

ABSTRACT

Understanding drug residence times in target proteins is key to improving drug efficacy and understanding target recognition in biochemistry. While drug residence time is just as important as binding affinity, atomic-level understanding of drug residence times through molecular dynamics (MD) simulations has been difficult primarily due to the extremely long timescales. Recent advances in rare event sampling have allowed us to reach these timescales, yet predicting protein-ligand residence times remains a significant challenge. Here we present a semi-automated protocol to calculate the ligand residence times across 12 orders of magnitude of timescales. In our proposed framework, we integrate a deep learning-based method, the state predictive information bottleneck (SPIB), to learn an approximate reaction coordinate (RC) and use it to guide the enhanced sampling method metadynamics. We demonstrate the performance of our algorithm by applying it to six different protein-ligand complexes with available benchmark residence times, including the dissociation of the widely studied anti-cancer drug Imatinib (Gleevec) from both wild-type Abl kinase and drug-resistant mutants. We show how our protocol can recover quantitatively accurate residence times, potentially opening avenues for deeper insights into drug development possibilities and ligand recognition mechanisms.

8.
J Chem Theory Comput ; 20(9): 3503-3513, 2024 May 14.
Article in English | MEDLINE | ID: mdl-38649368

ABSTRACT

While representation learning has been central to the rise of machine learning and artificial intelligence, a key problem remains in making the learned representations meaningful. For this, the typical approach is to regularize the learned representation through prior probability distributions. However, such priors are usually unavailable or are ad hoc. To deal with this, recent efforts have shifted toward leveraging the insights from physical principles to guide the learning process. In this spirit, we propose a purely dynamics-constrained representation learning framework. Instead of relying on predefined probabilities, we restrict the latent representation to follow overdamped Langevin dynamics with a learnable transition density, a prior driven by statistical mechanics. We show that this is a more natural constraint for representation learning in stochastic dynamical systems, with the crucial ability to uniquely identify the ground truth representation. We validate our framework for different systems including a real-world fluorescent DNA movie data set. We show that our algorithm can uniquely identify orthogonal, isometric, and meaningful latent representations.
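
To make the constraint named in this abstract concrete, the following is a minimal sketch of overdamped Langevin dynamics in one dimension, integrated with the Euler-Maruyama scheme, together with the Gaussian one-step transition density that this discretization implies. Everything specific here is an assumption for illustration and not taken from the paper: the double-well potential, the fixed diffusion constant, and the function names. In the paper's framework the transition density is learned rather than fixed.

```python
import numpy as np


def force(x):
    """Force -dU/dx for the illustrative double-well potential U(x) = (x**2 - 1)**2."""
    return -4.0 * x * (x**2 - 1.0)


def simulate_overdamped_langevin(x0, n_steps, dt, diffusion=1.0, kT=1.0, seed=None):
    """Euler-Maruyama: x_{t+dt} = x_t + (D / kT) * F(x_t) * dt + sqrt(2 * D * dt) * xi."""
    rng = np.random.default_rng(seed)
    traj = np.empty(n_steps + 1)
    traj[0] = x0
    noise_scale = np.sqrt(2.0 * diffusion * dt)
    for t in range(n_steps):
        drift = (diffusion / kT) * force(traj[t]) * dt
        traj[t + 1] = traj[t] + drift + noise_scale * rng.standard_normal()
    return traj


def log_transition_density(x_next, x, dt, diffusion=1.0, kT=1.0):
    """Log of the Gaussian one-step density p(x_next | x) implied by the update above."""
    mean = x + (diffusion / kT) * force(x) * dt
    var = 2.0 * diffusion * dt
    return -0.5 * np.log(2.0 * np.pi * var) - (x_next - mean) ** 2 / (2.0 * var)
```

Summing `log_transition_density` over consecutive frames of a latent trajectory gives a log-likelihood under the Langevin prior, which is the kind of quantity a dynamics-constrained objective could penalize or maximize.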

9.
J Chem Inf Model ; 64(7): 2637-2644, 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-38453912

ABSTRACT

Identifying and discovering druggable protein binding sites is an important early step in computer-aided drug discovery, but it remains a difficult task where most campaigns rely on a priori knowledge of binding sites from experiments. Here, we present a binding site prediction method called Graph Attention Site Prediction (GrASP) and re-evaluate assumptions in nearly every step in the site prediction workflow from data set preparation to model evaluation. GrASP is able to achieve state-of-the-art performance at recovering binding sites in PDB structures while maintaining a high degree of precision which will minimize wasted computation in downstream tasks such as docking and free energy perturbation.


Subject(s)
Anti-HIV Agents; Binding Sites; Drug Discovery; Neural Networks, Computer; Hand Strength
10.
J Phys Chem B ; 128(12): 3037-3045, 2024 Mar 28.
Article in English | MEDLINE | ID: mdl-38502931

ABSTRACT

In this study, we present a graph neural network (GNN)-based learning approach using an autoencoder setup to derive low-dimensional variables from features observed in experimental crystal structures. These variables are then biased in enhanced sampling to observe state-to-state transitions and reliable thermodynamic weights. In our approach, we used simple convolution and pooling methods. To verify the effectiveness of our protocol, we examined the nucleation of various allotropes and polymorphs of iron and glycine in their molten states. Our graph latent variables, when biased in well-tempered metadynamics, consistently show transitions between states and achieve accurate thermodynamic rankings in agreement with experiments, both of which are indicators of dependable sampling. This underscores the strength and promise of our GNN variables for improved sampling. The protocol shown here should be applicable for other systems and other sampling methods.

11.
Annu Rev Phys Chem ; 75(1): 347-370, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38382572

ABSTRACT

Molecular dynamics (MD) enables the study of physical systems with excellent spatiotemporal resolution but suffers from severe timescale limitations. To address this, enhanced sampling methods have been developed to improve the exploration of configurational space. However, implementing these methods is challenging and requires domain expertise. In recent years, integration of machine learning (ML) techniques into different domains has shown promise, prompting their adoption in enhanced sampling as well. Although ML is often employed in various fields primarily due to its data-driven nature, its integration with enhanced sampling is more natural with many common underlying synergies. This review explores the merging of ML and enhanced MD by presenting different shared viewpoints. It offers a comprehensive overview of this rapidly evolving field, which can be difficult to stay updated on. We highlight successful strategies such as dimensionality reduction, reinforcement learning, and flow-based methods. Finally, we discuss open problems at the exciting ML-enhanced MD interface.

12.
J Phys Chem B ; 128(3): 755-767, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38205806

ABSTRACT

Ligand unbinding is mediated by its free energy change, which has intertwined contributions from both energy and entropy. It is important, but not easy, to quantify their individual contributions to the free energy profile. We model hydrophobic ligand unbinding for two systems, a methane particle and a C60 fullerene, both unbinding from hydrophobic pockets in all-atom water. Using a modified deep learning framework, we learn a thermodynamically optimized reaction coordinate to describe the hydrophobic ligand dissociation for both systems. Interpretation of these reaction coordinates reveals the roles of entropic and enthalpic forces as the ligand and pocket sizes change. In both cases, we observe that the free-energy barrier to unbinding is dominated by entropy considerations. Furthermore, the process of methane unbinding is driven by methane solvation, while fullerene unbinding is driven first by pocket wetting and then fullerene wetting. For both solutes, the direct importance of the distance from the binding pocket to the learned reaction coordinate is present, but low. Our framework and subsequent feature importance analysis thus give useful thermodynamic insight into hydrophobic ligand dissociation problems that are otherwise difficult to glean.

13.
J Phys Chem B ; 128(4): 1012-1021, 2024 Feb 01.
Article in English | MEDLINE | ID: mdl-38262436

ABSTRACT

Even though nucleation is ubiquitous in different science and engineering problems, investigating nucleation is extremely difficult due to the complicated ranges of time and length scales involved. In this work, we simulate NaCl nucleation in both molten and aqueous environments using enhanced sampling of all-atom molecular dynamics with deep-learning-based estimation of reaction coordinates. By incorporating various structural order parameters and learning the reaction coordinate as a function thereof, we achieve significantly improved sampling relative to traditional ad hoc descriptions of what drives nucleation, particularly in an aqueous medium. Our results reveal a one-step nucleation mechanism in both environments, with reaction coordinate analysis highlighting the importance of local ion density in distinguishing solid and liquid states. However, although fluctuations in the local ion density are necessary to drive nucleation, they are not sufficient. Our analysis shows that near the transition states, descriptors such as enthalpy and local structure become crucial. Our protocol proposed here enables robust nucleation analysis and phase sampling and could offer insights into nucleation mechanisms for generic small molecules in different environments.

14.
J Chem Inf Model ; 64(7): 2789-2797, 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-37981824

ABSTRACT

Kinases compose one of the largest fractions of the human proteome, and their misfunction is implicated in many diseases, in particular, cancers. The ubiquitousness and structural similarities of kinases make specific and effective drug design difficult. In particular, conformational variability due to the evolutionarily conserved Asp-Phe-Gly (DFG) motif adopting in and out conformations and the relative stabilities thereof are key in structure-based drug design for ATP competitive drugs. These relative conformational stabilities are extremely sensitive to small changes in sequence and provide an important problem for sampling method development. Since the invention of AlphaFold2, the world of structure-based drug design has noticeably changed. In spite of it being limited to crystal-like structure prediction, several methods have also leveraged its underlying architecture to improve dynamics and enhanced sampling of conformational ensembles, including AlphaFold2-RAVE. Here, we extend AlphaFold2-RAVE and apply it to a set of kinases: the wild type DDR1 sequence and three mutants with single point mutations that are known to behave drastically differently. We show that AlphaFold2-RAVE is able to efficiently recover the changes in relative stability using transferable learned order parameters and potentials, thereby supplementing AlphaFold2 as a tool for exploration of Boltzmann-weighted protein conformations (Meller, A.; Bhakat, S.; Solieva, S.; Bowman, G. R. Accelerating Cryptic Pocket Discovery Using AlphaFold. J. Chem. Theory Comput. 2023, 19, 4355-4363).


Subject(s)
Oligopeptides; Protein Kinase Inhibitors; Humans; Models, Molecular; Protein Kinase Inhibitors/chemistry; Protein Conformation; Oligopeptides/chemistry
15.
J Chem Theory Comput ; 19(24): 9093-9101, 2023 Dec 26.
Article in English | MEDLINE | ID: mdl-38084039

ABSTRACT

Understanding nucleation from aqueous solutions is of fundamental importance in a multitude of fields, ranging from materials science to biophysics. The complex solvent-mediated interactions in aqueous solutions hamper the development of a simple physical picture, elucidating the roles of different interactions in nucleation processes. In this work, we make use of three complementary techniques to disentangle the role played by short- and long-range interactions in solvent-mediated nucleation. Specifically, the first approach we utilize is the local molecular field (LMF) theory to renormalize long-range Coulomb electrostatics. Second, we use well-tempered metadynamics to speed up rare events governed by short-range interactions. Third, the deep learning-based State Predictive Information Bottleneck approach is employed in analyzing the reaction coordinate of the nucleation processes obtained from the LMF treatment coupled with well-tempered metadynamics. We find that the two-step nucleation mechanism can largely be captured by the short-range interactions, while the long-range interactions further contribute to the stability of the primary crystal state under ambient conditions. Furthermore, by analyzing the reaction coordinate obtained from the combined LMF-metadynamics treatment, we discern the fluctuations on different time scales, highlighting the need for long-range interactions when accounting for metastability.

16.
ArXiv ; 2023 Sep 07.
Article in English | MEDLINE | ID: mdl-37731662

ABSTRACT

Kinases compose one of the largest fractions of the human proteome, and their misfunction is implicated in many diseases, in particular cancers. The ubiquitousness and structural similarities of kinases make specific and effective drug design difficult. In particular, conformational variability due to the evolutionarily conserved DFG motif adopting in and out conformations and the relative stabilities thereof are key in structure-based drug design for ATP competitive drugs. These relative conformational stabilities are extremely sensitive to small changes in sequence, and provide an important problem for sampling method development. Since the invention of AlphaFold2, the world of structure-based drug design has noticeably changed. In spite of it being limited to crystal-like structure prediction, several methods have also leveraged its underlying architecture to improve dynamics and enhanced sampling of conformational ensembles, including AlphaFold2-RAVE. Here, we extend AlphaFold2-RAVE and apply it to a set of kinases: the wild type DDR1 sequence and three mutants with single point mutations that are known to behave drastically differently. We show that AlphaFold2-RAVE is able to efficiently recover the changes in relative stability using transferable learnt order parameters and potentials, thereby supplementing AlphaFold2 as a tool for exploration of Boltzmann-weighted protein conformations.

17.
bioRxiv ; 2023 Jul 28.
Article in English | MEDLINE | ID: mdl-37546775

ABSTRACT

Identifying and discovering druggable protein binding sites is an important early step in computer-aided drug discovery but remains a difficult task where most campaigns rely on a priori knowledge of binding sites from experiments. Here we present a novel binding site prediction method called Graph Attention Site Prediction (GrASP) and re-evaluate assumptions in nearly every step in the site prediction workflow from dataset preparation to model evaluation. GrASP is able to achieve state-of-the-art performance at recovering binding sites in PDB structures while maintaining a high degree of precision which will minimize wasted computation in downstream tasks such as docking and free energy perturbation.

18.
J Chem Theory Comput ; 19(14): 4351-4354, 2023 Jul 25.
Article in English | MEDLINE | ID: mdl-37171364

ABSTRACT

While AlphaFold2 is rapidly being adopted as a new standard in protein structure predictions, it is limited to single structures. This can be insufficient for the inherently dynamic world of biomolecules. In this Letter, we propose AlphaFold2-RAVE, an efficient protocol for obtaining Boltzmann-ranked ensembles from sequence. The method uses structural outputs from AlphaFold2 as initializations for artificial intelligence-augmented molecular dynamics. We release the method as an open-source code and demonstrate results on different proteins.

19.
Proc Natl Acad Sci U S A ; 120(7): e2216099120, 2023 Feb 14.
Article in English | MEDLINE | ID: mdl-36757888

ABSTRACT

Crystal nucleation is relevant across the domains of fundamental and applied sciences. However, in many cases, its mechanism remains unclear due to a lack of temporal or spatial resolution. To gain insights into the molecular details of nucleation, some form of molecular dynamics simulations is typically performed; these simulations, in turn, are limited by their ability to run long enough to sample the nucleation event thoroughly. To overcome the timescale limits in typical molecular dynamics simulations in a manner free of prior human bias, here, we employ the machine learning-augmented molecular dynamics framework "reweighted autoencoded variational Bayes for enhanced sampling (RAVE)." We study two molecular systems (urea and glycine) in explicit all-atom water, due to their enrichment in polymorphic structures and common utility in commercial applications. From our simulations, we observe multiple back-and-forth nucleation events of different polymorphs from homogeneous solution; from these trajectories, we calculate the relative ranking of finite-sized polymorph crystals embedded in solution, in terms of the free-energy difference between the finite-sized crystal polymorph and the original solution state. We further observe that the obtained reaction coordinates and transitions are highly nonclassical.

20.
Nat Commun ; 13(1): 7231, 2022 Nov 24.
Article in English | MEDLINE | ID: mdl-36433982

ABSTRACT

Recurrent neural networks have seen widespread use in modeling dynamical systems in varied domains such as weather prediction, text prediction and several others. Often one wishes to supplement the experimentally observed dynamics with prior knowledge or intuition about the system. While the recurrent nature of these networks allows them to model arbitrarily long memories in the time series used in training, it makes it harder to impose prior knowledge or intuition through generic constraints. In this work, we present a path sampling approach based on the principle of Maximum Caliber that allows us to include generic thermodynamic or kinetic constraints into recurrent neural networks. We show the method here for a widely used type of recurrent neural network known as long short-term memory network in the context of supplementing time series collected from different application domains. These include classical Molecular Dynamics of a protein and Monte Carlo simulations of an open quantum system continuously losing photons to the environment and displaying Rabi oscillations. Our method can be easily generalized to other generative artificial intelligence models and to generic time series in different areas of physical and social sciences, where one wishes to supplement limited data with intuition- or theory-based corrections.


Subject(s)
Algorithms; Artificial Intelligence; Neural Networks, Computer; Physics; Monte Carlo Method