2.
J Chem Theory Comput ; 20(15): 6518-6530, 2024 Aug 13.
Article in English | MEDLINE | ID: mdl-39088306

ABSTRACT

Absolute binding free energy (ABFE) calculations with all-atom molecular dynamics (MD) have the potential to greatly reduce costs in the first stages of drug discovery. Here, we introduce BAT2, the new version of the Binding Affinity Tool (BAT.py), designed to combine full automation of ABFE calculations with high-performance MD simulations, making it a potential tool for virtual screening. We describe and test several changes and new features that were incorporated into the code, such as relative restraints between the protein and the ligand instead of using fixed dummy atoms, support for the OpenMM simulation engine, a merged approach to the application/release of restraints, support for cobinders and proteins with multiple chains, and many others. We also reduced the simulation times for each ABFE calculation, assessing the effect on the expected robustness and accuracy of the calculations.


Subject(s)
Molecular Dynamics Simulation , Proteins , Thermodynamics , Proteins/chemistry , Proteins/metabolism , Ligands , Protein Binding , Software
3.
J Phys Chem B ; 128(29): 7043-7067, 2024 Jul 25.
Article in English | MEDLINE | ID: mdl-38989715

ABSTRACT

Force fields are a key component of physics-based molecular modeling, describing the energies and forces in a molecular system as a function of the positions of the atoms and molecules involved. Here, we provide a review and scientific status report on the work of the Open Force Field (OpenFF) Initiative, which focuses on the science, infrastructure and data required to build the next generation of biomolecular force fields. We introduce the OpenFF Initiative and the related OpenFF Consortium, describe its approach to force field development and software, and discuss accomplishments to date as well as future plans. OpenFF releases both software and data under open and permissive licensing agreements to enable rapid application, validation, extension, and modification of its force fields and software tools. We discuss lessons learned to date in this new approach to force field development. We also highlight ways that other force field researchers can get involved, as well as some recent successes of outside researchers taking advantage of OpenFF tools and data.

4.
J Chem Inf Model ; 64(14): 5492-5499, 2024 Jul 22.
Article in English | MEDLINE | ID: mdl-38950281

ABSTRACT

Predicting the activities of new compounds against biophysical or phenotypic assays based on the known activities of one or a few existing compounds is a common goal in early stage drug discovery. This problem can be cast as a "few-shot learning" challenge, and prior studies have developed few-shot learning methods to classify compounds as active versus inactive. However, the ability to go beyond classification and rank compounds by expected affinity is more valuable. We describe Few-Shot Compound Activity Prediction (FS-CAP), a novel neural architecture trained on a large bioactivity data set to predict compound activities against an assay outside the training set, based on only the activities of a few known compounds against the same assay. Our model aggregates encodings generated from the known compounds and their activities to capture assay information and uses a separate encoder for the new compound whose activity is to be predicted. The new method provides encouraging results relative to traditional chemical-similarity-based techniques as well as other state-of-the-art few-shot learning methods in tests on a variety of ligand-based drug discovery settings and data sets. The code for FS-CAP is available at https://github.com/Rose-STL-Lab/FS-CAP.


Subject(s)
Drug Discovery , Ligands , Drug Discovery/methods , Machine Learning , Neural Networks, Computer
5.
J Chem Theory Comput ; 20(14): 6328-6340, 2024 Jul 23.
Article in English | MEDLINE | ID: mdl-38989926

ABSTRACT

The structure-based technologies most widely used to rank the affinities of candidate small molecule drugs for proteins range from faster but less reliable docking methods to slower but more accurate explicit solvent free energy methods. In recent years, we have advanced another technology, which is called mining minima because it "mines" out the main contributions to the chemical potentials of the free and bound molecular species by identifying and characterizing their main local energy minima. The present study provides systematic benchmarks of the accuracy and computational speed of mining minima, as implemented in the VeraChem Mining Minima Generation 2 (VM2) code, across two well-regarded protein-ligand benchmark data sets, for which there are already benchmark data for docking, free energy, and other computational methods. A core result is that VM2's accuracy approaches that of explicit solvent free energy methods at a far lower computational cost. In finer-grained analyses, we also examine the influence of various run settings, such as the treatment of crystallographic water molecules, on the accuracy, and define the costs in time and dollars of representative runs on Amazon Web Services (AWS) compute instances with various CPU and GPU combinations. We also use the benchmark data to determine the importance of VM2's correction from generalized Born to finite-difference Poisson-Boltzmann results for each energy well and find that this correction affords a remarkably consistent improvement in accuracy at a modest computational cost. The present results establish VM2 as a distinctive technology for early-stage drug discovery, which provides a strong combination of efficiency and predictivity.


Subject(s)
Proteins , Ligands , Proteins/chemistry , Proteins/metabolism , Thermodynamics , Protein Binding , Molecular Docking Simulation
6.
J Chem Theory Comput ; 20(7): 2871-2887, 2024 Apr 09.
Article in English | MEDLINE | ID: mdl-38536144

ABSTRACT

The concept that a fluid has a position-dependent free energy density appears in the literature but has not been fully developed or accepted. We set this concept on an unambiguous theoretical footing via the following strategy. First, we set forth four desiderata that should be satisfied by any definition of the position-dependent free energy density, f(R), in a system comprising only a fluid and a rigid solute: its volume integral, plus the fixed internal energy of the solute, should be the system free energy; it deviates from its bulk value, fbulk, near a solute but should asymptotically approach fbulk with increasing distance from the solute; it should go to zero where the solvent density goes to zero; and it should be well-defined in the most general case of a fluid made up of flexible molecules with an arbitrary interaction potential. Second, we use statistical thermodynamics to formulate a definition of the free energy density that satisfies these desiderata. Third, we show how any free energy density satisfying the desiderata may be used to analyze molecular processes in solution. In particular, because the spatial integral of f(R) equals the free energy of the system, it can be used to compute free energy changes that result from the rearrangement of solutes as well as the forces exerted on the solutes by the solvent. This enables the use of a thermodynamic analysis of water in protein binding sites to inform ligand design. Finally, we discuss related literature and address published concerns regarding the thermodynamic plausibility of a position-dependent free energy density. The theory presented here has applications in theoretical and computational chemistry and may be further generalizable beyond fluids, such as to solids and macromolecules.
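As a reading aid, the first three desiderata listed in the abstract above can be written compactly. This is only a sketch of the stated conditions, not the paper's own notation: the symbols $F$ (system free energy), $U_{\mathrm{solute}}$ (fixed internal energy of the rigid solute), and $\rho(\mathbf{R})$ (solvent density) are assumed here for illustration, and $f_{\mathrm{bulk}}$ corresponds to the abstract's bulk value of the free energy density.

```latex
\begin{align}
F &= U_{\mathrm{solute}} + \int f(\mathbf{R})\, d\mathbf{R}, \\
f(\mathbf{R}) &\to f_{\mathrm{bulk}} \quad \text{as the distance from the solute} \to \infty, \\
f(\mathbf{R}) &\to 0 \quad \text{where } \rho(\mathbf{R}) \to 0 .
\end{align}
```

The fourth desideratum, that the definition remain well-defined for flexible molecules with an arbitrary interaction potential, is a generality requirement and has no single-equation form.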

7.
J Chem Theory Comput ; 20(3): 1293-1305, 2024 Feb 13.
Article in English | MEDLINE | ID: mdl-38240687

ABSTRACT

We present an efficient polarizable electrostatic model, utilizing typed, atom-centered polarizabilities and the fast direct approximation, designed for efficient use in molecular dynamics (MD) simulations. The model provides two convenient approaches for assigning partial charges in the context of atomic polarizabilities. One is a generalization of RESP, called RESP-dPol, and the other, AM1-BCC-dPol, is an adaptation of the widely used AM1-BCC method. Both are designed to accurately replicate gas-phase quantum mechanical electrostatic potentials. Benchmarks of this polarizable electrostatic model against gas-phase dipole moments, molecular polarizabilities, bulk liquid densities, and static dielectric constants of organic liquids show good agreement with the reference values. Of note, the model yields markedly more accurate dielectric constants of organic liquids, relative to a matched nonpolarizable force field. MD simulations with this method, which is currently parametrized for molecules containing elements C, N, O, and H, run only about 3.6-fold slower than fixed charge force fields, while simulations with the self-consistent mutual polarization average 4.5-fold slower. Our results suggest that RESP-dPol and AM1-BCC-dPol afford improved accuracy relative to fixed charge force fields and are good starting points for developing general, affordable, and transferable polarizable force fields. The software implementing these approaches has been designed to utilize the force field fitting frameworks developed and maintained by the Open Force Field Initiative, setting the stage for further exploration of this approach to polarizable force field development.

8.
J Chem Theory Comput ; 20(1): 239-252, 2024 Jan 09.
Article in English | MEDLINE | ID: mdl-38147689

ABSTRACT

Software to more rapidly and accurately predict protein-ligand binding affinities is of high interest for early-stage drug discovery, and physics-based methods are among the most widely used technologies for this purpose. The accuracy of these methods depends critically on the accuracy of the potential functions that they use. Potential functions are typically trained against a combination of quantum chemical and experimental data. However, although binding affinities are among the most important quantities to predict, experimental binding affinities have not to date been integrated into the experimental data set used to train potential functions. In recent years, the use of host-guest complexes as simple and tractable models of binding thermodynamics has gained popularity due to their small size and simplicity, relative to protein-ligand systems. Host-guest complexes can also avoid ambiguities that arise in protein-ligand systems such as uncertain protonation states. Thus, experimental host-guest binding data are an appealing additional data type to integrate into the experimental data set used to optimize potential functions. Here, we report the extension of the Open Force Field Evaluator framework to enable the systematic calculation of host-guest binding free energies and their gradients with respect to force field parameters, coupled with the curation of 126 host-guest complexes with available experimental binding free energies. As an initial application of this novel infrastructure, we optimized generalized Born (GB) cavity radii for the OBC2 GB implicit solvent model against experimental data for 36 host-guest systems. This refitting led to a dramatic improvement in accuracy for both the training set and a separate test set with 90 additional host-guest systems. The optimized radii also showed encouraging transferability from host-guest systems to 59 protein-ligand systems. 
However, the new radii are significantly smaller than the baseline radii and lead to excessively favorable hydration free energies (HFEs). Thus, users of the OBC2 GB model currently may choose between GB cavity radii that yield more accurate binding affinities and GB cavity radii that yield more accurate HFEs. We suspect that achieving good accuracy on both will require more far-reaching adjustments to the GB model. We note that binding free-energy calculations using the OBC2 model in OpenMM gain about a 10× speedup relative to corresponding explicit solvent calculations, suggesting a future role for implicit solvent absolute binding free-energy (ABFE) calculations in virtual compound screening. This study proves the principle of using host-guest systems to train potential functions that are transferable to protein-ligand systems and provides an infrastructure that enables a range of applications.


Subject(s)
Proteins , Software , Ligands , Proteins/chemistry , Protein Binding , Solvents/chemistry , Thermodynamics , Molecular Dynamics Simulation
9.
Phys Chem Chem Phys ; 26(3): 2035-2043, 2024 Jan 17.
Article in English | MEDLINE | ID: mdl-38126539

ABSTRACT

Model systems are widely used in biology and chemistry to gain insight into more complex systems. In the field of computational chemistry, researchers use host-guest systems, relatively simple exemplars of noncovalent binding, to train and test the computational methods used in drug discovery. Indeed, host-guest systems have been developed to support the community-wide blinded SAMPL prediction challenges for over a decade. While seeking new host-guest systems for the recent SAMPL9 binding prediction challenge, which is the focus of the present PCCP Themed Collection, we identified phenothiazine as a privileged scaffold for guests of β-cyclodextrin (βCD) and its derivatives. Building on this observation, we used calorimetry and NMR spectroscopy to characterize the noncovalent association of native βCD and three methylated derivatives of βCD with five phenothiazine drugs. The strongest association observed, that of thioridazine and one of the methyl derivatives, exceeds the well-known high affinity of rimantadine with βCD. Intriguingly, however, methylation of βCD at the 3 position abolished detectable binding for all of the drugs studied. The dataset has a clear pattern of entropy-enthalpy compensation. The NMR data show that all of the drugs position at least one aromatic proton at the secondary face of the CD, and most also show evidence of deep penetration of the binding site. The results of this study were used in the SAMPL9 blinded binding affinity-prediction challenge, which are detailed in accompanying papers of the present Themed Collection. These data also open the phenothiazines and, potentially, chemically similar drugs, such as the tricyclic antidepressants, as relatively potent binders of βCD, setting the stage for future SAMPL challenge datasets and for possible applications as drug reversal agents.


Subject(s)
Cyclodextrins , Cyclodextrins/chemistry , Phenothiazines , Binding Sites , Thermodynamics
10.
ArXiv ; 2023 Nov 27.
Article in English | MEDLINE | ID: mdl-38076516

ABSTRACT

Predicting the activities of compounds against protein-based or phenotypic assays using only a few known compounds and their activities is a common task in target-free drug discovery. Existing few-shot learning approaches are limited to predicting binary labels (active/inactive). However, in real-world drug discovery, degrees of compound activity are highly relevant. We study Few-Shot Compound Activity Prediction (FS-CAP) and design a novel neural architecture to meta-learn continuous compound activities across large bioactivity datasets. Our model aggregates encodings generated from the known compounds and their activities to capture assay information. We also introduce a separate encoder for the unknown compound. We show that FS-CAP surpasses traditional similarity-based techniques as well as other state-of-the-art few-shot learning methods on a variety of target-free drug discovery settings and datasets.

11.
Chem Sci ; 14(42): 11818-11829, 2023 Nov 01.
Article in English | MEDLINE | ID: mdl-37920355

ABSTRACT

The thermodynamic parameters of host-guest binding can be used to describe, understand, and predict molecular recognition events in aqueous systems. However, interpreting binding thermodynamics remains challenging, even for these relatively simple molecules, as they are determined by both direct and solvent-mediated host-guest interactions. In this contribution, we focus on the contributions of water to binding by studying binding thermodynamics, both experimentally and computationally, for a series of nearly rigid, electrically neutral host-guest systems and report the temperature-dependent thermodynamic binding contributions ΔGb(T), ΔHb(T), ΔSb(T), and ΔCp,b. Combining isothermal titration calorimetry (ITC) measurements with molecular dynamics (MD) simulations, we provide insight into the binding forces at play for the macrocyclic hosts cucurbit[n]uril (CBn, n = 7-8) and β-cyclodextrin (β-CD) with a range of guest molecules. We find consistently negative changes in heat capacity on binding (ΔCp,b) for all systems studied herein - as well as for literature host-guest systems - indicating increased enthalpic driving forces for binding at higher temperatures. We ascribe these trends to solvation effects, as the solvent properties of water deteriorate as temperature rises. Unlike the entropic and enthalpic contributions to binding, with their differing signs and magnitudes for the classical and non-classical hydrophobic effect, heat capacity changes appear to be a unifying and more general feature of host-guest complex formation in water. This work has implications for understanding protein-ligand interactions and other complex systems in aqueous environments.

12.
J Chem Theory Comput ; 19(11): 3251-3275, 2023 Jun 13.
Article in English | MEDLINE | ID: mdl-37167319

ABSTRACT

We introduce the Open Force Field (OpenFF) 2.0.0 small molecule force field for drug-like molecules, code-named Sage, which builds upon our previous iteration, Parsley. OpenFF force fields are based on direct chemical perception, which generalizes easily to highly diverse sets of chemistries based on substructure queries. Like the previous OpenFF iterations, the Sage generation of OpenFF force fields was validated in protein-ligand simulations to be compatible with AMBER biopolymer force fields. In this work, we detail the methodology used to develop this force field, as well as the innovations and improvements introduced since the release of Parsley 1.0.0. One particularly significant feature of Sage is a set of improved Lennard-Jones (LJ) parameters retrained against condensed phase mixture data, the first refit of LJ parameters in the OpenFF small molecule force field line. Sage also includes valence parameters refit to a larger database of quantum chemical calculations than previous versions, as well as improvements in how this fitting is performed. Force field benchmarks show improvements in general metrics of performance against quantum chemistry reference data such as root-mean-square deviations (RMSD) of optimized conformer geometries, torsion fingerprint deviations (TFD), and improved relative conformer energetics (ΔΔE). We present a variety of benchmarks for these metrics against our previous force fields as well as in some cases other small molecule force fields. Sage also demonstrates improved performance in estimating physical properties, including comparison against experimental data from various thermodynamic databases for small molecule properties such as ΔHmix, ρ(x), ΔGsolv, and ΔGtrans. Additionally, we benchmarked against protein-ligand binding free energies (ΔGbind), where Sage yields results statistically similar to previous force fields. 
All the data is made publicly available along with complete details on how to reproduce the training results at https://github.com/openforcefield/openff-sage.


Subject(s)
Benchmarking , Proteins , Ligands , Proteins/chemistry , Thermodynamics , Entropy
13.
Chemistry ; 29(20): e202203958, 2023 Apr 06.
Article in English | MEDLINE | ID: mdl-36617500

ABSTRACT

Here, we present remarkable epoxyketone-based proteasome inhibitors with low nanomolar in vitro potency for blood-stage Plasmodium falciparum and low cytotoxicity for human cells. Our best compound has more than 2,000-fold greater selectivity for erythrocytic-stage P. falciparum over HepG2 and H460 cells, which is largely driven by the accommodation of the parasite proteasome for a D-amino acid in the P3 position and the preference for a difluorobenzyl group in the P1 position. We isolated the proteasome from P. falciparum cell extracts and determined that the best compound is 171-fold more potent at inhibiting the β5 subunit of P. falciparum proteasome when compared to the same subunit of the human constitutive proteasome. These compounds also significantly reduce parasitemia in a P. berghei mouse infection model and prolong survival of animals by an average of 6 days. The current epoxyketone inhibitors are ideal starting compounds for orally bioavailable anti-malarial drugs.


Subject(s)
Antimalarials , Plasmodium , Mice , Animals , Humans , Proteasome Inhibitors/chemistry , Proteasome Endopeptidase Complex/chemistry , Plasmodium falciparum , Antimalarials/pharmacology
14.
Proc Mach Learn Res ; 162: 5777-5792, 2022 Jul.
Article in English | MEDLINE | ID: mdl-36193121

ABSTRACT

Generation of drug-like molecules with high binding affinity to target proteins remains a difficult and resource-intensive task in drug discovery. Existing approaches primarily employ reinforcement learning, Markov sampling, or deep generative models guided by Gaussian processes, which can be prohibitively slow when generating molecules with high binding affinity calculated by computationally-expensive physics-based methods. We present Latent Inceptionism on Molecules (LIMO), which significantly accelerates molecule generation with an inceptionism-like technique. LIMO employs a variational autoencoder-generated latent space and property prediction by two neural networks in sequence to enable faster gradient-based reverse-optimization of molecular properties. Comprehensive experiments show that LIMO performs competitively on benchmark tasks and markedly outperforms state-of-the-art techniques on the novel task of generating drug-like compounds with high binding affinity, reaching nanomolar range against two protein targets. We corroborate these docking-based results with more accurate molecular dynamics-based calculations of absolute binding free energy and show that one of our generated drug-like compounds has a predicted K_D (a measure of binding affinity) of 6 × 10⁻¹⁴ M against the human estrogen receptor, well beyond the affinities of typical early-stage drug candidates and most FDA-approved drugs to their respective targets. Code is available at https://github.com/Rose-STL-Lab/LIMO.
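To put a dissociation constant like the one quoted above on the free energy scale used by the other abstracts in this list, the standard thermodynamic relation ΔG° = RT ln(K_D / c°) can be applied. The following is a minimal sketch and not part of the LIMO code: the function name is hypothetical, and the temperature (298.15 K) and the 1 M standard concentration are assumptions, since the abstract states neither.

```python
import math

R_KCAL = 1.987204e-3  # molar gas constant in kcal/(mol*K)


def binding_free_energy_kcal(kd_molar, temperature_k=298.15, standard_conc_molar=1.0):
    """Standard binding free energy, dG = RT ln(K_D / c0), in kcal/mol.

    temperature_k and standard_conc_molar are assumed defaults,
    not values taken from the abstract.
    """
    if kd_molar <= 0:
        raise ValueError("K_D must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar / standard_conc_molar)


if __name__ == "__main__":
    # K_D value quoted in the abstract; the result depends on the assumed temperature.
    print(f"{binding_free_energy_kcal(6e-14):.1f} kcal/mol")
```

Under these assumptions, tighter binding (smaller K_D) gives a more negative ΔG°, and a K_D equal to the standard concentration gives exactly zero.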

15.
Sci Rep ; 12(1): 13640, 2022 Aug 10.
Article in English | MEDLINE | ID: mdl-35948614

ABSTRACT

We determined the effectiveness of absolute binding free energy (ABFE) calculations to refine the selection of active compounds in virtual compound screening, a setting where the more commonly used relative binding free energy approach is not readily applicable. To do this, we conducted baseline docking calculations of structurally diverse compounds in the DUD-E database for three targets, BACE1, CDK2 and thrombin, followed by ABFE calculations for compounds with high docking scores. The docking calculations alone achieved solid enrichment of active compounds over decoys. Encouragingly, the ABFE calculations then improved on this baseline. Analysis of the results emphasizes the importance of establishing high quality ligand poses as starting points for ABFE calculations, a nontrivial goal when processing a library of diverse compounds without informative co-crystal structures. Overall, our results suggest that ABFE calculations can play a valuable role in the drug discovery process.


Subject(s)
Amyloid Precursor Protein Secretases , Aspartic Acid Endopeptidases , Entropy , Ligands , Molecular Docking Simulation , Protein Binding
16.
Nat Rev Chem ; 6(4): 287-295, 2022 Apr.
Article in English | MEDLINE | ID: mdl-35783295

ABSTRACT

One aspirational goal of computational chemistry is to predict potent and drug-like binders for any protein, such that only those that bind are synthesized. In this Roadmap, we describe the launch of Critical Assessment of Computational Hit-finding Experiments (CACHE), a public benchmarking project to compare and improve small molecule hit-finding algorithms through cycles of prediction and experimental testing. Participants will predict small molecule binders for new and biologically relevant protein targets representing different prediction scenarios. Predicted compounds will be tested rigorously in an experimental hub, and all predicted binders as well as all experimental screening data, including the chemical structures of experimentally tested compounds, will be made publicly available, and not subject to any intellectual property restrictions. The ability of a range of computational approaches to find novel binders will be evaluated, compared, and openly published. CACHE will launch 3 new benchmarking exercises every year. The outcomes will be better prediction methods, new small molecule binders for target proteins of importance for fundamental biology or drug discovery, and a major technological step towards achieving the goal of Target 2035, a global initiative to identify pharmacological probes for all human proteins.

17.
Chemistry ; 28(5): e202103438, 2022 Jan 24.
Article in English | MEDLINE | ID: mdl-34811828

ABSTRACT

Recently, we presented a strategy for packaging peptides as side-chains in high-density brush polymers. For this globular protein-like polymer (PLP) formulation, therapeutic peptides were shown to resist proteolytic degradation, enter cells efficiently and maintain biological function. In this paper, we establish the role charge plays in dictating the cellular uptake of these peptide formulations, finding that peptides with a net positive charge will enter cells when polymerized, while those formed from anionic or neutral peptides remain outside of cells. Given these findings, we explored whether cellular uptake could be selectively induced by a stimulus. In our design, a cationic peptide is appended to a sequence of charge-neutralizing anionic amino acids through stimuli-responsive cleavable linkers. As a proof-of-concept study, we tested this strategy with two different classes of stimuli, exogenous UV light and an enzyme (a matrix metalloproteinase) associated with the inflammatory response. The key finding is that these materials enter cells only when acted upon by the stimulus. This approach makes it possible to achieve delivery of the polymers, therapeutic peptides or an appended cargo into cells in response to an appropriate stimulus.


Subject(s)
Peptides , Polymers , Peptide Hydrolases , Polymerization , Proteins
18.
J Chem Theory Comput ; 17(12): 7366-7372, 2021 Dec 14.
Article in English | MEDLINE | ID: mdl-34762421

ABSTRACT

Molecular dynamics (MD) simulations of proteins are commonly used to sample from the Boltzmann distribution of conformational states, with wide-ranging applications spanning chemistry, biophysics, and drug discovery. However, MD can be inefficient at equilibrating water occupancy for buried cavities in proteins that are inaccessible to the surrounding solvent. Indeed, the time needed for water molecules to equilibrate between the bulk solvent and the binding site can be well beyond what is practical with standard MD, which typically ranges from hundreds of nanoseconds to a few microseconds. We recently introduced a hybrid Monte Carlo/MD (MC/MD) method, which speeds up the equilibration of water between buried cavities and the surrounding solvent, while sampling from the thermodynamically correct distribution of states. While the initial implementation of the MC functionality led to considerable slowing of the overall simulations, here we address this problem with a parallel MC algorithm implemented on graphical processing units. This results in speed-ups of 10-fold to 1000-fold over the original MC/MD algorithm, depending on the system and simulation parameters. The present method is available for use in the AMBER simulation software.

19.
J Chem Inf Model ; 61(11): 5362-5376, 2021 Nov 22.
Article in English | MEDLINE | ID: mdl-34652141

ABSTRACT

One of the main challenges of structure-based virtual screening (SBVS) is the incorporation of the receptor's flexibility, as its explicit representation in every docking run implies a high computational cost. Therefore, a common alternative to include the receptor's flexibility is the approach known as ensemble docking. Ensemble docking consists of using a set of receptor conformations and performing the docking assays over each of them. However, there is still no agreement on how to combine the ensemble docking results to obtain the final ligand ranking. A common choice is to use consensus strategies to aggregate the ensemble docking scores, but these strategies show only slight improvement over the single-structure approach. Here, we claim that using machine learning (ML) methodologies over the ensemble docking results could improve the predictive power of SBVS. To test this hypothesis, four proteins were selected as study cases: CDK2, FXa, EGFR, and HSP90. Protein conformational ensembles were built from crystallographic structures, whereas the evaluated compound library comprised up to three benchmarking data sets (DUD, DEKOIS 2.0, and CSAR-2012) and cocrystallized molecules. Ensemble docking results were processed through 30 repetitions of 4-fold cross-validation to train and validate two ML classifiers: logistic regression and gradient boosting trees. Our results indicate that the ML classifiers significantly outperform traditional consensus strategies and even the best performance case achieved with single-structure docking. We provide statistical evidence that supports the effectiveness of ML to improve the ensemble docking performance.


Subject(s)
Machine Learning , Proteins , Benchmarking , Ligands , Molecular Docking Simulation , Protein Binding , Protein Conformation , Proteins/metabolism
20.
J Chem Theory Comput ; 17(10): 6262-6280, 2021 Oct 12.
Article in English | MEDLINE | ID: mdl-34551262

ABSTRACT

We present a methodology for defining and optimizing a general force field for classical molecular simulations, and we describe its use to derive the Open Force Field 1.0.0 small-molecule force field, codenamed Parsley. Rather than using traditional atom typing, our approach is built on the SMIRKS-native Open Force Field (SMIRNOFF) parameter assignment formalism, which handles increases in the diversity and specificity of the force field definition without needlessly increasing the complexity of the specification. Parameters are optimized with the ForceBalance tool, based on reference quantum chemical data that include torsion potential energy profiles, optimized gas-phase structures, and vibrational frequencies. These quantum reference data are computed and are maintained with QCArchive, an open-source and freely available distributed computing and database software ecosystem. In this initial application of the method, we present essentially a full optimization of all valence parameters and report tests of the resulting force field against compounds and data types outside the training set. These tests show improvements in optimized geometries and conformational energetics and demonstrate that Parsley's accuracy for liquid properties is similar to that of other general force fields, as is accuracy on binding free energies. We find that this initial Parsley force field affords accuracy similar to that of other general force fields when used to calculate relative binding free energies spanning 199 protein-ligand systems. Additionally, the resulting infrastructure allows us to rapidly optimize an entirely new force field with minimal human intervention.


Subject(s)
Benchmarking , Petroselinum , Ecosystem , Humans , Ligands , Molecular Conformation