Search | VHL Regional Portal

SAMPL7 physical property prediction from EC-RISM theory.

Tielker, Nicolas; Güssregen, Stefan; Kast, Stefan M.

J Comput Aided Mol Des ; 35(8): 933-941, 2021 08.

Article in English | MEDLINE | ID: mdl-34278539

ABSTRACT

Inspired by the successful application of the embedded cluster reference interaction site model (EC-RISM), a combination of quantum-mechanical calculations with three-dimensional RISM theory, to predict Gibbs energies of species in solution within the SAMPL6.1 (acidity constants, pKa) and SAMPL6.2 (octanol-water partition coefficients, log P) challenges, the methodology was applied to the recent SAMPL7 physical property challenge on aqueous pKa and octanol-water log P values. Not part of the challenge but provided by the organizers, we also computed distribution coefficients log D7.4 from predicted pKa and log P data. While macroscopic pKa predictions compared very favorably with experimental data (root mean square error, RMSE 0.72 pK units), the performance of the log P model (RMSE 1.84) fell behind expectations from the SAMPL6.2 challenge, leading to reasonable log D7.4 predictions (RMSE 1.69) from combining the independent calculations. In the post-submission phase, conformations generated by different methodology yielded results that did not significantly improve the original predictions. While overall satisfactory compared to previous log D challenges, the predicted data suggest that further effort is needed for optimizing the robustness of the partition coefficient model within EC-RISM calculations and for shaping the agreement between experimental conditions and the corresponding model description.
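The abstract above mentions combining predicted pKa and log P values into log D7.4. The following is a minimal sketch of the textbook relation for a monoprotic compound, assuming that only the neutral species partitions into octanol. The function name `log_d` and the example numbers are illustrative and are not taken from the paper, which may treat multiple protonation states differently.

```python
import math

def log_d(log_p, pka, ph=7.4, acid=True):
    """Distribution coefficient of a monoprotic compound at a given pH.

    Assumption: only the neutral form enters the organic phase.
    For an acid the ionized fraction grows with (pH - pKa);
    for a base it grows with (pKa - pH).
    """
    exponent = (ph - pka) if acid else (pka - ph)
    return log_p - math.log10(1.0 + 10.0 ** exponent)

# Hypothetical acid with log P = 2.0 and pKa = 4.4 at pH 7.4
example = log_d(log_p=2.0, pka=4.4, ph=7.4, acid=True)
print(round(example, 3))
```

Under this relation an error in either pKa or log P propagates directly into log D, which is one simple way to see why the combined RMSE depends on both models.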

Subject(s)

1-Octanol/chemistry , Computer Simulation , Models, Chemical , Quantum Theory , Thermodynamics , Water/chemistry , Linear Models , Physical Phenomena , Solubility

Evaluation of log P, pK_a, and log D predictions from the SAMPL7 blind challenge.

Bergazin, Teresa Danielle; Tielker, Nicolas; Zhang, Yingying; Mao, Junjun; Gunner, M R; Francisco, Karol; Ballatore, Carlo; Kast, Stefan M; Mobley, David L.

J Comput Aided Mol Des ; 35(7): 771-802, 2021 07.

Article in English | MEDLINE | ID: mdl-34169394

ABSTRACT

The Statistical Assessment of Modeling of Proteins and Ligands (SAMPL) challenges focus the computational modeling community on areas in need of improvement for rational drug design. The SAMPL7 physical property challenge dealt with prediction of octanol-water partition coefficients and pKa for 22 compounds. The dataset was composed of a series of N-acylsulfonamides and related bioisosteres. 17 research groups participated in the log P challenge, submitting 33 blind submissions total. For the pKa challenge, 7 different groups participated, submitting 9 blind submissions in total. Overall, the accuracy of octanol-water log P predictions in the SAMPL7 challenge was lower than octanol-water log P predictions in SAMPL6, likely due to a more diverse dataset. Compared to the SAMPL6 pKa challenge, accuracy remains unchanged in SAMPL7. Interestingly, here, though macroscopic pKa values were often predicted with reasonable accuracy, there was dramatically more disagreement among participants as to which microscopic transitions produced these values (with methods often disagreeing even as to the sign of the free energy change associated with certain transitions), indicating far more work needs to be done on pKa prediction methods.

Subject(s)

Computational Biology/statistics & numerical data , Computer Simulation/statistics & numerical data , Software/statistics & numerical data , Sulfonamides/chemistry , Drug Design/statistics & numerical data , Entropy , Humans , Ligands , Models, Chemical , Models, Statistical , Octanols/chemistry , Quantum Theory , Solubility , Solvents/chemistry , Sulfonamides/therapeutic use , Thermodynamics , Water/chemistry

Quantum-mechanical property prediction of solvated drug molecules: what have we learned from a decade of SAMPL blind prediction challenges?

Tielker, Nicolas; Eberlein, Lukas; Hessler, Gerhard; Schmidt, K Friedemann; Güssregen, Stefan; Kast, Stefan M.

J Comput Aided Mol Des ; 35(4): 453-472, 2021 04.

Article in English | MEDLINE | ID: mdl-33079358

ABSTRACT

Joint academic-industrial projects supporting drug discovery are frequently pursued to deploy and benchmark cutting-edge methodical developments from academia in a real-world industrial environment at different scales. The dimensionality of tasks ranges from small molecule physicochemical property assessment over protein-ligand interaction up to statistical analyses of biological data. This way, method development and usability both benefit from insights gained at both ends, when predictiveness and readiness of novel approaches are confirmed, but the pharmaceutical drug makers get early access to novel tools for the quality of drug products and benefit of patients. Quantum-mechanical and simulation methods particularly fall into this group of methods, as they require skills and expense in their development but also significant resources in their application, thus are comparatively slowly dripping into the realm of industrial use. Nevertheless, these physics-based methods are becoming more and more useful. Starting with a general overview of these and in particular quantum-mechanical methods for drug discovery we review a decade-long and ongoing collaboration between Sanofi and the Kast group focused on the application of the embedded cluster reference interaction site model (EC-RISM), a solvation model for quantum chemistry, to study small molecule chemistry in the context of joint participation in several SAMPL (Statistical Assessment of Modeling of Proteins and Ligands) blind prediction challenges. Starting with early application to tautomer equilibria in water (SAMPL2) the methodology was further developed to allow for challenge contributions related to predictions of distribution coefficients (SAMPL5) and acidity constants (SAMPL6) over the years. Particular emphasis is put on a frequently overlooked aspect of measuring the quality of models, namely the retrospective analysis of earlier datasets and predictions in light of more recent and advanced developments. 
We therefore demonstrate the performance of the current methodical state of the art as developed and optimized for the SAMPL6 pKa and octanol-water log P challenges when re-applied to the earlier SAMPL5 cyclohexane-water log D and SAMPL2 tautomer equilibria datasets. Systematic improvement is not consistently found throughout despite the similarity of the problem class, i.e. protonation reactions and phase distribution. Hence, it is possible to learn about hidden bias in model assessment, as results derived from more elaborate methods do not necessarily improve quantitative agreement. This indicates the role of chance or coincidence for model development on the one hand which allows for the identification of systematic error and opportunities toward improvement and reveals possible sources of experimental uncertainty on the other. These insights are particularly useful for further academia-industry collaborations, as both partners are then enabled to optimize both the computational and experimental settings for data generation.

Subject(s)

Drug Discovery , Pharmaceutical Preparations/chemistry , Quantum Theory , Computer Simulation , Cyclohexanes/chemistry , Ligands , Models, Chemical , Solubility , Solvents/chemistry , Thermodynamics , Water/chemistry

The SAMPL6 challenge on predicting octanol-water partition coefficients from EC-RISM theory.

Tielker, Nicolas; Tomazic, Daniel; Eberlein, Lukas; Güssregen, Stefan; Kast, Stefan M.

J Comput Aided Mol Des ; 34(4): 453-461, 2020 04.

Article in English | MEDLINE | ID: mdl-31981015

ABSTRACT

Results are reported for octanol-water partition coefficients (log P) of the neutral states of drug-like molecules provided during the SAMPL6 (Statistical Assessment of Modeling of Proteins and Ligands) blind prediction challenge from applying the "embedded cluster reference interaction site model" (EC-RISM) as a solvation model for quantum-chemical calculations. Following the strategy outlined during earlier SAMPL challenges we first train 1- and 2-parameter water-free ("dry") and water-saturated ("wet") models for n-octanol solvation Gibbs energies with respect to experimental values from the "Minnesota Solvation Database" (MNSOL), yielding a root mean square error (RMSE) of 1.5 kcal mol-1 for the best-performing 2-parameter wet model, while the optimal water model developed for the pKa part of the SAMPL6 challenge is kept unchanged (RMSE 1.6 kcal mol-1 for neutral compounds from a model trained on both neutral and ionic species). Applying these models to the blind prediction set yields a log P RMSE of less than 0.5 for our best model (2-parameters, wet). Further analysis of our results reveals that a single compound is responsible for most of the error, SM15, without which the RMSE drops to 0.2. Since this is the only compound in the challenge dataset with a hydroxyl group we investigate other alcohols for which Gibbs energy of solvation data for both water and n-octanol are available in the MNSOL database to demonstrate a systematic cause of error and to discuss strategies for improvement.
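The abstract above derives log P from solvation Gibbs energies in water and n-octanol. As a minimal sketch of that conversion, assuming the standard thermodynamic relation between the transfer Gibbs energy and the partition coefficient at 298.15 K, with energies in kcal mol-1: the function name and the example energies are hypothetical and are not values from the paper.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1

def log_p_from_solvation(dg_water, dg_octanol, temperature=298.15):
    """log P (octanol/water) from solvation Gibbs energies in kcal/mol.

    A more negative octanol solvation energy favors the octanol phase
    and therefore gives a positive log P.
    """
    rt_ln10 = R_KCAL * temperature * math.log(10.0)
    return (dg_water - dg_octanol) / rt_ln10

# Hypothetical solute: -5.0 kcal/mol in water, -7.0 kcal/mol in octanol
print(round(log_p_from_solvation(-5.0, -7.0), 2))
```

At 298.15 K one log P unit corresponds to roughly 1.36 kcal mol-1 of transfer Gibbs energy, which gives a sense of scale when comparing energy errors with log P errors.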

Subject(s)

1-Octanol/chemistry , Octanols/chemistry , Thermodynamics , Water/chemistry , Cyclohexanes/chemistry , Ligands , Models, Chemical , Quantum Theory

Pressure-dependent electronic structure calculations using integral equation-based solvation models.

Pongratz, Tim; Kibies, Patrick; Eberlein, Lukas; Tielker, Nicolas; Hölzl, Christoph; Imoto, Sho; Beck Erlach, Markus; Kurrmann, Simon; Schummel, Paul Hendrik; Hofmann, Martin; Reiser, Oliver; Winter, Roland; Kremer, Werner; Kalbitzer, Hans Robert; Marx, Dominik; Horinek, Dominik; Kast, Stefan M.

Biophys Chem ; 257: 106258, 2020 02.

Article in English | MEDLINE | ID: mdl-31881504

ABSTRACT

Recent methodological progress in quantum-chemical calculations using the "embedded cluster reference interaction site model" (EC-RISM) integral equation theory is reviewed in the context of applying it as a solvation model for calculating pressure-dependent thermodynamic and spectroscopic properties of molecules immersed in water. The methodology is based on self-consistent calculations of electronic and solvation structure around dissolved molecules where pressure enters the equations via an appropriately chosen solvent response function and the pure solvent density. Besides specification of a dispersion-repulsion force field for solute-solvent interactions, the EC-RISM approach derives the electrostatic interaction contributions directly from the wave function. We further develop and apply the method to a variety of benchmark cases for which computational or experimental reference data are either available in the literature or are generated specifically for this purpose in this work. Starting with an enhancement to predict hydration free energies at non-ambient pressures, which is the basis for pressure-dependent molecular population estimation, we demonstrate the performance on the calculation of the autoionization constant of water. Spectroscopic problems are addressed by studying the biologically relevant small osmolyte TMAO (trimethylamine N-oxide). Pressure-dependent NMR shifts are predicted and compared to experiments taking into account proper computational referencing methods that extend earlier work. The experimentally observed IR blue-shifts of certain vibrational bands of TMAO as well as of the cyanide anion are reproduced by novel methodology that allows for weighing equilibrium and non-equilibrium solvent relaxation effects. Taken together, the model systems investigated allow for an assessment of the reliability of the EC-RISM approach for studying pressure-dependent biophysical processes.

Subject(s)

Models, Chemical , Magnetic Resonance Spectroscopy , Methylamines/chemical synthesis , Methylamines/chemistry , Molecular Dynamics Simulation , Pressure , Quantum Theory

pK_a calculations for tautomerizable and conformationally flexible molecules: partition function vs. state transition approach.

Tielker, Nicolas; Eberlein, Lukas; Chodun, Christian; Güssregen, Stefan; Kast, Stefan M.

J Mol Model ; 25(5): 139, 2019 Apr 30.

Article in English | MEDLINE | ID: mdl-31041535

ABSTRACT

Calculations of acidities of molecules with multiple tautomeric and/or conformational states require adequate treatment of the relative energetics of accessible states accompanied by a statistical-mechanical formulation of their contribution to the macroscopic pKa value. Here, we demonstrate rigorously the formal equivalence of two such approaches: a partition function treatment and statistics over transitions between molecular tautomeric and conformational states in the limit of a theory that does not require adjustment by empirical parameters correcting energetic values. However, for a frequently employed correction scheme, linear scaling of (free) energies and regression with respect to reference data taking an additive constant into account, this equivalence breaks down if more than one acid or base state is involved. The consequences of the resulting inconsistency are discussed on our datasets developed for aqueous pKa predictions during the recent SAMPL6 challenge, where molecular state energetics were computed based on the "embedded cluster reference interaction site model" (EC-RISM). This method couples integral equation theory as a solvation model to quantum-chemical calculations and yielded a test set root mean square error of 1.1 pK units from a partition function ansatz. For all practical purposes, the present results indicate that a state transition approach yields comparable accuracy despite the formal theoretical inconsistency, and that an additive regression intercept, which is strictly constant in the limit of large compound mass only, is a valid approximation.

Graphical abstract: Embedded cluster reference interaction site model-derived vs. experimental pKa for the test set calculated with either the partition function (blue) or the state transition approach (red), using m as a free parameter.
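The equivalence stated in the abstract above, for the case without empirical corrections, can be illustrated numerically. The sketch below is one possible way to write both routes and is not the authors' formulation. It assumes Gibbs energies in kcal mol-1 at 298.15 K, with the proton's free energy already absorbed into the base-state values, and all function names and numbers are hypothetical.

```python
import math

RT = 1.987204e-3 * 298.15  # kcal mol^-1 at 298.15 K

def pka_partition_function(g_acid, g_base):
    """Macroscopic pKa from Boltzmann sums over acid and base states."""
    z_acid = sum(math.exp(-g / RT) for g in g_acid)
    z_base = sum(math.exp(-g / RT) for g in g_base)
    return -math.log10(z_base / z_acid)

def pka_state_transition(g_acid, g_base):
    """Macroscopic pKa assembled from microscopic transition constants.

    For each acid state i, k_ij = exp(-(G_base_j - G_acid_i) / RT).
    The macroscopic constant then obeys 1/Ka = sum_i 1 / (sum_j k_ij).
    """
    inv_ka = 0.0
    for ga in g_acid:
        k_sum = sum(math.exp(-(gb - ga) / RT) for gb in g_base)
        inv_ka += 1.0 / k_sum
    return math.log10(inv_ka)  # pKa = -log10(Ka) = log10(1/Ka)

# Hypothetical two-state acid and two-state base
g_acid = [0.0, 1.2]
g_base = [9.0, 10.5]
pka_pf = pka_partition_function(g_acid, g_base)
pka_st = pka_state_transition(g_acid, g_base)
print(round(pka_pf, 3), round(pka_st, 3))
```

Both routes agree here because the energies enter unscaled. Applying a linear scaling with an additive constant to each microscopic transition separately would in general no longer reproduce the partition function result when several acid or base states contribute, which is the inconsistency the abstract describes.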

The SAMPL6 challenge on predicting aqueous pK_a values from EC-RISM theory.

Tielker, Nicolas; Eberlein, Lukas; Güssregen, Stefan; Kast, Stefan M.

J Comput Aided Mol Des ; 32(10): 1151-1163, 2018 10.

Article in English | MEDLINE | ID: mdl-30073500

ABSTRACT

The "embedded cluster reference interaction site model" (EC-RISM) integral equation theory is applied to the problem of predicting aqueous pKa values for drug-like molecules based on an ensemble of tautomers. EC-RISM is based on self-consistent calculations of a solute's electronic structure and the distribution function of surrounding water. Following-up on the workflow developed after the SAMPL5 challenge on cyclohexane-water distribution coefficients we extended and improved the methodology by taking into account exact electrostatic solute-solvent interactions taken from the wave function in solution. As before, the model is calibrated against Gibbs energies of hydration from the "Minnesota Solvation Database" and a public dataset of acidity constants of organic acids and bases by adjusting in total 4 parameters, among which only 3 are relevant for predicting pKa values. While the best-performing training model yields a root-mean-square error (RMSE) of 1 pK unit, the corresponding test set prediction on the full SAMPL6 dataset of macroscopic pKa values using the same level of theory exhibits slightly larger error (1.7 pK units) than the best test set model submitted (1.7 pK units for corresponding training set vs. test set performance of 1.6). Post-submission analysis revealed a number of physical optimization options regarding the numerical treatment of electrostatic interactions and conformational sampling. While the experimental test set data revealed after submission was not used for reparametrizing the methodology, the best physically optimized models consequentially result in RMSEs of 1.5 if only improved electrostatic interactions are considered and of 1.1 if, in addition, conformational sampling accounts for quantum-chemically derived rankings. We conclude that these numbers are probably near the ultimate accuracy achievable with the simple 3-parameter model using a single or the two best-ranking conformations per tautomer or microstate. 
Finally, relations of the present macrostate approach to microstate pKa results are discussed and some illustrative results for microstate populations are presented.

Subject(s)

Hydrocarbons, Cyclic/chemistry , Models, Chemical , Computer Simulation , Databases, Chemical , Models, Theoretical , Molecular Conformation , Solutions/chemistry , Solvents/chemistry , Static Electricity , Thermodynamics , Water/chemistry

The SAMPL5 challenge for embedded-cluster integral equation theory: solvation free energies, aqueous pK_a, and cyclohexane-water log D.

Tielker, Nicolas; Tomazic, Daniel; Heil, Jochen; Kloss, Thomas; Ehrhart, Sebastian; Güssregen, Stefan; Schmidt, K Friedemann; Kast, Stefan M.

J Comput Aided Mol Des ; 30(11): 1035-1044, 2016 11.

Article in English | MEDLINE | ID: mdl-27554666

ABSTRACT

We predict cyclohexane-water distribution coefficients (log D7.4) for drug-like molecules taken from the SAMPL5 blind prediction challenge by the "embedded cluster reference interaction site model" (EC-RISM) integral equation theory. This task involves the coupled problem of predicting both partition coefficients (log P) of neutral species between the solvents and aqueous acidity constants (pKa) in order to account for a change of protonation states. The first issue is addressed by calibrating an EC-RISM-based model for solvation free energies derived from the "Minnesota Solvation Database" (MNSOL) for both water and cyclohexane utilizing a correction based on the partial molar volume, yielding a root mean square error (RMSE) of 2.4 kcal mol-1 for water and 0.8-0.9 kcal mol-1 for cyclohexane depending on the parametrization. The second one is treated by employing on one hand an empirical pKa model (MoKa) and, on the other hand, an EC-RISM-derived regression of published acidity constants (RMSE of 1.5 for a single model covering acids and bases). In total, at most 8 adjustable parameters are necessary (2-3 for each solvent and two for the pKa) for training solvation and acidity models. Applying the final models to the log D7.4 dataset corresponds to evaluating an independent test set comprising other, composite observables, yielding, for different cyclohexane parametrizations, 2.0-2.1 for the RMSE with the first and 2.2-2.8 with the combined first and second SAMPL5 data set batches. Notably, a pure log P model (assuming neutral species only) performs statistically similarly for these particular compounds. The nature of the approximations and possible perspectives for future developments are discussed.

Subject(s)

Computer Simulation , Cyclohexanes/chemistry , Pharmaceutical Preparations/chemistry , Water/chemistry , Models, Chemical , Molecular Structure , Quantum Theory , Solubility , Solvents/chemistry , Thermodynamics
