Search | VHL Regional Portal

Predicting absolute aqueous solubility by applying a machine learning model for an artificially liquid-state as proxy for the solid-state.

Gheta, Sadra Kashef Ol; Bonin, Anne; Gerlach, Thomas; Göller, Andreas H.

J Comput Aided Mol Des ; 37(12): 765-789, 2023 Dec.

Article in English | MEDLINE | ID: mdl-37878216

ABSTRACT

In this study, we use machine learning algorithms with QM-derived COSMO-RS descriptors, along with Morgan fingerprints, to predict the absolute solubility of drug-like compounds. The QM-derived descriptors account for the molecular properties of the solute, i.e., the solute-solute interactions in an artificial-liquid-state (super-cooled liquid), and the solute-solvent interactions in solution. We employ two main approaches to predict solubility: (i) a hypothetical pathway that involves melting the solute at room temperature T = T¯ ([Formula: see text]) and mixing the artificially liquid solute into the solvent ([Formula: see text]). In this approach [Formula: see text] is predicted using machine learning models, and the [Formula: see text] is obtained from COSMO-RS calculations; (ii) direct solubility prediction using machine learning algorithms. The models were trained on a large number of Bayer in-house compounds for which water solubility data is available at physiological pH of 6.5 and ambient temperature. We also evaluated our models using external datasets from a solubility challenge. Our models present great improvements compared to the absolute solubility prediction with the QSAR model for the artificial liquid state as implemented in the COSMOtherm software, for both in-house and external datasets. We are furthermore able to demonstrate the superiority of QM-derived descriptors compared to cheminformatics descriptors. We finally present low-cost alternative models using fragment-based COSMOquick calculations with only marginal reduction in the quality of predicted solubility.
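The two-step pathway in approach (i) can be written as a short worked relation. This is only a sketch: the formulas are not reproduced in this listing, so the symbols below are assumptions. $\Delta G_{\mathrm{fus}}$ stands for the free energy of melting the solute at the working temperature $T$, $\Delta G_{\mathrm{mix}}$ for the free energy of transferring the supercooled liquid solute into the solvent, and $x_{\mathrm{S}}$ for the mole-fraction solubility in the dilute limit.

```latex
% Assumed notation, not taken from the abstract:
% solid -> supercooled liquid (fus), then liquid -> solution (mix)
\Delta G_{\mathrm{sol}} = \Delta G_{\mathrm{fus}} + \Delta G_{\mathrm{mix}},
\qquad
\log_{10} x_{\mathrm{S}}
  = -\frac{\Delta G_{\mathrm{fus}} + \Delta G_{\mathrm{mix}}}{RT \ln 10}
```

Under these assumptions a positive $\Delta G_{\mathrm{fus}}$ (a solid below its melting point) lowers the predicted solubility relative to the artificial liquid state, which is why the abstract treats it as a separate term to be learned.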

Subject(s)

Models, Chemical , Water , Solubility , Water/chemistry , Machine Learning , Solvents/chemistry

pH-dependent solubility prediction for optimized drug absorption and compound uptake by plants.

Bonin, Anne; Montanari, Floriane; Niederführ, Sebastian; Göller, Andreas H.

J Comput Aided Mol Des ; 37(3): 129-145, 2023 Mar.

Article in English | MEDLINE | ID: mdl-36797399

ABSTRACT

Aqueous solubility is the most important physicochemical property for agrochemical and drug candidates and a prerequisite for uptake, distribution, transport, and finally the bioavailability in living species. We here present the first-ever direct machine learning models for pH-dependent solubility in water. For this, we combined almost 300,000 data points from 11 solubility assays performed over 24 years and over one million data points from lipophilicity and melting point experiments. Data were split into three pH classes (acidic, neutral, and basic), representing the conditions of stomach and intestinal tract for animals and humans, and phloem and xylem for plants. We find that multi-task neural networks using ECFP-6 fingerprints outperform baseline random forests and single-task neural networks on the individual tasks. Our final model, with three solubility tasks using the pH-class combined data from different assays and five helper tasks, results in root mean square errors of 0.56 log units overall (acidic 0.61; neutral 0.52; basic 0.54) and Spearman rank correlations of 0.83 (acidic 0.78; neutral 0.86; basic 0.86), making it a valuable tool for profiling of compounds in pharmaceutical and agrochemical research. The model allows for the prediction of compound pH profiles with mean and median RMSE per molecule of 0.62 and 0.56 log units.
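The metrics quoted in this abstract (overall RMSE, Spearman rank correlation, and mean/median RMSE per molecule) can be illustrated with a minimal standard-library sketch. All function names and the toy records are hypothetical and are not taken from the paper; the values are made-up log-unit solubilities used only to show the arithmetic.

```python
import math
from statistics import mean, median


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def per_molecule_rmse(records):
    """Group (molecule_id, measured, predicted) records and score each molecule."""
    grouped = {}
    for mol, measured, predicted in records:
        truths, preds = grouped.setdefault(mol, ([], []))
        truths.append(measured)
        preds.append(predicted)
    return {mol: rmse(t, p) for mol, (t, p) in grouped.items()}


# Hypothetical toy data: one record per molecule and pH class (log units).
records = [
    ("mol_a", -3.0, -3.5), ("mol_a", -4.0, -4.5), ("mol_a", -5.0, -4.5),
    ("mol_b", -2.0, -2.0), ("mol_b", -2.5, -3.5), ("mol_b", -6.0, -5.0),
]
measured = [r[1] for r in records]
predicted = [r[2] for r in records]

overall_rmse = rmse(measured, predicted)
rank_corr = spearman(measured, predicted)
profile_rmse = per_molecule_rmse(records)
mean_profile_rmse = mean(profile_rmse.values())
median_profile_rmse = median(profile_rmse.values())
```

Per-molecule RMSE scores how well a whole pH profile is reproduced for one compound, which is a different question from the pooled RMSE over all data points; this is presumably why the abstract reports both.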

Subject(s)

Neural Networks, Computer , Water , Humans , Animals , Solubility , Water/chemistry , Machine Learning , Hydrogen-Ion Concentration , Pharmaceutical Preparations

Ensemble completeness in conformer sampling: the case of small macrocycles.

Seep, Lea; Bonin, Anne; Meier, Katharina; Diedam, Holger; Göller, Andreas H.

J Cheminform ; 13(1): 55, 2021 Jul 29.

Article in English | MEDLINE | ID: mdl-34325738

ABSTRACT

In this study we compare three algorithms for the generation of conformer ensembles, Biovia BEST, Schrödinger Prime macrocycle sampling (PMM), and Conformator (CONF) from the University of Hamburg, with ensembles derived from exhaustive molecular dynamics simulations applied to a dataset of 7 small macrocycles in two charge states and three solvents. Ensemble completeness is a prerequisite to allow for the selection of relevant diverse conformers for many applications in computational chemistry. We apply conformation maps using principal component analysis based on ring torsions. Our major finding, critical for all applications of conformer ensembles in any computational study, is that maps derived from MD with explicit solvent are significantly distinct between macrocycles, charge states and solvents, whereas the maps for post-optimized conformers using implicit solvent models from all generator algorithms are very similar independent of the solvent. We apply three metrics for the quantification of the relative covered ensemble space, namely cluster overlap, variance statistics, and a novel metric, Mahalanobis distance, showing that post-optimized MD ensembles cover a significantly larger conformational space than the generator ensembles, with the ranking PMM > BEST >> CONF. Furthermore, we find that the distributions of 3D polar surface areas are very similar for all macrocycles independent of charge state and solvent, except for the smaller and more strained compound 7, and that there is also no obvious correlation between 3D PSA and intramolecular hydrogen bond count distributions.
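The abstract names the Mahalanobis distance as one of its coverage metrics but does not say how it is applied. The sketch below shows only the general definition, restricted to two dimensions (for example, two principal components of a conformation map). The function names and the toy reference points are hypothetical and are not the authors' procedure.

```python
import math


def mean_and_cov_2d(points):
    """Sample mean and covariance (sxx, sxy, syy) of a list of 2D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis_2d(point, centre, cov):
    """Distance of `point` from `centre`, scaled by the covariance `cov`."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular or not positive definite")
    dx = point[0] - centre[0]
    dy = point[1] - centre[1]
    # Uses the closed-form inverse of the 2x2 matrix [[sxx, sxy], [sxy, syy]].
    squared = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(squared)


# Hypothetical reference ensemble projected onto two principal components.
reference = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
centre, cov = mean_and_cov_2d(reference)

distance_at_centre = mahalanobis_2d((1.0, 1.0), centre, cov)
distance_outside = mahalanobis_2d((3.0, 1.0), centre, cov)
```

Unlike a plain Euclidean distance, this measure scales each direction by the spread of the reference ensemble, so a conformer counts as far away only if it lies outside the region the reference actually samples.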

Bayer's in silico ADMET platform: a journey of machine learning over the past two decades.

Göller, Andreas H; Kuhnke, Lara; Montanari, Floriane; Bonin, Anne; Schneckener, Sebastian; Ter Laak, Antonius; Wichard, Jörg; Lobell, Mario; Hillisch, Alexander.

Drug Discov Today ; 25(9): 1702-1709, 2020 Sep.

Article in English | MEDLINE | ID: mdl-32652309

ABSTRACT

Over the past two decades, an in silico absorption, distribution, metabolism, and excretion (ADMET) platform has been created at Bayer Pharma with the goal to generate models for a variety of pharmacokinetic and physicochemical endpoints in early drug discovery. These tools are accessible to all scientists within the company and can be useful in assisting with the selection and design of novel leads, as well as the process of lead optimization. Here, we discuss the development of machine-learning (ML) approaches with special emphasis on data, descriptors, and algorithms. We show that high company internal data quality and tailored descriptors, as well as a thorough understanding of the experimental endpoints, are essential to the utility of our models. We discuss the recent impact of deep neural networks and show selected application examples.

Subject(s)

Machine Learning , Pharmacokinetics , Animals , Computer Simulation , Humans , Intestinal Absorption , Models, Theoretical , Pharmaceutical Preparations/metabolism

Central energy metabolism remains robust in acute steatotic hepatocytes challenged by a high free fatty acid load.

Niklas, Jens; Bonin, Anne; Mangin, Stefanie; Bucher, Joachim; Kopacz, Stephanie; Matz-Soja, Madlen; Thiel, Carlo; Gebhardt, Rolf; Hofmann, Ute; Mauch, Klaus.

BMB Rep ; 45(7): 396-401, 2012 Jul.

Article in English | MEDLINE | ID: mdl-22831974

ABSTRACT

Overnutrition is one of the major causes of non-alcoholic fatty liver disease (NAFLD). NAFLD is characterized by an accumulation of lipids (triglycerides) in hepatocytes and is often accompanied by high plasma levels of free fatty acids (FFA). In this study, we compared the energy metabolism in acute steatotic and non-steatotic primary mouse hepatocytes. Acute steatosis was induced by pre-incubation with high concentrations of oleate and palmitate. Labeling experiments were conducted using [U-¹³C₅,U-¹⁵N₂] glutamine. Metabolite concentrations and mass isotopomer distributions of intracellular metabolites were measured and applied for metabolic flux estimation using transient ¹³C metabolic flux analysis. FFAs were efficiently taken up and almost completely incorporated into triglycerides (TAGs). In spite of high FFA uptake rates and the high synthesis rate of TAGs, central energy metabolism was not significantly changed in acute steatotic cells. Fatty acid β-oxidation does not significantly contribute to the detoxification of FFAs under the applied conditions.

Subject(s)

Energy Metabolism , Fatty Acids, Nonesterified/administration & dosage , Fatty Liver/metabolism , Animals , Humans
