Results 1 - 10 of 10
1.
Mol Pharm ; 20(10): 5052-5065, 2023 10 02.
Article in English | MEDLINE | ID: mdl-37713584

ABSTRACT

During drug discovery and development, achieving appropriate pharmacokinetics is key to establishing the efficacy and safety of new drugs. Physiologically based pharmacokinetic (PBPK) models integrating in vitro-to-in vivo extrapolation have become an essential in silico tool to achieve this goal. In this context, the most important and probably most challenging pharmacokinetic parameter to estimate is the clearance. Recent work on high-throughput PBPK modeling during drug discovery has shown that a good estimate of the unbound intrinsic clearance (CLint,u) is the key factor for useful PBPK application. In this work, three different machine learning-based strategies were explored to predict the rat CLint,u as the input into PBPK. To this end, in vivo and in vitro data were collected for a total of 2639 proprietary compounds. The strategies were compared to the standard in vitro bottom-up approach. Using the well-stirred liver model to back-calculate in vivo CLint,u from in vivo rat clearance and then training a machine learning model on this CLint,u led to more accurate clearance predictions (absolute average fold error (AAFE) 3.1 in temporal cross-validation) than the bottom-up approach (AAFE 3.6-16, depending on the scaling method) and has the advantage that no experimental in vitro data is needed. However, building a machine learning model on the bias between the back-calculated in vivo CLint,u and the bottom-up scaled in vitro CLint,u also performed well. For example, using unbound hepatocyte scaling, adding the bias prediction improved the AAFE in the temporal cross-validation from 16 for the bottom-up approach alone to 2.9 with the bias prediction. Similarly, the log Pearson r² improved from 0.1 to 0.29. Although this approach would still require in vitro measurement of CLint,u, using unbound scaling for the bottom-up approach circumvents the need to correct fu,inc with fu,p data. 
While the above-described ML models were built on all data points available per approach, the comparative evaluation across all approaches could only be performed on a subset, because ca. 75% of the molecules had missing or unquantifiable measurements of the fraction unbound in plasma or of the in vitro unbound intrinsic clearance, or dropped out due to the blood-flow limitation assumed by the well-stirred model. Advantageously, by predicting CLint,u as the input into PBPK, existing workflows can be reused and the prediction of the in vivo clearance and other PK parameters can be improved.
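
The back-calculation and the error metric named in this abstract can be illustrated with a minimal sketch. It assumes the textbook form of the well-stirred liver model, CLh = Qh · fu,p · CLint,u / (Qh + fu,p · CLint,u), treats the observed in vivo clearance as purely hepatic, and ignores blood-to-plasma partitioning. The function names and the hepatic blood flow value are illustrative assumptions, not taken from the paper.

```python
import math

# Illustrative rat hepatic blood flow in mL/min/kg (an assumed placeholder,
# not a value reported in the abstract).
Q_H_RAT = 70.0


def well_stirred_clearance(cl_int_u, fu_p, q_h=Q_H_RAT):
    """Hepatic clearance predicted from unbound intrinsic clearance."""
    return q_h * fu_p * cl_int_u / (q_h + fu_p * cl_int_u)


def back_calculate_cl_int_u(cl_h, fu_p, q_h=Q_H_RAT):
    """Invert the well-stirred model to obtain CLint,u from in vivo clearance.

    Returns None when the observed clearance reaches or exceeds hepatic
    blood flow, the blood-flow limitation under which the inversion is
    undefined and a compound drops out.
    """
    if cl_h >= q_h:
        return None
    return cl_h * q_h / (fu_p * (q_h - cl_h))


def aafe(predicted, observed):
    """Absolute average fold error: 10 ** mean(|log10(predicted / observed)|)."""
    logs = [abs(math.log10(p / o)) for p, o in zip(predicted, observed)]
    return 10 ** (sum(logs) / len(logs))


# Round trip: a clearance of 35 mL/min/kg with fu,p = 0.1
cl_int_u = back_calculate_cl_int_u(35.0, 0.1)
cl_h = well_stirred_clearance(cl_int_u, 0.1)
```

An AAFE of 1 means perfect agreement; a prediction that is twice or half the observed value contributes a fold error of 2 either way, which is why the metric is symmetric on the log scale.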


Subject(s)
Liver , Models, Biological , Animals , Rats , Metabolic Clearance Rate , Liver/metabolism , Hepatocytes , Kinetics
2.
Chem Res Toxicol ; 36(9): 1503-1517, 2023 09 18.
Article in English | MEDLINE | ID: mdl-37584277

ABSTRACT

In silico approaches have acquired a towering role in pharmaceutical research and development, allowing laboratories all around the world to design, create, and optimize novel molecular entities with unprecedented efficiency. From a toxicological perspective, computational methods have guided the choices of medicinal chemists toward compounds displaying improved safety profiles. Even if the recent advances in the field are significant, many challenges remain active in the on-target and off-target prediction fields. Machine learning methods have shown their ability to identify molecules with safety concerns. However, they strongly depend on the abundance and diversity of data used for their training. Sharing such information among pharmaceutical companies remains extremely limited due to confidentiality reasons, but in this scenario, a recent concept named "federated learning" can help overcome such concerns. Within this framework, it is possible for companies to contribute to the training of common machine learning algorithms, using, but not sharing, their proprietary data. Very recently, Lhasa Limited organized a hackathon involving several industrial partners in order to assess the performance of their federated learning platform, called "Effiris". In this paper, we share our experience as Roche in participating in such an event, evaluating the performance of the federated algorithms and comparing them with those coming from our in-house-only machine learning models. Our aim is to highlight the advantages of federated learning and its intrinsic limitations and also suggest some points for potential improvements in the method.


Subject(s)
Algorithms , Spiders , Animals , Laboratories , Machine Learning , Pharmaceutical Preparations
3.
Sci Rep ; 12(1): 7244, 2022 05 04.
Article in English | MEDLINE | ID: mdl-35508546

ABSTRACT

Machine learning models are widely applied to predict molecular properties or the biological activity of small molecules on a specific protein. Models can be integrated in a conformal prediction (CP) framework which adds a calibration step to estimate the confidence of the predictions. CP models present the advantage of ensuring a predefined error rate under the assumption that test and calibration set are exchangeable. In cases where the test data have drifted away from the descriptor space of the training data, or where assay setups have changed, this assumption might not be fulfilled and the models are not guaranteed to be valid. In this study, the performance of internally valid CP models when applied to either newer time-split data or to external data was evaluated. In detail, temporal data drifts were analysed based on twelve datasets from the ChEMBL database. In addition, discrepancies between models trained on publicly-available data and applied to proprietary data for the liver toxicity and MNT in vivo endpoints were investigated. In most cases, a drastic decrease in the validity of the models was observed when applied to the time-split or external (holdout) test sets. To overcome the decrease in model validity, a strategy for updating the calibration set with data more similar to the holdout set was investigated. Updating the calibration set generally improved the validity, restoring it completely to its expected value in many cases. The restored validity is the first requisite for applying the CP models with confidence. However, the increased validity comes at the cost of a decrease in model efficiency, as more predictions are identified as inconclusive. This study presents a strategy to recalibrate CP models to mitigate the effects of data drifts. Updating the calibration sets without having to retrain the model has proven to be a useful approach to restore the validity of most models.
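
The recalibration idea can be sketched with a class-wise (Mondrian) inductive conformal predictor. This is a minimal illustration under stated assumptions, not the authors' implementation: the nonconformity score is taken as 1 minus the predicted class probability, the function names are hypothetical, and the calibration scores are made-up numbers chosen to show the effect.

```python
def p_value(cal_scores, test_score):
    """Conformal p-value: share of calibration scores at least as nonconforming."""
    n_at_least = sum(1 for s in cal_scores if s >= test_score)
    return (n_at_least + 1) / (len(cal_scores) + 1)


def prediction_set(prob_by_class, cal_scores_by_class, significance):
    """Return every label whose class-wise p-value exceeds the significance level.

    prob_by_class maps label -> predicted probability from the underlying
    model; the nonconformity of a label is assumed to be 1 - probability.
    """
    labels = set()
    for label, prob in prob_by_class.items():
        if p_value(cal_scores_by_class[label], 1.0 - prob) > significance:
            labels.add(label)
    return labels


# Made-up calibration scores: the original set resembles the training data,
# the updated set comes from newer data on which the same model is less sure.
original_cal = {
    0: [0.05, 0.1, 0.1, 0.15, 0.2, 0.2, 0.25, 0.3, 0.35],
    1: [0.05, 0.1, 0.1, 0.15, 0.2, 0.2, 0.25, 0.3, 0.35],
}
updated_cal = {
    0: [0.1, 0.2, 0.3, 0.4, 0.5, 0.5, 0.6, 0.6, 0.7],
    1: [0.1, 0.2, 0.3, 0.4, 0.5, 0.5, 0.6, 0.6, 0.7],
}

query = {0: 0.55, 1: 0.45}  # model output for one drifted compound
before = prediction_set(query, original_cal, significance=0.2)
after = prediction_set(query, updated_cal, significance=0.2)
```

With the original calibration set, both labels are rejected and the prediction set is empty. With the updated set, both labels are kept, so the compound is flagged as inconclusive. The underlying model is untouched in both cases; only the calibration scores change, which mirrors the trade-off described above between restored validity and reduced efficiency.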


Subject(s)
Biological Assay , Machine Learning , Calibration , Molecular Conformation
4.
Environ Int ; 158: 106947, 2022 01.
Article in English | MEDLINE | ID: mdl-34717173

ABSTRACT

BACKGROUND: Exposure to environmental chemicals that interfere with normal estrogen function can lead to adverse health effects, including cancer. High-throughput screening (HTS) approaches facilitate the efficient identification and characterization of such substances. OBJECTIVES: We recently described the development of the E-Morph Assay, which measures changes at adherens junctions as a clinically-relevant phenotypic readout for estrogen receptor (ER) alpha signaling activity. Here, we describe its further development and application for automated robotic HTS. METHODS: Using the advanced E-Morph Screening Assay, we screened a substance library comprising 430 toxicologically-relevant industrial chemicals, biocides, and plant protection products to identify novel substances with estrogenic activities. Based on the primary screening data and the publicly available ToxCast dataset, we performed an in silico similarity search to identify further substances with potential estrogenic activity for follow-up hit expansion screening, and built seven in silico ER models using the conformal prediction (CP) framework to evaluate the HTS results. RESULTS: The primary and hit confirmation screens identified 27 'known' estrogenic substances with potencies correlating very well with the published ToxCast ER Agonist Score (r=+0.95). We additionally detected potential 'novel' estrogenic activities for 10 primary hit substances and for another nine out of 20 structurally similar substances from in silico predictions and follow-up hit expansion screening. The concordance of the E-Morph Screening Assay with the ToxCast ER reference data and the generated CP ER models was 71% and 73%, respectively, with a high predictivity for ER active substances of up to 87%, which is particularly important for regulatory purposes. 
DISCUSSION: These data provide a proof-of-concept for the combination of in vitro HTS approaches with in silico methods (similarity search, CP models) for efficient analysis of large substance libraries in order to prioritize substances with potential estrogenic activity for subsequent testing against higher tier human endpoints.


Subject(s)
Endocrine Disruptors , Biological Assay , Estrogens/toxicity , Estrone , High-Throughput Screening Assays , Humans
5.
J Chem Inf Model ; 61(7): 3255-3272, 2021 07 26.
Article in English | MEDLINE | ID: mdl-34153183

ABSTRACT

Computational methods such as machine learning approaches have a strong track record of success in predicting the outcomes of in vitro assays. In contrast, their ability to predict in vivo endpoints is more limited due to the high number of parameters and processes that may influence the outcome. Recent studies have shown that the combination of chemical and biological data can yield better models for in vivo endpoints. The ChemBioSim approach presented in this work aims to enhance the performance of conformal prediction models for in vivo endpoints by combining chemical information with (predicted) bioactivity assay outcomes. Three in vivo toxicological endpoints, capturing genotoxic (MNT), hepatic (DILI), and cardiological (DICC) issues, were selected for this study due to their high relevance for the registration and authorization of new compounds. Since the sparsity of available biological assay data is challenging for predictive modeling, predicted bioactivity descriptors were introduced instead. Thus, a machine learning model for each of the 373 collected biological assays was trained and applied on the compounds of the in vivo toxicity data sets. Besides the chemical descriptors (molecular fingerprints and physicochemical properties), these predicted bioactivities served as descriptors for the models of the three in vivo endpoints. For this study, a workflow based on a conformal prediction framework (a method for confidence estimation) built on random forest models was developed. Furthermore, the most relevant chemical and bioactivity descriptors for each in vivo endpoint were preselected with lasso models. The incorporation of bioactivity descriptors increased the mean F1 scores of the MNT model from 0.61 to 0.70 and for the DICC model from 0.72 to 0.82 while the mean efficiencies increased by roughly 0.10 for both endpoints. In contrast, for the DILI endpoint, no significant improvement in model performance was observed. 
Besides pure performance improvements, an analysis of the most important bioactivity features allowed detection of novel and less intuitive relationships between the predicted biological assay outcomes used as descriptors and the in vivo endpoints. This study shows how the prediction of in vivo toxicity endpoints can be improved by incorporating biological information, which is not necessarily captured by chemical descriptors, in an automated workflow. No additional experimental workload is needed to generate the bioactivity descriptors, because predicted outcomes of bioactivity assays were used. All bioactivity CP models for deriving the predicted bioactivities, as well as the in vivo toxicity CP models, can be freely downloaded from https://doi.org/10.5281/zenodo.4761225.


Subject(s)
Liver , Machine Learning , Biological Assay , Molecular Conformation
6.
J Cheminform ; 13(1): 35, 2021 Apr 29.
Article in English | MEDLINE | ID: mdl-33926567

ABSTRACT

Machine learning methods are widely used in drug discovery and toxicity prediction. While showing overall good performance in cross-validation studies, their predictive power (often) drops in cases where the query samples have drifted from the training data's descriptor space. Thus, the assumption for applying machine learning algorithms, that training and test data stem from the same distribution, might not always be fulfilled. In this work, conformal prediction is used to assess the calibration of the models. Deviations from the expected error may indicate that training and test data originate from different distributions. Exemplified on the Tox21 datasets, composed of chronologically released Tox21Train, Tox21Test and Tox21Score subsets, we observed that while internally valid models could be trained using cross-validation on Tox21Train, predictions on the external Tox21Score data resulted in higher error rates than expected. To improve the prediction on the external sets, a strategy exchanging the calibration set with more recent data, such as Tox21Test, has successfully been introduced. We conclude that conformal prediction can be used to diagnose data drifts and other issues related to model calibration. The proposed improvement strategy, exchanging only the calibration data, is convenient as it does not require retraining of the underlying model.
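
The diagnostic use of conformal prediction described here comes down to comparing the observed error rate with the chosen significance level. The sketch below is a hedged illustration using the usual definitions: validity is the share of prediction sets that contain the true label, and efficiency is the share of single-label sets. The function names, the `tolerance` parameter, and the example data are assumptions made for illustration.

```python
def validity(prediction_sets, true_labels):
    """Fraction of prediction sets that contain the true label."""
    hits = sum(1 for s, y in zip(prediction_sets, true_labels) if y in s)
    return hits / len(true_labels)


def efficiency(prediction_sets):
    """Fraction of prediction sets holding exactly one label."""
    singles = sum(1 for s in prediction_sets if len(s) == 1)
    return singles / len(prediction_sets)


def looks_calibrated(prediction_sets, true_labels, significance, tolerance=0.05):
    """True if the observed error rate stays near the expected one.

    `tolerance` is a hypothetical slack for finite-sample noise; it is not
    a value from the abstract.
    """
    error_rate = 1.0 - validity(prediction_sets, true_labels)
    return error_rate <= significance + tolerance


# Made-up prediction sets for five external compounds at significance 0.2
external_sets = [{0}, {1}, {0, 1}, set(), {1}]
external_truth = [0, 1, 1, 0, 0]

val = validity(external_sets, external_truth)
eff = efficiency(external_sets)
ok = looks_calibrated(external_sets, external_truth, significance=0.2)
```

In this toy case the error rate is 0.4 at a significance level of 0.2, so the check fails. On real data, such a gap would be a hint that the external set has drifted from the calibration set, which is the situation in which exchanging the calibration data is proposed.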

7.
J Cheminform ; 12(1): 24, 2020 Apr 14.
Article in English | MEDLINE | ID: mdl-33431007

ABSTRACT

Risk assessment of newly synthesised chemicals is a prerequisite for regulatory approval. In this context, in silico methods have great potential to reduce time, cost, and ultimately animal testing as they make use of the ever-growing amount of available toxicity data. Here, KnowTox is presented, a novel pipeline that combines three different in silico toxicology approaches (machine learning models for 88 endpoints, alerts for 919 toxic substructures, and computational support for read-across) to allow for confident prediction of potentially toxic effects of query compounds. It is mainly based on the ToxCast dataset, which after preprocessing contains a sparse matrix of 7912 compounds tested against 985 endpoints. When applying machine learning models, applicability and reliability of predictions for new chemicals are of utmost importance. Therefore, first, the conformal prediction technique was deployed, comprising an additional calibration step and by definition creating internally valid predictors at a given significance level. Second, to further improve validity and information efficiency, two adaptations are suggested, exemplified on the androgen receptor antagonism endpoint. An absolute increase in validity of 23% on the in-house dataset of 534 compounds could be achieved by introducing KNNRegressor normalisation. This increase in validity comes at the cost of efficiency, which could again be improved by 20% for the initial ToxCast model by balancing the dataset during model training. Finally, the value of the developed pipeline for risk assessment is discussed using two in-house triazole molecules. Compared to a single toxicity prediction method, complementing the outputs of different approaches can have a higher impact on guiding toxicity testing and de-selecting development-candidate compounds that are most likely harmful early in the development process.

8.
J Cheminform ; 11(1): 29, 2019 Apr 08.
Article in English | MEDLINE | ID: mdl-30963287

ABSTRACT

Owing to the increase in freely available software and data for cheminformatics and structural bioinformatics, research for computer-aided drug design (CADD) is more and more built on modular, reproducible, and easy-to-share pipelines. While documentation for such tools is available, there are only a few freely accessible examples that teach the underlying concepts focused on CADD, especially addressing users new to the field. Here, we present TeachOpenCADD, a teaching platform developed by students for students, using open source compound and protein data as well as basic and CADD-related Python packages. We provide interactive Jupyter notebooks for central CADD topics, integrating theoretical background and practical code. TeachOpenCADD is freely available on GitHub: https://github.com/volkamerlab/TeachOpenCADD.

9.
Chem Commun (Camb) ; 51(26): 5721-4, 2015 Apr 04.
Article in English | MEDLINE | ID: mdl-25719505

ABSTRACT

We report a novel oral prodrug approach where a solubilizing polymer conjugated to the drug is designed to be released by the action of an exogenously administered agent in the intestine. A redox-sensitive self-immolating design was implemented, and the reconversion kinetics were studied for three reducible prodrugs.


Subject(s)
Prodrugs/administration & dosage , Prodrugs/chemistry , Administration, Oral , Body Fluids/chemistry , Body Fluids/metabolism , Humans , Intestinal Mucosa/metabolism , Intestines/chemistry , Kinetics , Molecular Structure , Oxidation-Reduction , Polymers/chemistry , Prodrugs/pharmacokinetics , Solubility