Results 1 - 4 of 4
1.
Sci Rep ; 12(1): 21893, 2022 12 19.
Article in English | MEDLINE | ID: mdl-36535980

ABSTRACT

There is a multitude of pathological conditions that affect human health, yet we currently lack a predictive model for most diseases, and underlying mechanisms that are shared by multiple diseases are poorly understood. We leveraged baseline clinical biomarker data and long-term disease outcomes in UK Biobank to build prognostic multivariate survival models for more than 200 of the most common diseases. We construct a similarity map between biomarker-disease hazard ratios and demonstrate broad patterns of shared similarity in biomarker profiles across the entire disease space. Further aggregation of risk profiles through density-based clustering showed that biomarker-risk profiles can be partitioned into a few distinct clusters with characteristic patterns representative of broad disease categories. To confirm these risk patterns we built disease co-occurrence networks in the UK Biobank and US HCUP hospitalization databases, and compared similarity in biomarker risk profiles to disease co-occurrence. We show that proximity in the biomarker-disease space is strongly related to the occurrence of disease comorbidity, suggesting that biomarker profile patterns can be used both to predict future outcomes and as a sensitive mechanism for detecting under-diagnosed disease states.
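
As a rough illustration of what a similarity map between biomarker-disease hazard ratios could look like, the sketch below compares diseases by the cosine similarity of their log hazard ratio vectors. The abstract does not state which similarity measure was used, so the cosine measure, the function names, and the three example profiles are all assumptions made for illustration, not the authors' method or data.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cannot compare an all-zero profile")
    return dot / (norm_a * norm_b)


def similarity_map(hazard_ratios):
    """Pairwise similarity of diseases from per-biomarker hazard ratios.

    Hazard ratios are log-transformed first so that protective (HR < 1)
    and harmful (HR > 1) effects are symmetric around zero.
    """
    log_profiles = {
        disease: [math.log(hr) for hr in hrs]
        for disease, hrs in hazard_ratios.items()
    }
    names = sorted(log_profiles)
    return {
        (a, b): cosine_similarity(log_profiles[a], log_profiles[b])
        for a in names
        for b in names
    }


# Hypothetical hazard ratios for three biomarkers (made-up numbers).
example = {
    "disease_A": [1.5, 0.80, 1.2],
    "disease_B": [1.4, 0.85, 1.1],
    "disease_C": [0.7, 1.30, 0.9],
}
sim = similarity_map(example)
```

In this toy input, disease_A and disease_B share the direction of every biomarker effect and so score as more similar to each other than either does to disease_C.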


Subject(s)
Prognosis, Humans, Biomarkers, Comorbidity, Forecasting, Proportional Hazards Models
2.
Drug Metab Dispos ; 49(2): 169-178, 2021 02.
Article in English | MEDLINE | ID: mdl-33239335

ABSTRACT

Volume of distribution at steady state (VD,ss) is one of the key pharmacokinetic parameters estimated during the drug discovery process. Despite considerable efforts to predict VD,ss, accuracy and choice of prediction methods remain a challenge, with evaluations constrained to a small set (<150) of compounds. To address these issues, a series of in silico methods for predicting human VD,ss directly from structure were evaluated using a large set of clinical compounds. Machine learning (ML) models were built to predict VD,ss directly and to predict input parameters required for mechanistic and empirical VD,ss predictions. In addition, log D, fraction unbound in plasma (fup), and blood-to-plasma partition ratio (BPR) were measured on 254 compounds to estimate the impact of measured data on predictive performance of mechanistic models. Furthermore, the impact of novel methodologies such as measuring partition (Kp) in adipocytes and myocytes (n = 189) on VD,ss predictions was also investigated. In predicting VD,ss directly from chemical structures, both mechanistic and empirical scaling using a combination of predicted rat and dog VD,ss demonstrated comparable performance (62%-71% within 3-fold). The direct ML model outperformed other in silico methods (75% within 3-fold, r² = 0.5, AAFE = 2.2) when built from a larger data set. Scaling to human from predicted VD,ss of either rat or dog yielded poor results (<47% within 3-fold). Measured fup and BPR improved performance of mechanistic VD,ss predictions significantly (81% within 3-fold, r² = 0.6, AAFE = 2.0). Adipocyte intracellular Kp showed good correlation to the VD,ss but was limited in estimating the compounds with low VD,ss.

SIGNIFICANCE STATEMENT: This work advances the in silico prediction of VD,ss directly from structure and with the aid of in vitro data. Rigorous and comprehensive evaluation of various methods using a large set of clinical compounds (n = 956) is presented. The scale of techniques evaluated is far beyond any previously presented. The novel data set (n = 254) generated using a single protocol for each in vitro assay reported in this study could further aid in advancing VD,ss prediction methodologies.
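
The two accuracy figures quoted in this abstract, "% within 3-fold" and AAFE (absolute average fold error), can be computed as in the minimal sketch below. It uses the commonly cited definition AAFE = 10^(mean |log10(predicted/observed)|). The function name and the example values are hypothetical, and the abstract does not say exactly how the authors implemented these metrics.

```python
import math


def fold_error_metrics(predicted, observed, fold=3.0):
    """Return (AAFE, fraction of predictions within `fold` of observed).

    Both inputs must be equal-length sequences of strictly positive values.
    """
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Absolute log10 fold error of each prediction.
    log_errors = [abs(math.log10(p / o)) for p, o in zip(predicted, observed)]
    aafe = 10 ** (sum(log_errors) / len(log_errors))
    threshold = math.log10(fold)
    within = sum(1 for e in log_errors if e <= threshold) / len(log_errors)
    return aafe, within


# Made-up predicted and observed values, for illustration only.
aafe, within_3fold = fold_error_metrics([1.0, 2.0, 10.0, 0.5], [1.0, 1.0, 1.0, 1.0])
```

An AAFE of 1.0 means every prediction matches its observed value exactly; an AAFE of 2.0 means predictions are off by a factor of two on (geometric) average, in either direction.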


Subject(s)
Pharmaceutical Preparations, Pharmacokinetics, Computer Simulation, Drug Discovery, Humans, Molecular Structure, Pharmaceutical Preparations/blood, Pharmaceutical Preparations/chemistry, Tissue Distribution
3.
J Chem Inf Model ; 60(4): 1955-1968, 2020 04 27.
Article in English | MEDLINE | ID: mdl-32243153

ABSTRACT

One of the key requirements for incorporating machine learning (ML) into the drug discovery process is complete traceability and reproducibility of the model building and evaluation process. With this in mind, we have developed an end-to-end modular and extensible software pipeline for building and sharing ML models that predict key pharma-relevant parameters. The ATOM Modeling PipeLine, or AMPL, extends the functionality of the open source library DeepChem and supports an array of ML and molecular featurization tools. We have benchmarked AMPL on a large collection of pharmaceutical data sets covering a wide range of parameters. Our key findings indicate that traditional molecular fingerprints underperform other feature representation methods. We also find that data set size correlates directly with prediction performance, which points to the need to expand public data sets. Uncertainty quantification can help predict model error, but correlation with error varies considerably between data sets and model types. Our findings point to the need for an extensible pipeline that can be shared to make model building more widely accessible and reproducible. This software is open source and available at: https://github.com/ATOMconsortium/AMPL.


Subject(s)
Drug Discovery, Software, Machine Learning, Reproducibility of Results
4.
Math Biosci Eng ; 15(4): 993-1010, 2018 08 01.
Article in English | MEDLINE | ID: mdl-30380318

ABSTRACT

We apply SE-optimal design methodology to investigate optimal data collection procedures as a first step in investigating information content in ecoinformatics data sets. To illustrate ideas we use a simple phenomenological citrus red mite population model for pest dynamics. First the optimal sampling distributions for a varying number of data points are determined. We then analyze these optimal distributions by comparing the standard errors of parameter estimates corresponding to each distribution. This allows us to investigate how many data are required to have confidence in model parameter estimates in order to employ dynamical modeling to infer population dynamics. Our results suggest that a field researcher should collect at least 12 data points at the optimal times. Data collected according to this procedure along with dynamical modeling will allow us to estimate population dynamics from presence/absence-based data sets through the development of a scaling relationship. These Likert-type data sets are commonly collected by agricultural pest management consultants and are increasingly being used in ecoinformatics studies. By applying mathematical modeling with the relationship scale from the new data, we can then explore important integrated pest management questions using past and future presence/absence data sets.
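
The idea of comparing sampling designs by the standard errors of the resulting parameter estimates can be sketched as follows. The abstract does not specify the mite population model or the exact design criterion, so this example assumes a logistic growth curve with a known initial population, constant-variance measurement error, and asymptotic standard errors taken from the inverse Fisher information matrix. All names and parameter values are hypothetical.

```python
import math


def logistic(t, r, K, x0=1.0):
    """Logistic growth curve (an assumed stand-in for the population model)."""
    growth = math.exp(r * t)
    return K * x0 * growth / (K + x0 * (growth - 1.0))


def standard_errors(times, r, K, sigma=1.0, h=1e-6):
    """Asymptotic standard errors of (r, K) for a given set of sampling times.

    Sensitivities are approximated by central finite differences, and the
    2x2 Fisher information matrix F / sigma^2 is inverted in closed form.
    """
    f_rr = f_rk = f_kk = 0.0
    for t in times:
        d_r = (logistic(t, r + h, K) - logistic(t, r - h, K)) / (2 * h)
        d_k = (logistic(t, r, K + h) - logistic(t, r, K - h)) / (2 * h)
        f_rr += d_r * d_r
        f_rk += d_r * d_k
        f_kk += d_k * d_k
    det = f_rr * f_kk - f_rk * f_rk
    if det <= 0.0:
        raise ValueError("design does not identify both parameters")
    var_r = sigma ** 2 * f_kk / det
    var_k = sigma ** 2 * f_rr / det
    return math.sqrt(var_r), math.sqrt(var_k)


# Two hypothetical designs on the same interval: 5 points versus 9 points.
sparse = [0.0, 5.0, 10.0, 15.0, 20.0]
dense = sparse + [2.5, 7.5, 12.5, 17.5]
se_sparse = standard_errors(sparse, r=0.5, K=100.0)
se_dense = standard_errors(dense, r=0.5, K=100.0)
```

Adding sampling times can only add information under these assumptions, so the denser design never yields larger standard errors. An optimal design would go further and choose where the points fall, which this sketch does not attempt.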


Subject(s)
Pest Control/methods, Animals, Citrus/parasitology, Computer Simulation, Mathematical Concepts, Mites/pathogenicity, Models, Biological, Monte Carlo Method, Pest Control/statistics & numerical data, Plant Diseases/parasitology, Plant Diseases/prevention & control, Population Dynamics