1.
Commun Chem ; 7(1): 134, 2024 Jun 12.
Article in English | MEDLINE | ID: mdl-38866916

ABSTRACT

Recent advances in machine learning (ML) have led to newer model architectures, including transformers (large language models, LLMs), which show state-of-the-art results in text generation and image analysis, and few-shot learning (FSLC) models, which offer predictive power with extremely small datasets. These new architectures may offer promise, yet the 'no free lunch' theorem suggests that no single model algorithm can outperform at all possible tasks. Here, we explore the capabilities of classical (SVR), FSLC, and transformer models (MolBART) over a range of dataset tasks and show a 'goldilocks zone' for each model type, in which dataset size and feature distribution (i.e. dataset "diversity") determine the optimal algorithm strategy. When datasets are small (<50 molecules), FSLC models tend to outperform both classical ML and transformers. When datasets are small-to-medium sized (50-240 molecules) and diverse, transformers outperform both classical models and few-shot learning. Finally, when datasets are larger and of sufficient size, classical models perform best, suggesting that the optimal model to choose likely depends on the dataset available, its size and diversity. These findings may help to answer the perennial question of which ML algorithm is to be used when faced with a new dataset.
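The size and diversity thresholds in this abstract can be read as a simple rule of thumb. The sketch below only paraphrases the abstract: the function name `suggest_model_family` and the boolean `is_diverse` flag are hypothetical, and the abstract does not say how diversity was quantified or what to prefer for mid-sized datasets that are not diverse.

```python
def suggest_model_family(n_molecules: int, is_diverse: bool) -> str:
    """Paraphrase the abstract's 'goldilocks zones' as a lookup.

    Assumption: diversity is reduced to a yes/no flag here; the abstract
    describes it only as the dataset's feature distribution.
    """
    if n_molecules < 0:
        raise ValueError("n_molecules must be non-negative")
    if n_molecules < 50:
        # Small datasets: few-shot learning tended to do best.
        return "few-shot learning"
    if n_molecules <= 240:
        # Small-to-medium datasets: transformers did best when diverse.
        # The abstract makes no claim for the non-diverse case.
        return "transformer" if is_diverse else "not stated in the abstract"
    # Larger datasets: classical models (e.g. SVR) did best.
    return "classical ML"


if __name__ == "__main__":
    for size, diverse in [(30, True), (120, True), (120, False), (1000, False)]:
        print(size, diverse, "->", suggest_model_family(size, diverse))
```

This is a reading aid, not a substitute for benchmarking candidate models on the dataset at hand.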

2.
Chem Res Toxicol ; 36(2): 188-201, 2023 02 20.
Article in English | MEDLINE | ID: mdl-36737043

ABSTRACT

Acetylcholinesterase (AChE) is an important enzyme and target for human therapeutics, environmental safety, and global food supply. Inhibitors of this enzyme are also used for pest elimination and can be misused for suicide or chemical warfare. Adverse effects of AChE pesticides on nontarget organisms, such as fish, amphibians, and humans, have also occurred as a result of biomagnification of these toxic compounds. We have exhaustively curated the public AChE inhibition data and developed machine learning classification models for seven different species. Each set of models was built using up to nine different algorithms for each species and Morgan fingerprints (ECFP6) with an activity cutoff of 1 µM. The human (4075 compounds) and eel (5459 compounds) consensus models predicted AChE inhibition activity using external test sets from literature data with 81% and 82% accuracy, respectively, while the reciprocal cross (76% and 82% accuracy) was not species-specific. In addition, we also created machine learning regression models for human and eel AChE inhibition to return a predicted IC50 value for a queried molecule. We did observe an improved species specificity in the regression models, where a human support vector regression model of human AChE inhibition (3652 compounds) predicted the IC50s of the human test set to a better extent than the eel regression model (4930 compounds) on the same test set, based on mean absolute percentage error (MAPE = 9.73% vs 13.4%). The predictive power of these models certainly benefits from increasing the chemical diversity of the training set, as evidenced by expanding our human classification model by incorporating data from the Tox21 library of compounds. Of the 10 compounds we tested that were predicted active by this expanded model, two showed >80% inhibition at 100 µM.
This machine learning approach therefore offers the ability to rapidly score massive libraries of molecules against the models for AChE inhibition that can then be selected for future in vitro testing to identify potential toxins. It also enabled us to create a public website, MegaAChE, for single-molecule predictions of AChE inhibition using these models at megaache.collaborationspharma.com.
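For readers unfamiliar with the metric used to compare the regression models, mean absolute percentage error can be computed as in the minimal sketch below. The function names and the toy numbers are illustrative only. The abstract does not state whether MAPE was calculated on raw IC50 values or on a log scale, nor whether the 1 µM cutoff is inclusive, so both are assumptions here.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.

    MAPE = 100 / n * sum(|(true - pred) / true|)
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(t == 0 for t in y_true):
        raise ValueError("MAPE is undefined when a true value is zero")
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


def is_active(ic50_um, cutoff_um=1.0):
    """Binary label for classification models.

    Assumption: 'active' means IC50 at or below the cutoff; the abstract
    gives the 1 µM cutoff but not whether it is inclusive.
    """
    return ic50_um <= cutoff_um


if __name__ == "__main__":
    # Toy values, not data from the paper.
    print(mape([2.0, 4.0], [1.0, 5.0]))   # 37.5
    print(is_active(0.3), is_active(12.0))
```

A lower MAPE means predictions sit closer to the measured values in relative terms, which is the sense in which the human model outperformed the eel model on the human test set.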


Subject(s)
Acetylcholinesterase , Cholinesterase Inhibitors , Animals , Humans , Acetylcholinesterase/chemistry , Cholinesterase Inhibitors/chemistry , Fishes , Algorithms , Machine Learning
3.
Mol Pharm ; 19(11): 4320-4332, 2022 11 07.
Article in English | MEDLINE | ID: mdl-36269563

ABSTRACT

The uptake transporter OATP1B1 (SLCO1B1) is largely localized to the sinusoidal membrane of hepatocytes and is a known victim of unwanted drug-drug interactions. Computational models are useful for identifying potential substrates and/or inhibitors of clinically relevant transporters. Our goal was to generate OATP1B1 in vitro inhibition data for [³H] estrone-3-sulfate (E3S) transport in CHO cells and use it to build machine learning models to facilitate a comparison of seven different classification models [Deep learning, Adaboosted decision trees, Bernoulli naïve bayes, k-nearest neighbors (knn), random forest, support vector classifier (SVC), logistic regression (lreg), and XGBoost (xgb)] using ECFP6 fingerprints to perform 5-fold, nested cross validation. In addition, we compared models using 3D pharmacophores, simple chemical descriptors alone or plus ECFP6, as well as ECFP4 and ECFP8 fingerprints. Several machine learning algorithms (SVC, lreg, xgb, and knn) had excellent nested cross validation statistics, particularly for accuracy, AUC, and specificity. An external test set containing 207 unique compounds not in the training set demonstrated that at every threshold SVC outperformed the other algorithms based on a rank normalized score. A prospective validation test set was chosen using prediction scores from the SVC models with ECFP fingerprints and tested in vitro; 15 of 19 compounds (84% accuracy) predicted as active (≥20% inhibition) showed inhibition. Of these compounds, six (abamectin, asiaticoside, berbamine, doramectin, mobocertinib, and umbralisib) appear to be novel inhibitors of OATP1B1 not previously reported. These validated machine learning models can now be used to make predictions for drug-drug interactions for human OATP1B1 alongside other machine learning models for important drug transporters in our MegaTrans software.
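The "5-fold, nested cross validation" mentioned above separates model tuning from performance estimation: each outer test fold is held out completely, and hyperparameters are chosen using inner folds drawn only from the remaining data. The following is a minimal standard-library sketch of the index bookkeeping, not the authors' pipeline. The helper names, the inner fold count of 5, and the unstratified shuffling are all assumptions.

```python
import random


def k_fold_indices(indices, k, seed=0):
    """Shuffle the indices and deal them into k near-equal folds."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def nested_cv_splits(n_samples, outer_k=5, inner_k=5, seed=0):
    """Yield (outer_train, outer_test, inner_splits) per outer fold.

    inner_splits is a list of (inner_train, inner_val) pairs built from
    outer_train only, so the outer test fold never influences tuning.
    """
    outer_folds = k_fold_indices(range(n_samples), outer_k, seed)
    for i, outer_test in enumerate(outer_folds):
        outer_train = [idx for j, fold in enumerate(outer_folds)
                       if j != i for idx in fold]
        inner_folds = k_fold_indices(outer_train, inner_k, seed + i + 1)
        inner_splits = []
        for m, inner_val in enumerate(inner_folds):
            inner_train = [idx for n, fold in enumerate(inner_folds)
                           if n != m for idx in fold]
            inner_splits.append((inner_train, inner_val))
        yield outer_train, outer_test, inner_splits


if __name__ == "__main__":
    for train, test, inner in nested_cv_splits(23):
        print(len(train), len(test), len(inner))
```

In practice one would fit each candidate hyperparameter setting on the inner training indices, score it on the inner validation indices, refit the best setting on the full outer training set, and report metrics such as accuracy, AUC, and specificity on the outer test fold.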


Subject(s)
Algorithms , Machine Learning , Animals , Cricetinae , Humans , Bayes Theorem , Cricetulus , Software , Support Vector Machine
4.
Drug Discov Today ; 27(11): 103351, 2022 11.
Article in English | MEDLINE | ID: mdl-36096360

ABSTRACT

DNA-encoded libraries (DELs) allow starting chemical matter to be identified in drug discovery. The volume of experimental data generated also makes DELs an attractive resource for machine learning (ML). ML allows modeling complex relationships between compounds and numerical endpoints, such as the binding to a target measured by DELs. DELs could also empower other areas of drug discovery. Here, we propose that DELs and ML could be combined to model binding to off-targets, enabling better predictive toxicology. With enough data, ML models can make accurate predictions across a vast chemical space, and they can be reused and expanded across projects. Although there are limitations, more general toxicology models could be applied earlier during drug discovery, illuminating safety liabilities at a lower cost.


Subject(s)
DNA , Small Molecule Libraries , Small Molecule Libraries/chemistry , Drug Discovery , Machine Learning
5.
Mol Pharm ; 19(2): 674-689, 2022 02 07.
Article in English | MEDLINE | ID: mdl-34964633

ABSTRACT

Tuberculosis (TB) is a major global health challenge, with approximately 1.4 million deaths per year. There is still a need to develop novel treatments for patients infected with Mycobacterium tuberculosis (Mtb). There have been many large-scale phenotypic screens that have led to the identification of thousands of new compounds. Yet, there is very limited investment in TB drug discovery which points to the need for new methods to increase the efficiency of drug discovery against Mtb. We have used machine learning approaches to learn from the public Mtb data, resulting in many data sets and models with robust enrichment and hit rates leading to the discovery of new active compounds. Recently, we have curated predominantly small-molecule Mtb data and developed new machine learning classification models with 18 886 molecules at different activity cutoffs. We now describe the further validation of these Bayesian models using a library of over 1000 molecules synthesized as part of EU-funded New Medicines for TB and More Medicines for TB programs. We highlight molecular features which are enriched in these active compounds. In addition, we provide new regression and classification models that can be used for scoring compound libraries or used to design new molecules. We have also visualized these molecules in the context of known molecular targets and identified clusters in chemical property space, which may aid in future target identification efforts. Finally, we are also making these data sets publicly available, representing a significant increase to the available Mtb inhibition data in the public domain.


Subject(s)
Mycobacterium tuberculosis , Tuberculosis , Antitubercular Agents/chemistry , Bayes Theorem , Humans , Machine Learning , Tuberculosis/drug therapy
6.
Pharm Res ; 36(9): 137, 2019 Jul 22.
Article in English | MEDLINE | ID: mdl-31332533

ABSTRACT

PURPOSE: Pitt Hopkins Syndrome (PTHS) is a rare genetic disorder caused by mutations of a specific gene, transcription factor 4 (TCF4), located on chromosome 18. PTHS results in individuals that have moderate to severe intellectual disability, with most exhibiting psychomotor delay. PTHS also exhibits features of autistic spectrum disorders, which are characterized by the impaired ability to communicate and socialize. PTHS is comorbid with a higher prevalence of epileptic seizures which can be present from birth or which commonly develop in childhood. Attenuated or absent TCF4 expression results in increased translation of peripheral ion channels Kv7.1 and Nav1.8 which triggers an increase in after-hyperpolarization and altered firing properties. METHODS: We now describe a high throughput screen (HTS) of 1280 approved drugs and machine learning models developed from this data. The ion channels were expressed in either CHO (Kv7.1) or HEK293 (Nav1.8) cells and the HTS used either ⁸⁶Rb⁺ efflux (Kv7.1) or a FLIPR assay (Nav1.8). RESULTS: The HTS delivered 55 inhibitors of Kv7.1 (4.2% hit rate) and 93 inhibitors of Nav1.8 (7.2% hit rate) at a screening concentration of 10 µM. These datasets also enabled us to generate and validate Bayesian machine learning models for these ion channels. We also describe a structure activity relationship for several dihydropyridine compounds as inhibitors of Nav1.8. CONCLUSIONS: This work could lead to the potential repurposing of nicardipine or other dihydropyridine calcium channel antagonists as potential treatments for PTHS acting via Nav1.8, as there are currently no approved treatments for this rare disorder.


Subject(s)
Dihydropyridines/pharmacology , Drug Repositioning/methods , Hyperventilation/drug therapy , Intellectual Disability/drug therapy , KCNQ1 Potassium Channel/antagonists & inhibitors , NAV1.8 Voltage-Gated Sodium Channel/metabolism , Potassium Channel Blockers/pharmacology , Sodium Channel Blockers/pharmacology , Voltage-Gated Sodium Channel Blockers/pharmacology , Animals , Bayes Theorem , CHO Cells , Cricetulus , Dihydropyridines/chemistry , Facies , HEK293 Cells , Humans , KCNQ1 Potassium Channel/metabolism , Machine Learning , Potassium Channel Blockers/chemistry , Small Molecule Libraries/chemistry , Sodium Channel Blockers/chemistry , Structure-Activity Relationship , Voltage-Gated Sodium Channel Blockers/chemistry