Results 1 - 8 of 8
1.
J Chem Inf Model ; 63(10): 2948-2959, 2023 05 22.
Article in English | MEDLINE | ID: mdl-37125691

ABSTRACT

Predicting solubility of small molecules is a very difficult undertaking due to the lack of reliable and consistent experimental solubility data. It is well known that for a molecule in a crystal lattice to be dissolved, it must, first, dissociate from the lattice and then, second, be solvated. The melting point of a compound is proportional to the lattice energy, and the octanol-water partition coefficient (log P) is a measure of the compound's solvation efficiency. The CCDC's melting point dataset of almost one hundred thousand compounds was utilized to create widely applicable machine learning models of small molecule melting points. Using the general solubility equation, the aqueous thermodynamic solubilities of the same compounds can be predicted. The global model could be easily localized by adding additional melting point measurements for a chemical series of interest.
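The general solubility equation mentioned above is commonly written as log S = 0.5 - 0.01 (MP - 25) - log P, with the melting point MP in degrees Celsius and S in mol/L. The following is a minimal sketch of that standard form, not the authors' implementation; the function name and the example values are illustrative assumptions.

```python
def gse_log_solubility(melting_point_c: float, log_p: float) -> float:
    """Estimate log10 aqueous solubility (mol/L) from melting point and log P.

    Uses the commonly cited form of the general solubility equation:
        log S = 0.5 - 0.01 * (MP - 25) - log P
    """
    # Compounds that melt at or below 25 °C are treated as liquids,
    # so the crystal-lattice term is clamped at zero.
    lattice_term = 0.01 * max(melting_point_c - 25.0, 0.0)
    return 0.5 - lattice_term - log_p


# Hypothetical compound: melting point 125 °C, log P 2.0
example_log_s = gse_log_solubility(125.0, 2.0)
```

In a workflow like the one described, the melting point would come from the machine learning model and log P from a separate measurement or prediction; both inputs are assumptions here.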


Subject(s)
Machine Learning , Water , Solubility , Water/chemistry , Octanols/chemistry
2.
J Chem Inf Model ; 62(10): 2446-2455, 2022 05 23.
Article in English | MEDLINE | ID: mdl-35522137

ABSTRACT

A method is presented for an ultrafast shape-based search workflow for the screening of large compound collections, such as those of vendors. The three-dimensional shape of a molecule dictates its biological activity by enabling the molecule to fit into binding pockets of proteins. Quite often, distinctly different chemical compounds that have similar shapes can bind in a similar way. OpenEye pioneered an algorithm for comparing shapes of molecules by overlaying them in a computer and measuring differences between a query molecule and a target molecule. Overlaying shapes is a computationally intensive process and represents a bottleneck in searching for similar molecules. More recent publications describe alternative methods of overlaying molecules, which are accomplished by comparing shape-based descriptors. These methods were implemented in the Open Drug Discovery Toolkit (ODDT) package. We utilized a combination of open-source software packages like ODDT and RDKit to implement a workflow for ultrafast conformer generation and matching that does not require storing precomputed conformers on the file system or in memory. Moreover, the generated descriptors could be optionally stored in MongoDB for performing searches in the future. To speed up the search, we created a set of indexes from the transformed shape-based descriptors. We are in the process of calculating descriptors for multiple vendors, including Enamine's "REAL" collection of 1.2 billion compounds. Currently, the shape similarity search on more than 70 million compounds takes less than 8 s! We exemplified our methodology with the screen of compounds that can act as putative TLR4 agonists. The search was based on a literature-known small-molecule TLR4 agonist series. In due course, we identified compounds with novel structural motifs that were active in mouse and human TLR4 reporter cell lines.
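The abstract does not say which shape-based descriptor was used. As an illustration only, the sketch below assumes a descriptor in the style of ultrafast shape recognition (USR): twelve numbers built from the distributions of atomic distances to four reference points, compared with an inverse Manhattan-distance score. It is written in plain Python and is not the ODDT implementation; the function names and the choice of moments are assumptions.

```python
import math


def _moments(distances):
    """Mean, standard deviation, and signed cube root of the third central moment."""
    n = len(distances)
    mean = sum(distances) / n
    variance = sum((d - mean) ** 2 for d in distances) / n
    third = sum((d - mean) ** 3 for d in distances) / n
    # Signed cube root keeps the third moment in distance units.
    cube_root = math.copysign(abs(third) ** (1.0 / 3.0), third)
    return [mean, math.sqrt(variance), cube_root]


def usr_descriptor(coords):
    """Build a 12-number USR-style shape descriptor from 3D atom coordinates."""
    n = len(coords)
    centroid = tuple(sum(atom[i] for atom in coords) / n for i in range(3))
    to_centroid = [math.dist(atom, centroid) for atom in coords]
    closest = coords[to_centroid.index(min(to_centroid))]
    farthest = coords[to_centroid.index(max(to_centroid))]
    to_farthest = [math.dist(atom, farthest) for atom in coords]
    far_from_farthest = coords[to_farthest.index(max(to_farthest))]

    descriptor = []
    for reference in (centroid, closest, farthest, far_from_farthest):
        descriptor += _moments([math.dist(atom, reference) for atom in coords])
    return descriptor


def usr_similarity(a, b):
    """Score in (0, 1]; 1.0 means identical descriptors."""
    mean_abs_diff = sum(abs(x - y) for x, y in zip(a, b)) / len(a)
    return 1.0 / (1.0 + mean_abs_diff)
```

Because the descriptor depends only on interatomic distances, no overlay or alignment is needed, and fixed-length vectors of this kind are easy to store and index in a database.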


Subject(s)
Software , Toll-Like Receptor 4 , Algorithms , Animals , Drug Discovery , Mice , Workflow
3.
J Chem Inf Model ; 59(10): 4450-4459, 2019 10 28.
Article in English | MEDLINE | ID: mdl-31518124

ABSTRACT

Profile-quantitative structure-activity relationship (pQSAR) is a massively multitask, two-step machine learning method with unprecedented scope, accuracy, and applicability domain. In step one, a "profile" of conventional single-assay random forest regression models is trained on a very large number of biochemical and cellular pIC50 assays using Morgan 2 substructural fingerprints as compound descriptors. In step two, a panel of partial least squares (PLS) models is built using the profile of pIC50 predictions from those random forest regression models as compound descriptors (hence the name). The method was previously described for a panel of 728 biochemical and cellular kinase assays; we have now built an enormous pQSAR from 11 805 diverse Novartis (NVS) IC50 and EC50 assays. This large number of assays, and hence of compound descriptors for PLS, dictated reducing the profile by only including random forest regression models whose predictions correlate with the assay being modeled. The random forest regression and pQSAR models were evaluated with our "realistically novel" held-out test set, whose median average similarity to the nearest training set member across the 11 805 assays was only 0.34, comparable to the novelty of compounds actually selected from virtual screens. For the 11 805 single-assay random forest regression models, the median correlation of prediction with the experiment was only r²_ext = 0.05, virtually random, and only 8% of the models achieved our standard success threshold of r²_ext = 0.30. For pQSAR, the median correlation was r²_ext = 0.53, comparable to four-concentration experimental IC50s, and 72% of the models met our r²_ext > 0.30 standard, totaling 8558 successful models. The successful models included assays from all of the 51 annotated target subclasses, as well as 4196 phenotypic assays, indicating that pQSAR can be applied to virtually any disease area. Every month, all models are updated to include new measurements, and predictions are made for 5.5 million NVS compounds, totaling 50 billion predictions. Common uses have included virtual screening, selectivity design, toxicity and promiscuity prediction, mechanism-of-action prediction, and others. Several such actual applications are described.
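The profile-reduction step described above (keeping only the random forest models whose predictions correlate with the assay being modeled) might look roughly like the sketch below. The abstract gives neither the correlation measure nor the cutoff, so the squared Pearson correlation, the `min_r2` value, and all names are assumptions. Leaving the target assay's own model out of the profile follows the pQSAR 2.0 description in entry 4.

```python
import math


def pearson_r(x, y):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def prune_profile(profile, measured, target_assay, min_r2=0.05):
    """Select profile columns to use as PLS descriptors for one assay.

    profile: dict mapping assay name -> predicted pIC50s for the training compounds
    measured: measured pIC50s of the same compounds in the target assay
    min_r2: hypothetical cutoff on squared correlation (not from the paper)
    """
    kept = []
    for assay, predictions in profile.items():
        if assay == target_assay:
            continue  # the target assay's own model is left out of its profile
        # Squared correlation also keeps strongly anti-correlated assays,
        # which a linear method such as PLS can still exploit.
        if pearson_r(predictions, measured) ** 2 >= min_r2:
            kept.append(assay)
    return kept
```

The surviving columns would then serve as the independent variables of the step-two PLS model for that assay.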


Subject(s)
Drug Discovery/methods , Machine Learning , Algorithms , Biological Assay , Dose-Response Relationship, Drug , Inhibitory Concentration 50 , Logistic Models , Models, Chemical , Proteins/chemistry , Quantitative Structure-Activity Relationship
4.
J Chem Inf Model ; 57(8): 2077-2088, 2017 08 28.
Article in English | MEDLINE | ID: mdl-28651433

ABSTRACT

While conventional random forest regression (RFR) virtual screening models appear to have excellent accuracy on random held-out test sets, they prove lacking in actual practice. Analysis of 18 historical virtual screens showed that random test sets are far more similar to their training sets than are the compounds project teams actually order. A new, cluster-based "realistic" training/test set split, which mirrors the chemical novelty of real-life virtual screens, recapitulates the poor predictive power of RFR models in real projects. The original Profile-QSAR (pQSAR) method greatly broadened the domain of applicability over conventional models by using as independent variables a profile of activity predictions from all historical assays in a large protein family. However, the accuracy still fell short of experiment on realistic test sets. The improved "pQSAR 2.0" method replaces probabilities of activity from naïve Bayes categorical models at several thresholds with predicted IC50s from RFR models. Unexpectedly, the high accuracy also requires removing the RFR model for the actual assay of interest from the independent variable profile. With these improvements, pQSAR 2.0 activity predictions are now statistically comparable to medium-throughput four-concentration IC50 measurements even on the realistic test set. Beyond the yes/no activity predictions from a typical high-throughput screen (HTS) or conventional virtual screen, these semiquantitative IC50 predictions allow for predicted potency, ligand efficiency, lipophilic efficiency, and selectivity against antitargets, greatly facilitating hitlist triaging and enabling virtual screening panels such as toxicity panels and overall promiscuity predictions.
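Two of the derived quantities named above, ligand efficiency and lipophilic efficiency, follow directly from a predicted pIC50. The sketch below uses their commonly quoted definitions and treats pIC50 as a stand-in for binding free energy, which is an approximation. The function names and example values are illustrative and do not come from the paper.

```python
def ligand_efficiency(pic50: float, heavy_atoms: int) -> float:
    """Approximate ligand efficiency in kcal/mol per heavy atom.

    1.37 kcal/mol is roughly 2.303 * R * T near 298 K, so 1.37 * pIC50
    approximates the binding free energy when IC50 stands in for Kd.
    """
    return 1.37 * pic50 / heavy_atoms


def lipophilic_efficiency(pic50: float, log_p: float) -> float:
    """Lipophilic efficiency: potency corrected for lipophilicity."""
    return pic50 - log_p


# Hypothetical hit: predicted pIC50 of 7.0, 25 heavy atoms, log P of 3.0
example_le = ligand_efficiency(7.0, 25)
example_lipe = lipophilic_efficiency(7.0, 3.0)
```

With semiquantitative potency predictions, metrics like these can be computed for an entire hit list before any compound is ordered.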


Subject(s)
Drug Evaluation, Preclinical/methods , Protein Kinase Inhibitors/chemistry , Protein Kinase Inhibitors/pharmacology , Quantitative Structure-Activity Relationship , Inhibitory Concentration 50 , Machine Learning , Regression Analysis
5.
J Chem Inf Model ; 54(2): 377-86, 2014 Feb 24.
Article in English | MEDLINE | ID: mdl-24437550

ABSTRACT

A phenotypic screen (PS) is used to identify compounds causing a desired phenotype in a complex biological system where mechanisms and targets are largely unknown. Deconvoluting the mechanism of action of actives and identification of relevant targets and pathways remains a formidable challenge. Current methods fail to use the rich information available regarding compounds and their targets in a systematic way for this deconvolution. We have developed an enrichment analysis algorithm to identify targets associated with the desired phenotype in a rigorous data-driven manner using actives and hundreds of thousands of inactives in a PS, as well as results of thousands of available legacy target-based screens in an institution. Our method quantifies association between the PS and targets while reducing sampling bias, which leads to identification of novel targets, additional chemical matter, and appropriate assays. Its use is illustrated using two examples from our laboratories: TRAIL and DNA fragmentation. Enrichment analysis of these PSs is discussed using both biological pathway analysis and known cell biology to demonstrate the value of our method. We believe this enrichment analysis method is an indispensable tool for the analysis of PSs.
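The abstract does not specify the statistic behind the enrichment analysis. One common way to test whether compounds known to hit a given target are over-represented among phenotypic-screen actives is a one-sided hypergeometric test, sketched below. This swapped-in technique does not include the sampling-bias reduction the authors describe, and all names are assumptions.

```python
from math import comb


def hypergeom_enrichment_p(total, total_actives, target_hits, target_active_hits):
    """One-sided hypergeometric p-value for target enrichment.

    total: compounds tested in the phenotypic screen (actives + inactives)
    total_actives: compounds active in the phenotypic screen
    target_hits: screened compounds known to hit the target in legacy assays
    target_active_hits: compounds that hit the target AND are PS actives

    Returns P(X >= target_active_hits) if the target hits were drawn at random.
    """
    upper = min(total_actives, target_hits)
    denominator = comb(total, target_hits)
    tail = sum(
        comb(total_actives, k) * comb(total - total_actives, target_hits - k)
        for k in range(target_active_hits, upper + 1)
    )
    return tail / denominator
```

A small p-value suggests an association between the target and the phenotype. When many targets are tested, the p-values would still need a multiple-testing correction.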


Subject(s)
Drug Evaluation, Preclinical/methods , Phenotype , Algorithms , DNA Fragmentation/drug effects , High-Throughput Screening Assays , TNF-Related Apoptosis-Inducing Ligand/metabolism
6.
J Comput Chem ; 32(9): 1944-51, 2011 Jul 15.
Article in English | MEDLINE | ID: mdl-21455963

ABSTRACT

Web services are a new technology that enables the integration of applications running on different platforms, primarily by using XML for communication among different computers over the Internet. A large number of applications were designed as stand-alone systems before the concept of Web services was introduced, and integrating them into larger computational networks is a challenge. A generally applicable method of wrapping stand-alone applications into Web services was developed and is described. To test the technology, it was applied to QikProp for DOS (Windows). Although the performance of the application did not change when it was delivered as a Web service, this form of deployment offered several advantages, such as simplified and centralized maintenance, a smaller number of licenses, and practically no training for the end user. Because almost any legacy application can be wrapped as a Web service using the described approach, this form of delivery may be recommended as a global alternative to traditional deployment solutions.
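The wrapping idea can be illustrated with the Python standard library alone: an HTTP handler passes the request body to a command-line program and returns its output. This is a minimal sketch, not the method from the paper, which used XML-based Web services. The stand-in command, the plain-text payload, and the host and port are all assumptions.

```python
import subprocess
import sys
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical stand-in for a legacy command-line program: it upper-cases stdin.
LEGACY_COMMAND = [
    sys.executable, "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]


def run_legacy_tool(input_text, command=LEGACY_COMMAND, timeout=60):
    """Run the stand-alone program once, feeding input on stdin."""
    completed = subprocess.run(
        command, input=input_text, capture_output=True,
        text=True, timeout=timeout, check=True,
    )
    return completed.stdout


class LegacyToolHandler(BaseHTTPRequestHandler):
    """Expose the legacy program as a POST endpoint."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length).decode("utf-8")
        try:
            result, status = run_legacy_tool(body), 200
        except (subprocess.CalledProcessError, subprocess.TimeoutExpired) as exc:
            result, status = f"legacy tool failed: {exc}", 500
        payload = result.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(host="127.0.0.1", port=8080):
    """Start the wrapper service; blocks until interrupted."""
    HTTPServer((host, port), LegacyToolHandler).serve_forever()
```

Calling `serve()` on a central machine means the wrapped program is installed and maintained in one place, which reflects the maintenance and licensing advantages the abstract lists.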


Subject(s)
Internet , Software , Technology
7.
J Inorg Biochem ; 93(3-4): 265-70, 2003 Jan 15.
Article in English | MEDLINE | ID: mdl-12576290

ABSTRACT

Emergence of chloroquine-resistant Plasmodium falciparum strains necessitates discovery of novel antimalarial drugs, especially if the agents can be synthesized from commercially available, inexpensive precursors via short synthetic routes. While exploring structure-activity relationships, we found a gallium(III) complex, [(1,12-bis(2-hydroxy-5-methoxybenzyl)-1,5,8,12-tetraazadodecane)-gallium(III)](+) [Ga-5-Madd](+), 1, that possessed antimalarial efficacy. As in previously reported complexes, the crystal structure of 1 revealed gallium(III) in a symmetrical octahedral environment, surrounded by four secondary amine nitrogen atoms in the equatorial plane and two axial oxygen atoms. In contrast to a previously reported complex, [Ga-3-Madd](+), this novel metallo-antimalarial 1 possessed modest efficacy against chloroquine-sensitive HB3 Plasmodium lines. Thus, slight variation in the positions of methoxy functionalities on the aromatic rings of the organic scaffold dramatically altered specificity, thereby suggesting a targeted (e.g., transporter- or receptor-mediated) rather than non-specific (e.g., pH or other gradient-mediated) mechanism of action for these agents.


Subject(s)
Antimalarials/chemical synthesis , Gallium/chemistry , Organometallic Compounds/chemical synthesis , Plasmodium falciparum/drug effects , Amines , Animals , Antimalarials/chemistry , Antimalarials/pharmacology , Chloroquine/pharmacology , Crystallography, X-Ray , Dose-Response Relationship, Drug , Drug Resistance , Ligands , Molecular Structure , Organometallic Compounds/chemistry , Organometallic Compounds/pharmacology , Phenol
...