1.
J Chem Inf Model ; 63(14): 4253-4265, 2023 07 24.
Article in English | MEDLINE | ID: mdl-37405398

ABSTRACT

The past decade has seen a number of impressive developments in predictive chemistry and reaction informatics driven by machine learning applications to computer-aided synthesis planning. While many of these developments have been made even with relatively small, bespoke data sets, in order to advance the role of AI in the field at scale, there must be significant improvements in the reporting of reaction data. Currently, the majority of publicly available data is reported in an unstructured format and heavily imbalanced toward high-yielding reactions, which influences the types of models that can be successfully trained. In this Perspective, we analyze several data curation and sharing initiatives that have seen success in chemistry and molecular biology. We discuss several factors that have contributed to their success and how we can take lessons from these case studies and apply them to reaction data. Finally, we spotlight the Open Reaction Database and summarize key actions the community can take toward making reaction data more findable, accessible, interoperable, and reusable (FAIR), including the use of mandates from funding agencies and publishers.


Subjects
Data Curation, Informatics, Factual Databases, Information Dissemination
2.
J Med Chem ; 65(10): 7073-7087, 2022 05 26.
Article in English | MEDLINE | ID: mdl-35511951

ABSTRACT

One application area of computational methods in drug discovery is the automated design of small molecules. Despite the large number of publications describing methods and their application in both retrospective and prospective studies, there is a lack of agreement on terminology and key attributes to distinguish these various systems. We introduce Automated Chemical Design (ACD) Levels to clearly define the level of autonomy along the axes of ideation and decision making. To fully illustrate this framework, we provide literature exemplars and place some notable methods and applications into the levels. The ACD framework provides a common language for describing automated small molecule design systems and enables medicinal chemists to better understand and evaluate such systems.


Subjects
Drug Discovery, Drug Discovery/methods, Prospective Studies, Retrospective Studies
3.
J Am Chem Soc ; 143(45): 18820-18826, 2021 11 17.
Article in English | MEDLINE | ID: mdl-34727496

ABSTRACT

Chemical reaction data in journal articles, patents, and even electronic laboratory notebooks are currently stored in various formats, often unstructured, which presents a significant barrier to downstream applications, including the training of machine-learning models. We present the Open Reaction Database (ORD), an open-access schema and infrastructure for structuring and sharing organic reaction data, including a centralized data repository. The ORD schema supports conventional and emerging technologies, from benchtop reactions to automated high-throughput experiments and flow chemistry. The data, schema, supporting code, and web-based user interfaces are all publicly available on GitHub. Our vision is that a consistent data representation and infrastructure to support data sharing will enable downstream applications that will greatly improve the state of the art with respect to computer-aided synthesis planning, reaction prediction, and other predictive chemistry tasks.
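As a rough illustration of what "structured" reaction data means in contrast to free text, the sketch below stores one reaction as typed fields and serializes it to JSON. The class and field names (`ReactionRecord`, `Compound`, `yield_percent`, and so on) and the example values are hypothetical and are not the ORD schema, which is defined in the project's GitHub repository.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class Compound:
    # Hypothetical fields, chosen for illustration only.
    smiles: str
    role: str  # e.g. "reactant", "solvent", "product"
    amount_mmol: Optional[float] = None


@dataclass
class ReactionRecord:
    # Hypothetical record layout; NOT the ORD schema.
    identifier: str
    inputs: List[Compound] = field(default_factory=list)
    products: List[Compound] = field(default_factory=list)
    temperature_celsius: Optional[float] = None
    yield_percent: Optional[float] = None

    def to_json(self) -> str:
        """Serialize the record so that other tools can parse it."""
        return json.dumps(asdict(self), sort_keys=True)


# Made-up example: acetic acid + ethanol -> ethyl acetate.
record = ReactionRecord(
    identifier="example-0001",
    inputs=[
        Compound("CC(=O)O", "reactant", 1.0),
        Compound("OCC", "reactant", 1.2),
    ],
    products=[Compound("CC(=O)OCC", "product")],
    temperature_celsius=80.0,
    yield_percent=12.0,
)
round_trip = json.loads(record.to_json())
```

Because every quantity has a named field, a downstream program can read the yield or the reactants directly instead of parsing prose.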

4.
mBio ; 11(6), 2020 11 10.
Article in English | MEDLINE | ID: mdl-33173007

ABSTRACT

Affordable and effective antiviral therapies are needed worldwide, especially against agents such as dengue virus that are endemic in underserved regions. Many antiviral compounds have been studied in cultured cells but are unsuitable for clinical applications due to pharmacokinetic profiles, side effects, or inconsistent efficacy across dengue serotypes. Such tool compounds can, however, aid in identifying clinically useful treatments. Here, computational screening (Rapid Overlay of Chemical Structures) was used to identify entries in an in silico database of safe-in-human compounds (SWEETLEAD) that display high chemical similarities to known inhibitors of dengue virus. Inhibitors of the dengue proteinase NS2B/3, the dengue capsid, and the host autophagy pathway were used as query compounds. Three FDA-approved compounds that resemble the tool molecules structurally, cause little toxicity, and display strong antiviral activity in cultured cells were selected for further analysis. Pyrimethamine (50% inhibitory concentration [IC50] = 1.2 µM), like the dengue proteinase inhibitor ARDP0006 to which it shows structural similarity, inhibited intramolecular NS2B/3 cleavage. Lack of toxicity early in infection allowed testing in mice, in which pyrimethamine also reduced viral loads. Niclosamide (IC50 = 0.28 µM), like dengue core inhibitor ST-148, affected structural components of the virion and inhibited early processes during infection. Vandetanib (IC50 = 1.6 µM), like cellular autophagy inhibitor spautin-1, blocked viral exit from cells and could be shown to extend survival in vivo. Thus, three FDA-approved compounds with promising utility for repurposing to treat dengue virus infections and their potential mechanisms were identified using computational tools and minimal phenotypic screening.

IMPORTANCE: No antiviral therapeutics are currently available for dengue virus infections. By computationally overlaying the three-dimensional (3D) chemical structures of compounds known to inhibit dengue virus over those of compounds known to be safe in humans, we identified three FDA-approved compounds that are attractive candidates for repurposing as antivirals. We identified targets for two previously identified antiviral compounds and revealed a previously unknown potential anti-dengue compound, vandetanib. This computational approach to analyze a highly curated library of structures has the benefits of speed and cost efficiency. It also leverages mechanistic work with query compounds used in biomedical research to provide strong hypotheses for the antiviral mechanisms of the safer hit compounds. This workflow to identify compounds with known safety profiles can be expanded to any biological activity for which a small-molecule query compound has been identified, potentially expediting the translation of basic research to clinical interventions.


Subjects
Antiviral Agents/pharmacology, Dengue Virus/drug effects, Dengue/virology, Animals, Pharmaceutical Databases, Dengue/drug therapy, Dengue Virus/genetics, Dengue Virus/physiology, Preclinical Drug Evaluation, Drug Repositioning, Humans, Mice, Inbred C57BL Mice, Viral Load/drug effects, Virus Replication/drug effects
6.
Sci Rep ; 10(1): 10478, 2020 Jun 23.
Article in English | MEDLINE | ID: mdl-32572065

ABSTRACT

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

7.
J Med Chem ; 63(16): 8857-8866, 2020 08 27.
Article in English | MEDLINE | ID: mdl-32525674

ABSTRACT

DNA-encoded small molecule libraries (DELs) have enabled discovery of novel inhibitors for many distinct protein targets of therapeutic value. We demonstrate a new approach applying machine learning to DEL selection data by identifying active molecules from large libraries of commercial and easily synthesizable compounds. We train models using only DEL selection data and apply automated or automatable filters to the predictions. We perform a large prospective study (∼2000 compounds) across three diverse protein targets: sEH (a hydrolase), ERα (a nuclear receptor), and c-KIT (a kinase). The approach is effective, with an overall hit rate of ∼30% at 30 µM and discovery of potent compounds (IC50 < 10 nM) for every target. The system makes useful predictions even for molecules dissimilar to the original DEL, and the compounds identified are diverse, predominantly drug-like, and different from known ligands. This work demonstrates a powerful new approach to hit-finding.


Subjects
DNA/chemistry, Drug Discovery/methods, Computer Neural Networks, Small Molecule Libraries/chemistry, Epoxide Hydrolases/antagonists & inhibitors, Estrogen Receptor alpha/antagonists & inhibitors, Ligands, Protein Kinase Inhibitors/chemistry, Proto-Oncogene Proteins c-kit/antagonists & inhibitors
8.
Sci Rep ; 9(1): 10752, 2019 07 24.
Article in English | MEDLINE | ID: mdl-31341196

ABSTRACT

We present a framework, which we call Molecule Deep Q-Networks (MolDQN), for molecule optimization by combining domain knowledge of chemistry and state-of-the-art reinforcement learning techniques (double Q-learning and randomized value functions). We directly define modifications on molecules, thereby ensuring 100% chemical validity. Further, we operate without pre-training on any dataset to avoid possible bias from the choice of that set. MolDQN achieves comparable or better performance against several other recently published algorithms for benchmark molecular optimization tasks. However, we also argue that many of these tasks are not representative of real optimization problems in drug discovery. Inspired by problems faced during medicinal chemistry lead optimization, we extend our model with multi-objective reinforcement learning, which maximizes drug-likeness while maintaining similarity to the original molecule. We further show the path through chemical space to achieve optimization for a molecule to understand how the model works.
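To make the double Q-learning idea mentioned above concrete, here is a minimal tabular sketch: one value estimate selects the next action and a second, separate estimate evaluates it. This is not the MolDQN implementation (which uses neural networks); the state and action names are hypothetical, and the weighted sum in `combined_reward` is only an assumed way of trading off two objectives, since the abstract does not state the exact form.

```python
from typing import Dict, Hashable, List

QTable = Dict[Hashable, Dict[Hashable, float]]


def double_q_target(
    reward: float,
    next_state: Hashable,
    next_actions: List[Hashable],
    q_online: QTable,
    q_target: QTable,
    gamma: float = 0.9,
    terminal: bool = False,
) -> float:
    """Double Q-learning target: the online estimate picks the next
    action, and the separate target estimate scores that action."""
    if terminal or not next_actions:
        return reward
    best = max(next_actions, key=lambda a: q_online[next_state][a])
    return reward + gamma * q_target[next_state][best]


def combined_reward(similarity: float, drug_likeness: float, weight: float) -> float:
    """Assumed scalarization of two objectives, with weight in [0, 1]."""
    return weight * similarity + (1.0 - weight) * drug_likeness


# Hypothetical state "s1" with two allowed molecule modifications.
q_online = {"s1": {"add_atom": 0.8, "add_bond": 0.5}}
q_target = {"s1": {"add_atom": 0.3, "add_bond": 0.9}}
target = double_q_target(
    0.1, "s1", ["add_atom", "add_bond"], q_online, q_target, gamma=0.5
)
```

In this example the online table prefers `add_atom`, so the target uses the target table's value for `add_atom` (0.3) rather than the target table's own maximum (0.9), which is the decoupling that reduces overestimation.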

10.
J Chem Theory Comput ; 13(11): 5255-5264, 2017 Nov 14.
Article in English | MEDLINE | ID: mdl-28926232

ABSTRACT

We investigate the impact of choosing regressors and molecular representations for the construction of fast machine learning (ML) models of 13 electronic ground-state properties of organic molecules. The performance of each regressor/representation/property combination is assessed using learning curves which report out-of-sample errors as a function of training set size with up to ∼118k distinct molecules. Molecular structures and properties at the hybrid density functional theory (DFT) level of theory come from the QM9 database [Ramakrishnan et al. Sci. Data 2014, 1, 140022] and include enthalpies and free energies of atomization, HOMO/LUMO energies and gap, dipole moment, polarizability, zero point vibrational energy, heat capacity, and the highest fundamental vibrational frequency. Various molecular representations have been studied (Coulomb matrix, bag of bonds, BAML and ECFP4, molecular graphs (MG)), as well as newly developed distribution based variants including histograms of distances (HD), angles (HDA/MARAD), and dihedrals (HDAD). Regressors include linear models (Bayesian ridge regression (BR) and linear regression with elastic net regularization (EN)), random forest (RF), kernel ridge regression (KRR), and two types of neural networks, graph convolutions (GC) and gated graph networks (GG). Out-of-sample errors are strongly dependent on the choice of representation, regressor, and molecular property. Electronic properties are typically best accounted for by MG and GC, while energetic properties are better described by HDAD and KRR. The specific combinations with the lowest out-of-sample errors in the ∼118k training set size limit are (free) energies and enthalpies of atomization (HDAD/KRR), HOMO/LUMO eigenvalue and gap (MG/GC), dipole moment (MG/GC), static polarizability (MG/GG), zero point vibrational energy (HDAD/KRR), heat capacity at room temperature (HDAD/KRR), and highest fundamental vibrational frequency (BAML/RF). We present numerical evidence that ML model predictions deviate from DFT (B3LYP) less than DFT (B3LYP) deviates from experiment for all properties. Furthermore, out-of-sample prediction errors with respect to hybrid DFT reference are on par with, or close to, chemical accuracy. The results suggest that ML models could be more accurate than hybrid DFT if explicitly electron correlated quantum (or experimental) data were available.
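Of the representations listed above, the Coulomb matrix is the simplest to write down. The sketch below follows its commonly cited definition (diagonal entries 0.5·Z^2.4, off-diagonal entries Z_i·Z_j divided by the interatomic distance); the exact variant, atom ordering, and padding used in the study are not stated in the abstract, and the H2 geometry is a made-up example with coordinates assumed to be in atomic units.

```python
import math
from typing import List, Sequence, Tuple

# (nuclear charge Z, Cartesian position)
Atom = Tuple[int, Tuple[float, float, float]]


def coulomb_matrix(atoms: Sequence[Atom]) -> List[List[float]]:
    """Build the Coulomb matrix for a list of atoms."""
    n = len(atoms)
    matrix = [[0.0] * n for _ in range(n)]
    for i, (z_i, r_i) in enumerate(atoms):
        for j, (z_j, r_j) in enumerate(atoms):
            if i == j:
                # Diagonal term depends only on the nuclear charge.
                matrix[i][j] = 0.5 * z_i ** 2.4
            else:
                # Off-diagonal term: pairwise nuclear repulsion.
                matrix[i][j] = z_i * z_j / math.dist(r_i, r_j)
    return matrix


# Made-up example: two hydrogen atoms 1.4 length units apart.
h2 = [(1, (0.0, 0.0, 0.0)), (1, (0.0, 0.0, 1.4))]
cm = coulomb_matrix(h2)
```

The resulting matrix is symmetric and invariant to translation and rotation of the molecule, which is why it can serve as an input to a regressor such as kernel ridge regression.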

11.
J Comput Aided Mol Des ; 30(8): 609-17, 2016 08.
Article in English | MEDLINE | ID: mdl-27624668

ABSTRACT

Rapid overlay of chemical structures (ROCS) is a standard tool for the calculation of 3D shape and chemical ("color") similarity. ROCS uses unweighted sums to combine many aspects of similarity, yielding parameter-free models for virtual screening. In this report, we decompose the ROCS color force field into color components and color atom overlaps, novel color similarity features that can be weighted in a system-specific manner by machine learning algorithms. In cross-validation experiments, these additional features significantly improve virtual screening performance relative to standard ROCS.
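The contrast between an unweighted sum and system-specific weights can be sketched in a few lines. This is not the ROCS implementation or the authors' model: the feature values and weights below are hypothetical, and in practice the weights would be fitted by a learning algorithm on labeled screening data.

```python
from typing import Sequence


def unweighted_score(shape: float, color: float) -> float:
    """Parameter-free combination: a plain sum of the two similarities."""
    return shape + color


def weighted_score(
    features: Sequence[float],
    weights: Sequence[float],
    bias: float = 0.0,
) -> float:
    """Linear model over shape plus individual color features.
    Each feature gets its own weight instead of an implicit weight of 1."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have equal length")
    return bias + sum(w * f for w, f in zip(weights, features))


# Hypothetical values: a shape similarity and two color components.
baseline = unweighted_score(0.5, 0.25)
learned = weighted_score([0.5, 0.2, 0.05], [1.0, 2.0, 0.0])
```

Here the second color component is weighted up and the third is ignored, which is the kind of target-specific emphasis an unweighted sum cannot express.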


Subjects
Computer-Aided Design, Drug Design, Machine Learning, Algorithms, Molecular Models, Molecular Structure, Pharmaceutical Preparations/chemistry
12.
J Comput Aided Mol Des ; 30(8): 595-608, 2016 08.
Article in English | MEDLINE | ID: mdl-27558503

ABSTRACT

Molecular "fingerprints" encoding structural information are the workhorse of cheminformatics and machine learning in drug discovery applications. However, fingerprint representations necessarily emphasize particular aspects of the molecular structure while ignoring others, rather than allowing the model to make data-driven decisions. We describe molecular graph convolutions, a machine learning architecture for learning from undirected graphs, specifically small molecules. Graph convolutions use a simple encoding of the molecular graph (atoms, bonds, distances, etc.), which allows the model to take greater advantage of information in the graph structure. Although graph convolutions do not outperform all fingerprint-based methods, they (along with other graph-based methods) represent a new paradigm in ligand-based virtual screening with exciting opportunities for future improvement.
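The core operation behind learning on a molecular graph can be illustrated with a single neighbor-aggregation step followed by an order-independent readout. This is a generic sketch, not the architecture described in the paper: it has no learned weights or nonlinearities, and the two-element atom features (a one-hot code for carbon or oxygen) are a hypothetical encoding.

```python
from typing import Dict, List, Tuple


def aggregate_neighbors(
    features: Dict[int, List[float]],
    bonds: List[Tuple[int, int]],
) -> Dict[int, List[float]]:
    """Add each atom's bonded neighbors' feature vectors to its own."""
    updated = {atom: list(vec) for atom, vec in features.items()}
    for a, b in bonds:  # bonds are undirected, so update both ends
        for k, value in enumerate(features[b]):
            updated[a][k] += value
        for k, value in enumerate(features[a]):
            updated[b][k] += value
    return updated


def readout(features: Dict[int, List[float]]) -> List[float]:
    """Sum over atoms, so the result does not depend on atom ordering."""
    return [sum(column) for column in zip(*features.values())]


# Heavy atoms of ethanol (C-C-O); features are [is_carbon, is_oxygen].
atoms = {0: [1.0, 0.0], 1: [1.0, 0.0], 2: [0.0, 1.0]}
bonds = [(0, 1), (1, 2)]
molecule_vector = readout(aggregate_neighbors(atoms, bonds))

# Same molecule with atoms numbered in the opposite direction (O-C-C).
relabeled = {0: [0.0, 1.0], 1: [1.0, 0.0], 2: [1.0, 0.0]}
relabeled_vector = readout(aggregate_neighbors(relabeled, bonds))
```

Both numberings give the same molecule-level vector, which is the property that lets a graph-based model work directly on atoms and bonds rather than on a fixed fingerprint.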


Subjects
Computer Graphics, Computer-Aided Design, Drug Design, Machine Learning, Computer Neural Networks, Ligands, Molecular Structure, Pharmaceutical Preparations/chemistry
13.
J Chem Inf Model ; 54(1): 5-15, 2014 Jan 27.
Article in English | MEDLINE | ID: mdl-24289274

ABSTRACT

Molecular similarity has been effectively applied to many problems in cheminformatics and computational drug discovery, but modern methods can be prohibitively expensive for large-scale applications. The SCISSORS method rapidly approximates measures of pairwise molecular similarity such as ROCS and LINGO Tanimotos, acting as a filter to quickly reduce the size of a problem. We report an in-depth analysis of SCISSORS performance, including a mapping of the SCISSORS error distribution, benchmarking, and investigation of several algorithmic modifications. We show that SCISSORS can accurately predict multiconformer similarity and suggest a method for estimating optimal SCISSORS parameters in a data set-specific manner. These results are a useful resource for researchers seeking to incorporate SCISSORS into molecular similarity applications.
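The filtering role described above can be sketched as follows, assuming each molecule has already been mapped to a short vector whose inner products approximate the expensive overlap measure (that embedding step, which is the heart of the method, is omitted here). The Tanimoto is then computed from inner products alone. The vectors, molecule names, and threshold are hypothetical.

```python
from typing import Dict, List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(u, v))


def approx_tanimoto(u: Sequence[float], v: Sequence[float]) -> float:
    """Tanimoto from inner products: <u,v> / (<u,u> + <v,v> - <u,v>)."""
    uv = dot(u, v)
    return uv / (dot(u, u) + dot(v, v) - uv)


def shortlist(
    query: Sequence[float],
    library: Dict[str, Sequence[float]],
    threshold: float,
) -> List[str]:
    """Keep only molecules whose approximate similarity passes the cutoff,
    so the expensive exact method runs on a smaller set."""
    return [
        name for name, vec in library.items()
        if approx_tanimoto(query, vec) >= threshold
    ]


# Hypothetical precomputed embeddings.
library = {"mol_a": [1.0, 0.0], "mol_b": [1.0, 1.0], "mol_c": [0.0, 2.0]}
query = [1.0, 0.0]
kept = shortlist(query, library, threshold=0.5)
```

Each comparison costs only a few multiplications, so the approximation is useful as a first pass even when its error is not negligible.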


Subjects
Chemical Databases, Chemical Models, Algorithms, Computational Biology, Drug Discovery, Preclinical Drug Evaluation, Molecular Structure, User-Computer Interface