Results 1 - 11 of 11
1.
ArXiv ; 2024 Jan 08.
Article in English | MEDLINE | ID: mdl-38259348

ABSTRACT

Protein design often begins with knowledge of a desired function from a motif, around which motif-scaffolding aims to construct a functional protein. Recently, generative models have achieved breakthrough success in designing scaffolds for a diverse range of motifs. However, the generated scaffolds tend to lack structural diversity, which can hinder success in wet-lab validation. In this work, we extend FrameFlow, an SE(3) flow matching model for protein backbone generation, to perform motif-scaffolding with two complementary approaches. The first is motif amortization, in which FrameFlow is trained with the motif as input using a data augmentation strategy. The second is motif guidance, which performs scaffolding using an estimate of the conditional score from FrameFlow and requires no additional training. Both approaches achieve an equivalent or higher success rate than previous state-of-the-art methods, with 2.5 times more structurally diverse scaffolds. Code: https://github.com/microsoft/frame-flow.

2.
Nat Commun ; 14(1): 6651, 2023 Oct 31.
Article in English | MEDLINE | ID: mdl-37907461

ABSTRACT

The lead optimization process in drug discovery campaigns is an arduous endeavour where the input of many medicinal chemists is weighed in order to reach a desired molecular property profile. Building the expertise to successfully drive such projects collaboratively is a very time-consuming process that typically spans many years within a chemist's career. In this work we aim to replicate this process by applying artificial intelligence learning-to-rank techniques on feedback that was obtained from 35 chemists at Novartis over the course of several months. We exemplify the usefulness of the learned proxies in routine tasks such as compound prioritization, motif rationalization, and biased de novo drug design. Annotated response data are provided, and the developed models and code are made available through a permissive open-source license.


Subjects
Artificial Intelligence ; Chemistry, Pharmaceutical ; Chemistry, Pharmaceutical/methods ; Intuition ; Drug Discovery/methods ; Drug Design ; Machine Learning
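
The learning-to-rank idea mentioned in the abstract of record 2 can be illustrated with a small pairwise model. The sketch below is not the authors' implementation: it assumes a linear scoring function trained with a pairwise logistic loss, and the two-dimensional descriptors and preference pairs are invented purely for illustration.

```python
import math


def train_pairwise_ranker(pairs, n_features, epochs=200, lr=0.1):
    """Fit a linear score s(x) = w . x from (preferred, other) pairs.

    Each pair says "the first item was preferred over the second".
    The loss per pair is -log(sigmoid(s(preferred) - s(other))).
    """
    w = [0.0] * n_features
    for _ in range(epochs):
        for x_win, x_lose in pairs:
            diff = [a - b for a, b in zip(x_win, x_lose)]
            margin = sum(wi * di for wi, di in zip(w, diff))
            p = 1.0 / (1.0 + math.exp(-margin))
            # Gradient descent step on the pairwise logistic loss.
            for i in range(n_features):
                w[i] += lr * (1.0 - p) * diff[i]
    return w


def score(w, x):
    """Learned proxy score: higher means more preferred."""
    return sum(wi * xi for wi, xi in zip(w, x))


# Hypothetical toy data: each compound is a 2-element descriptor vector,
# and each tuple records which of two compounds was preferred.
toy_pairs = [
    ([0.9, 0.1], [0.2, 0.8]),
    ([0.7, 0.3], [0.4, 0.9]),
    ([0.8, 0.5], [0.1, 0.4]),
]
learned_weights = train_pairwise_ranker(toy_pairs, n_features=2)
candidates = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]
ranked = sorted(candidates, key=lambda x: score(learned_weights, x), reverse=True)
```

Once trained, the score can be used to order unseen candidates, which is the sense in which a learned proxy could support compound prioritization.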
3.
J Cheminform ; 15(1): 67, 2023 Jul 25.
Article in English | MEDLINE | ID: mdl-37491407

ABSTRACT

Explainable machine learning is increasingly used in drug discovery to help rationalize compound property predictions. Feature attribution techniques are popular choices to identify which molecular substructures are responsible for a predicted property change. However, established molecular feature attribution methods have so far displayed low performance for popular deep learning algorithms such as graph neural networks (GNNs), especially when compared with simpler modeling alternatives such as random forests coupled with atom masking. To mitigate this problem, a modification of the regression objective for GNNs is proposed to specifically account for common core structures between pairs of molecules. The presented approach shows higher accuracy on a recently-proposed explainability benchmark. This methodology has the potential to assist with model explainability in drug discovery pipelines, particularly in lead optimization efforts where specific chemical series are investigated.
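
The atom-masking baseline mentioned in the abstract of record 3 can be sketched as an occlusion-style attribution: the importance of an atom is the change in the prediction when that atom is hidden from the model. The example below is a simplified illustration only. The model is a hypothetical additive scoring function rather than a trained random forest, and masking is approximated by deleting the atom from a list of element symbols.

```python
def toy_property_model(atoms):
    """Hypothetical stand-in for a trained property model.

    Scores a molecule given as a list of element symbols.
    """
    contributions = {"C": 0.1, "N": 0.5, "O": 0.7, "F": -0.3}
    return sum(contributions.get(symbol, 0.0) for symbol in atoms)


def atom_masking_attributions(atoms, model):
    """Attribute to each atom the prediction change caused by masking it."""
    full_prediction = model(atoms)
    attributions = []
    for i in range(len(atoms)):
        masked = atoms[:i] + atoms[i + 1:]  # hide atom i from the model
        attributions.append(full_prediction - model(masked))
    return attributions


molecule = ["C", "C", "O", "F"]
atom_scores = atom_masking_attributions(molecule, toy_property_model)
```

In practice the masking would be applied at the featurization step of a real molecule representation; the list-of-symbols form here is an assumption made to keep the sketch self-contained.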

4.
Sci Data ; 9(1): 273, 2022 Jun 07.
Article in English | MEDLINE | ID: mdl-35672335

ABSTRACT

Machine learning approaches in drug discovery, as well as in other areas of the chemical sciences, benefit from curated datasets of physical molecular properties. However, there currently is a lack of data collections featuring large bioactive molecules alongside first-principles quantum chemical information. The open-access QMugs (Quantum-Mechanical Properties of Drug-like Molecules) dataset fills this void. The QMugs collection comprises quantum mechanical properties of more than 665k biologically and pharmacologically relevant molecules extracted from the ChEMBL database, totaling ~2M conformers. QMugs contains optimized molecular geometries and thermodynamic data obtained via the semi-empirical method GFN2-xTB. Atomic and molecular properties are provided on both the GFN2-xTB and the density-functional levels of theory (DFT, ωB97X-D/def2-SVP). QMugs features molecules of significantly larger size than previously reported collections and comprises their respective quantum mechanical wave functions, including DFT density and orbital matrices. This dataset is intended to facilitate the development of models that learn from molecular data on different levels of theory while also providing insight into the corresponding relationships between molecular structure and biological activity.


Subjects
Drug Discovery ; Machine Learning ; Thermodynamics
5.
Phys Chem Chem Phys ; 24(18): 10775-10783, 2022 May 11.
Article in English | MEDLINE | ID: mdl-35470831

ABSTRACT

Many molecular design tasks benefit from fast and accurate calculations of quantum-mechanical (QM) properties. However, the computational cost of QM methods applied to drug-like molecules currently renders large-scale applications of quantum chemistry challenging. Aiming to mitigate this problem, we developed DelFTa, an open-source toolbox for the prediction of electronic properties of drug-like molecules at the density functional (DFT) level of theory, using Δ-machine-learning. Δ-Learning corrects the prediction error (Δ) of a fast but inaccurate property calculation. DelFTa employs state-of-the-art three-dimensional message-passing neural networks trained on a large dataset of QM properties. It provides access to a wide array of quantum observables on the molecular, atomic and bond levels by predicting approximations to DFT values from a low-cost semiempirical baseline. Δ-Learning outperformed its direct-learning counterpart for most of the considered QM endpoints. The results suggest that predictions for non-covalent intra- and intermolecular interactions can be extrapolated to larger biomolecular systems. The software is fully open-sourced and features documented command-line and Python APIs.


Subjects
Chemistry, Pharmaceutical ; Quantum Theory ; Machine Learning ; Neural Networks, Computer ; Software
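
The Δ-learning scheme described in the abstract of record 5 (learn the error of a fast calculation, then add the learned correction back to it) can be sketched in a few lines. This is not the DelFTa API. Both "methods" below are made-up analytic functions standing in for a semiempirical baseline and a DFT reference, and the correction model is a one-variable least-squares line rather than a message-passing neural network.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def cheap_method(x):
    """Hypothetical fast but inaccurate calculation (baseline)."""
    return 2.0 * x


def reference_method(x):
    """Hypothetical slow but accurate calculation (learning target)."""
    return 2.5 * x + 1.0


# Train on the difference (delta) between reference and baseline.
train_x = [0.0, 1.0, 2.0, 3.0]
deltas = [reference_method(x) - cheap_method(x) for x in train_x]
slope, intercept = fit_line(train_x, deltas)


def delta_predict(x):
    """Baseline value plus the learned correction."""
    return cheap_method(x) + slope * x + intercept
```

At prediction time only the cheap calculation and the learned correction are evaluated, which is where the cost saving in this kind of scheme comes from.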
6.
J Chem Inf Model ; 62(2): 274-283, 2022 Jan 24.
Article in English | MEDLINE | ID: mdl-35019265

ABSTRACT

Feature attribution techniques are popular choices within the explainable artificial intelligence toolbox, as they can help elucidate which parts of the provided inputs used by an underlying supervised-learning method are considered relevant for a specific prediction. In the context of molecular design, these approaches typically involve the coloring of molecular graphs, whose presentation to medicinal chemists can be useful for deciding which compounds to synthesize or prioritize. The consistency of the highlighted moieties alongside expert background knowledge is expected to contribute to the understanding of machine-learning models in drug design. Quantitative evaluation of such coloring approaches, however, has so far been limited to substructure identification tasks. We here present an approach that is based on maximum common substructure algorithms applied to experimentally determined activity cliffs. Using the proposed benchmark, we found that molecule coloring approaches in conjunction with classical machine-learning models tend to outperform more modern, graph-neural-network alternatives. The provided benchmark data are fully open sourced, which we hope will facilitate the testing of newly developed molecular feature attribution techniques.


Subjects
Artificial Intelligence ; Benchmarking ; Algorithms ; Machine Learning ; Neural Networks, Computer
7.
J Chem Inf Model ; 62(2): 225-231, 2022 Jan 24.
Article in English | MEDLINE | ID: mdl-34978201

ABSTRACT

Deep learning has been successfully applied to structure-based protein-ligand affinity prediction, yet the black box nature of these models raises some questions. In a previous study, we presented KDEEP, a convolutional neural network that predicted the binding affinity of a given protein-ligand complex while reaching state-of-the-art performance. However, it was unclear what this model was learning. In this work, we present a new application to visualize the contribution of each input atom to the prediction made by the convolutional neural network, aiding in the interpretability of such predictions. The results suggest that KDEEP is able to learn meaningful chemistry signals from the data, but it has also exposed the inaccuracies of the current model, serving as a guideline for further optimization of our prediction tools.


Subjects
Neural Networks, Computer ; Proteins ; Ligands ; Proteins/chemistry
8.
Expert Opin Drug Discov ; 16(9): 949-959, 2021 Sep.
Article in English | MEDLINE | ID: mdl-33779453

ABSTRACT

Introduction: Artificial intelligence (AI) has inspired computer-aided drug discovery. The widespread adoption of machine learning, in particular deep learning, in multiple scientific disciplines, and the advances in computing hardware and software, among other factors, continue to fuel this development. Much of the initial skepticism regarding applications of AI in pharmaceutical discovery has started to vanish, consequently benefitting medicinal chemistry. Areas covered: The current status of AI in chemoinformatics is reviewed. The topics discussed herein include quantitative structure-activity/property relationship and structure-based modeling, de novo molecular design, and chemical synthesis prediction. Advantages and limitations of current deep learning applications are highlighted, together with a perspective on next-generation AI for drug discovery. Expert opinion: Deep learning-based approaches have only begun to address some fundamental problems in drug discovery. Certain methodological advances, such as message-passing models, spatial-symmetry-preserving networks, hybrid de novo design, and other innovative machine learning paradigms, will likely become commonplace and help address some of the most challenging questions. Open data sharing and model development will play a central role in the advancement of drug discovery with AI.


Subjects
Artificial Intelligence ; Drug Discovery ; Drug Design ; Humans ; Machine Learning ; Quantitative Structure-Activity Relationship
9.
J Chem Inf Model ; 61(3): 1083-1094, 2021 Mar 22.
Article in English | MEDLINE | ID: mdl-33629843

ABSTRACT

Graph neural networks are able to solve certain drug discovery tasks such as molecular property prediction and de novo molecule generation. However, these models are considered "black-box" and "hard-to-debug". This study aimed to improve modeling transparency for rational molecular design by applying the integrated gradients explainable artificial intelligence (XAI) approach for graph neural network models. Models were trained for predicting plasma protein binding, hERG channel inhibition, passive permeability, and cytochrome P450 inhibition. The proposed methodology highlighted molecular features and structural elements that are in agreement with known pharmacophore motifs, correctly identified property cliffs, and provided insights into unspecific ligand-target interactions. The developed XAI approach is fully open-sourced and can be used by practitioners to train new models on other clinically relevant endpoints.


Subjects
Artificial Intelligence ; Neural Networks, Computer ; Drug Discovery ; Ligands
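
The integrated gradients approach named in the abstract of record 9 attributes a prediction to each input feature by averaging the model gradient along a straight path from a baseline input to the actual input. The sketch below is only an illustration of that technique: it assumes a small logistic model with an analytic gradient in place of a graph neural network, and the weights, input and all-zero baseline are invented values.

```python
import math


def model(x, w, b):
    """Toy differentiable model: logistic function of a linear score."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def model_grad(x, w, b):
    """Analytic gradient of the toy model with respect to its input."""
    p = model(x, w, b)
    return [p * (1.0 - p) * wi for wi in w]


def integrated_gradients(x, baseline, w, b, steps=200):
    """Midpoint Riemann-sum approximation of integrated gradients."""
    avg_grad = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [bi + alpha * (xi - bi) for xi, bi in zip(x, baseline)]
        grad = model_grad(point, w, b)
        for i in range(len(x)):
            avg_grad[i] += grad[i] / steps
    # Scale the path-averaged gradient by the input difference.
    return [(xi - bi) * g for xi, bi, g in zip(x, baseline, avg_grad)]


# Hypothetical weights and input; the baseline is the all-zero vector.
weights, bias = [1.5, -2.0, 0.5], 0.1
features = [1.0, 0.5, 2.0]
reference = [0.0, 0.0, 0.0]
attributions = integrated_gradients(features, reference, weights, bias)
```

A useful sanity check is the completeness property: the attributions should sum to approximately the difference between the prediction at the input and the prediction at the baseline.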
10.
Molecules ; 25(11), 2020 May 27.
Article in English | MEDLINE | ID: mdl-32471211

ABSTRACT

While a plethora of different protein-ligand docking protocols have been developed over the past twenty years, their performances greatly depend on the provided input protein-ligand pair. In this study, we developed a machine-learning model that uses a combination of convolutional and fully connected neural networks for the task of predicting the performance of several popular docking protocols given a protein structure and a small compound. We also rigorously evaluated the performance of our model using a widely available database of protein-ligand complexes and different types of data splits. We further open-source all code related to this study so that potential users can make informed selections on which protocol is best suited for their particular protein-ligand pair.


Subjects
Deep Learning ; Machine Learning ; Cheminformatics ; Databases, Protein ; Molecular Docking Simulation
11.
Chem Sci ; 10(47): 10911-10918, 2019 Dec 21.
Article in English | MEDLINE | ID: mdl-32190246

ABSTRACT

The capability to rank different potential drug molecules against a protein target for potency has always been a fundamental challenge in computational chemistry due to its importance in drug design. While several simulation-based methodologies exist, they are hard to use prospectively, and thus predicting potency in lead optimization campaigns remains an open challenge. Here we present the first machine learning approach specifically tailored for ranking congeneric series based on deep 3D-convolutional neural networks. Furthermore, we prove its effectiveness by blindly testing it on datasets provided by Janssen, Pfizer and Biogen totalling over 3246 ligands and 13 targets as well as several well-known openly available sets, representing one of the largest evaluations ever performed. We also performed online learning simulations of lead optimization using the approach in a predictive manner, obtaining a significant advantage over experimental choice. We believe that the evaluation performed in this study is strong evidence of the usefulness of a modern deep learning model in lead optimization pipelines against more expensive simulation-based alternatives.
