Results 1 - 9 of 9
1.
J Chem Inf Model ; 64(7): 2488-2495, 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-38113513

ABSTRACT

Deep learning methods that predict protein-ligand binding have recently been used for structure-based virtual screening. Many such models have been trained using protein-ligand complexes with known crystal structures and activities from the PDBbind data set. However, because PDBbind only includes 20K complexes, models typically fail to generalize to new targets, and model performance is on par with models trained with only ligand information. Conversely, the ChEMBL database contains a wealth of chemical activity information but includes no information about binding poses. We introduce BigBind, a data set that maps ChEMBL activity data to proteins from the CrossDocked data set. BigBind comprises 583K ligand activities and includes 3D structures of the protein binding pockets. Additionally, we augmented the data by adding an equal number of putative inactives for each target. Using this data, we developed Banana (basic neural network for binding affinity), a neural network-based model to classify active from inactive compounds, defined by a 10 µM cutoff. Our model achieved an AUC of 0.72 on BigBind's test set, while a ligand-only model achieved an AUC of 0.59. Furthermore, Banana achieved competitive performance on the LIT-PCBA benchmark (median EF1% 1.81) while running 16,000 times faster than molecular docking with Gnina. We suggest that Banana, as well as other models trained on this data set, will significantly improve the outcomes of prospective virtual screening tasks.
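
The two screening metrics quoted in this abstract, AUC and the enrichment factor at 1% (EF1%), can be illustrated with a minimal sketch. This is not the authors' evaluation code; the function names, the "higher score means more likely active" convention, and the rounding of the top fraction are assumptions made for illustration.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random active outscores a random inactive (ties count half)."""
    actives = [s for s, y in zip(scores, labels) if y == 1]
    inactives = [s for s, y in zip(scores, labels) if y == 0]
    if not actives or not inactives:
        raise ValueError("need at least one active and one inactive")
    wins = 0.0
    for a in actives:
        for i in inactives:
            if a > i:
                wins += 1.0
            elif a == i:
                wins += 0.5
    return wins / (len(actives) * len(inactives))


def enrichment_factor(scores: Sequence[float], labels: Sequence[int],
                      fraction: float = 0.01) -> float:
    """Hit rate in the top-scoring fraction divided by the overall hit rate."""
    n_total = len(scores)
    total_hits = sum(labels)
    if n_total == 0 or total_hits == 0:
        raise ValueError("need a non-empty list containing at least one active")
    # Assumption: the top fraction is rounded to a whole number of compounds, minimum 1.
    n_top = max(1, int(round(n_total * fraction)))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top_hits = sum(y for _, y in ranked[:n_top])
    return (top_hits / n_top) / (total_hits / n_total)
```

An EF1% of 1.0 corresponds to random selection, so a median EF1% of 1.81 means the top 1% of ranked compounds contains about 1.8 times as many actives as a random 1% sample would.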


Subjects
Proteins , Ubiquitin-Protein Ligases , Molecular Docking Simulation , Ligands , Prospective Studies , Proteins/chemistry , Protein Binding , Ubiquitin-Protein Ligases/metabolism
2.
ACS Omega ; 8(44): 41680-41688, 2023 Nov 07.
Article in English | MEDLINE | ID: mdl-37970017

ABSTRACT

The success of machine learning is, in part, due to a large volume of data available to train models. However, the amount of training data for structure-based molecular property prediction remains limited. The previously described CrossDocked2020 data set expanded the available training data for binding pose classification in a molecular docking setting but did not address expanding the amount of receptor-ligand binding affinity data. We present experiments demonstrating that imputing binding affinity labels for complexes without experimentally determined binding affinities is a viable approach to expanding training data for structure-based models of receptor-ligand binding affinity. In particular, we demonstrate that utilizing imputed labels from a convolutional neural network trained only on the affinity data present in CrossDocked2020 results in a small improvement in the binding affinity regression performance, despite the additional sources of noise that such imputed labels add to the training data. The code, data splits, and imputation labels utilized in this paper are freely available at https://github.com/francoep/ImputationPaper.

3.
Exp Eye Res ; 213: 108861, 2021 12.
Article in English | MEDLINE | ID: mdl-34822853

ABSTRACT

Aberrant angiogenesis lies at the heart of a wide range of ocular pathologies such as proliferative diabetic retinopathy, wet age-related macular degeneration and retinopathy of prematurity. This study explores the anti-angiogenic activity of a novel small molecule investigative compound capable of inhibiting profilin1-actin interaction recently identified by our group. We demonstrate that our compound is capable of inhibiting migration, proliferation and angiogenic activity of microvascular endothelial cells in vitro as well as choroidal neovascularization (CNV) ex vivo. In a mouse model of laser-injury-induced CNV, intravitreal administration of this compound diminishes sub-retinal neovascularization. Finally, our preliminary structure-activity relationship (SAR) study demonstrates that this small molecule compound is amenable to improvement in biological activity through structural modifications.


Subjects
Angiogenesis Inhibitors/therapeutic use , Choroidal Neovascularization/drug therapy , Retinal Neovascularization/drug therapy , Actins/antagonists & inhibitors , Animals , Cell Line , Cell Movement/drug effects , Cell Proliferation/drug effects , Choroidal Neovascularization/metabolism , Disease Models, Animal , Endothelial Cells/drug effects , Humans , Intravitreal Injections , Mice , Mice, Inbred C57BL , Profilins/antagonists & inhibitors , Retinal Neovascularization/metabolism , Retinal Vessels/cytology , Vascular Endothelial Growth Factor A/antagonists & inhibitors , Wet Macular Degeneration/drug therapy , Wet Macular Degeneration/metabolism
5.
J Cheminform ; 13(1): 43, 2021 Jun 09.
Article in English | MEDLINE | ID: mdl-34108002

ABSTRACT

Molecular docking computationally predicts the conformation of a small molecule when binding to a receptor. Scoring functions are a vital piece of any molecular docking pipeline as they determine the fitness of sampled poses. Here we describe and evaluate the 1.0 release of the GNINA docking software, which utilizes an ensemble of convolutional neural networks (CNNs) as a scoring function. We also explore an array of parameter values for GNINA 1.0 to optimize docking performance and computational cost. Docking performance, as evaluated by the percentage of targets where the top pose is better than 2 Å root mean square deviation (Top1), is compared to AutoDock Vina scoring when utilizing explicitly defined binding pockets or whole protein docking. GNINA, utilizing a CNN scoring function to rescore the output poses, outperforms AutoDock Vina scoring on redocking and cross-docking tasks when the binding pocket is defined (Top1 increases from 58% to 73% and from 27% to 37%, respectively) and when the whole protein defines the binding pocket (Top1 increases from 31% to 38% and from 12% to 16%, respectively). The derived ensemble of CNNs generalizes to unseen proteins and ligands and produces scores that correlate well with the root mean square deviation to the known binding pose. We provide the 1.0 version of GNINA under an open source license for use as a molecular docking tool at https://github.com/gnina/gnina.
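
The Top1 metric described in this abstract can be sketched as follows. This is an illustrative reading, not code from the GNINA repository: the input layout (a mapping from target name to `(score, rmsd)` pairs), the function name, and the convention that a higher score is better are all assumptions. For a scoring function where lower is better, such as a Vina energy, the sign of the score would need to be flipped first.

```python
from typing import Dict, List, Tuple


def top1_success(poses_by_target: Dict[str, List[Tuple[float, float]]],
                 rmsd_threshold: float = 2.0) -> float:
    """Percentage of targets whose best-scored pose lies under the RMSD threshold.

    Each pose is a (score, rmsd_to_known_pose) pair; higher score = better (assumed).
    """
    if not poses_by_target:
        raise ValueError("no targets given")
    successes = 0
    for poses in poses_by_target.values():
        if not poses:
            continue  # a target with no sampled poses counts as a failure
        top_pose = max(poses, key=lambda pose: pose[0])
        # "Better than 2 Å" is read here as strictly less than the threshold.
        if top_pose[1] < rmsd_threshold:
            successes += 1
    return 100.0 * successes / len(poses_by_target)
```

The metric counts targets rather than poses, so a target with many sampled poses carries the same weight as one with few.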

6.
J Chem Inf Model ; 61(6): 2530-2536, 2021 06 28.
Article in English | MEDLINE | ID: mdl-34038123

ABSTRACT

While accurate prediction of aqueous solubility remains a challenge in drug discovery, machine learning (ML) approaches have become increasingly popular for this task. For instance, in the Second Challenge to Predict Aqueous Solubility (SC2), all groups utilized machine learning methods in their submissions. We present SolTranNet, a molecule attention transformer to predict aqueous solubility from a molecule's SMILES representation. Atypically, we demonstrate that larger models perform worse at this task, with SolTranNet's final architecture having 3,393 parameters while outperforming linear ML approaches. SolTranNet has a 3-fold scaffold split cross-validation root-mean-square error (RMSE) of 1.459 on AqSolDB and an RMSE of 1.711 on a withheld test set. We also demonstrate that, when used as a classifier to filter out insoluble compounds, SolTranNet achieves a sensitivity of 94.8% on the SC2 data set and is competitive with the other methods submitted to the competition. SolTranNet is distributed via pip, and its source code is available at https://github.com/gnina/SolTranNet.
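
Using a solubility regressor as a classifier, as this abstract describes, amounts to thresholding the predicted value and measuring sensitivity, TP / (TP + FN). The sketch below is only illustrative: the cutoff of -4.0 log units and the choice of "soluble" as the positive class are assumptions, not values taken from the abstract, and the function does not call SolTranNet itself.

```python
from typing import Sequence


def sensitivity(predicted_logs: Sequence[float], measured_logs: Sequence[float],
                cutoff: float = -4.0) -> float:
    """Fraction of truly soluble compounds that the model also calls soluble.

    A compound counts as soluble when its log solubility exceeds `cutoff`
    (hypothetical threshold chosen for illustration).
    """
    true_pos = 0
    false_neg = 0
    for pred, meas in zip(predicted_logs, measured_logs):
        if meas > cutoff:  # truly soluble
            if pred > cutoff:
                true_pos += 1
            else:
                false_neg += 1
    if true_pos + false_neg == 0:
        raise ValueError("no soluble compounds in the measured data")
    return true_pos / (true_pos + false_neg)
```

High sensitivity under this convention means few soluble compounds are wrongly discarded by the filter; it says nothing about how many insoluble compounds slip through.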


Subjects
Machine Learning , Water , Software , Solubility
7.
J Chem Inf Model ; 60(9): 4200-4215, 2020 09 28.
Article in English | MEDLINE | ID: mdl-32865404

ABSTRACT

One of the main challenges in drug discovery is predicting protein-ligand binding affinity. Recently, machine learning approaches have made substantial progress on this task. However, current methods of model evaluation are overly optimistic in measuring generalization to new targets, and there does not exist a standard data set of sufficient size to compare performance between models. We present a new data set for structure-based machine learning, the CrossDocked2020 set, with 22.5 million poses of ligands docked into multiple similar binding pockets across the Protein Data Bank, and perform a comprehensive evaluation of grid-based convolutional neural network (CNN) models on this data set. We also demonstrate how the partitioning of the training data and test data can impact the results of models trained with the PDBbind data set, how performance improves by adding more lower-quality training data, and how training with docked poses imparts pose sensitivity to the predicted affinity of a complex. Our best performing model, an ensemble of five densely connected CNNs, achieves a root mean squared error of 1.42 and Pearson R of 0.612 on the affinity prediction task, an AUC of 0.956 at binding pose classification, and a 68.4% accuracy at pose selection on the CrossDocked2020 set. By providing data splits for clustered cross-validation and the raw data for the CrossDocked2020 set, we establish the first standardized data set for training machine learning models to recognize ligands in noncognate target structures while also greatly expanding the number of poses available for training. In order to facilitate community adoption of this data set for benchmarking protein-ligand binding affinity prediction, we provide our models, weights, and the CrossDocked2020 set at https://github.com/gnina/models.
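
The affinity-prediction figures in this abstract (root mean squared error and Pearson R) are standard regression metrics. A minimal pure-Python sketch follows; the function names are chosen for illustration and this is not the evaluation code from the gnina/models repository.

```python
import math
from typing import Sequence


def rmse(predicted: Sequence[float], measured: Sequence[float]) -> float:
    """Root mean squared error between predicted and measured affinities."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient: covariance over the product of spreads."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

The two metrics are complementary: RMSE penalizes absolute error in the predicted affinity, while Pearson R only measures how well predictions track the measured values linearly, so a model can score well on one and poorly on the other.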


Subjects
Drug Design , Neural Networks, Computer , Databases, Protein , Ligands , Protein Binding
8.
J Biol Chem ; 295(46): 15636-15649, 2020 11 13.
Article in English | MEDLINE | ID: mdl-32883810

ABSTRACT

Clear-cell renal cell carcinoma (ccRCC), the most common subtype of renal cancer, has a poor clinical outcome. A hallmark of ccRCC is genetic loss-of-function of VHL (von Hippel-Lindau) that leads to a highly vascularized tumor microenvironment. Although many ccRCC patients initially respond to antiangiogenic therapies, virtually all develop progressive, drug-refractory disease. Given the role of dysregulated expression of cytoskeletal and cytoskeleton-regulatory proteins in tumor progression, we performed analyses of The Cancer Genome Atlas (TCGA) transcriptome data for different classes of actin-binding proteins to demonstrate that increased mRNA expression of profilin1 (Pfn1), Arp3, cofilin1, Ena/VASP, and CapZ is an indicator of poor prognosis in ccRCC. Focusing further on Pfn1, we performed immunohistochemistry-based classification of Pfn1 staining in tissue microarrays, which indicated Pfn1 positivity in both tumor and stromal cells; however, the vast majority of ccRCC tumors tend to be Pfn1-positive selectively in stromal cells only. This finding is further supported by evidence for dramatic transcriptional up-regulation of Pfn1 in tumor-associated vascular endothelial cells in the clinical specimens of ccRCC. In vitro studies support the importance of Pfn1 in proliferation and migration of RCC cells and soluble Pfn1's involvement in vascular endothelial cell-tumor cell cross-talk. Furthermore, proof-of-concept studies demonstrate that treatment with a novel computationally designed Pfn1-actin interaction inhibitor identified herein reduces proliferation and migration of RCC cells in vitro and RCC tumor growth in vivo. Based on these findings, we propose a potentiating role for Pfn1 in promoting tumor cell aggressiveness in the setting of ccRCC.


Subjects
Carcinoma, Renal Cell/pathology , Kidney Neoplasms/pathology , Profilins/metabolism , Actins/antagonists & inhibitors , Actins/metabolism , Animals , CapZ Actin Capping Protein/genetics , CapZ Actin Capping Protein/metabolism , Carcinoma, Renal Cell/metabolism , Cell Line, Tumor , Cell Movement , Cell Proliferation , Cofilin 1/genetics , Cofilin 1/metabolism , Databases, Genetic , Endothelial Cells/cytology , Endothelial Cells/metabolism , Humans , Kidney Neoplasms/metabolism , Mice , Mice, Inbred BALB C , Profilins/antagonists & inhibitors , Profilins/genetics , Prognosis , RNA Interference , RNA, Small Interfering/metabolism , Tumor Microenvironment , Up-Regulation
9.
J Comput Aided Mol Des ; 33(1): 19-34, 2019 01.
Article in English | MEDLINE | ID: mdl-29992528

ABSTRACT

We assess the ability of our convolutional neural network (CNN)-based scoring functions to perform several common tasks in the domain of drug discovery. These include correctly identifying ligand poses near and far from the true binding mode when given a set of reference receptors and classifying ligands as active or inactive using structural information. We use the CNN to re-score or refine poses generated using a conventional scoring function, AutoDock Vina, and compare the performance of each of these methods to using the conventional scoring function alone. Furthermore, we assess several ways of choosing appropriate reference receptors in the context of the D3R 2017 community benchmarking challenge. We find that our CNN scoring function outperforms Vina on most tasks without requiring manual inspection by a knowledgeable operator, but that the pose prediction target chosen for the challenge, Cathepsin S, was particularly challenging for de novo docking. However, the CNN provided best-in-class performance on several virtual screening tasks, underscoring the relevance of deep learning to the field of drug discovery.


Subjects
Cathepsins/chemistry , Molecular Docking Simulation , Neural Networks, Computer , Algorithms , Binding Sites , Databases, Protein , Drug Discovery/methods , Ligands , Protein Binding , Protein Conformation , Structure-Activity Relationship