Results 1 - 5 of 5
1.
ACS Omega ; 9(7): 7471-7479, 2024 Feb 20.
Article in English | MEDLINE | ID: mdl-38405499

ABSTRACT

Computational prediction of molecule-protein interactions has been key for developing new molecules to interact with a target protein for therapeutics development. Previous work includes two independent streams of approaches: (1) predicting protein-protein interactions (PPIs) between naturally occurring proteins and (2) predicting binding affinities between proteins and small-molecule ligands [also known as drug-target interaction (DTI)]. Studying the two problems in isolation has limited the ability of these computational models to generalize across the PPI and DTI tasks, both of which ultimately involve noncovalent interactions with a protein target. In this work, we developed Equivariant Graph of Graphs neural Network (EGGNet), a geometric deep learning (GDL) framework, for molecule-protein binding predictions that can handle three types of molecules for interacting with a target protein: (1) small molecules, (2) synthetic peptides, and (3) natural proteins. EGGNet leverages a graph of graphs (GoG) representation constructed from the molecular structures at atomic resolution and utilizes a multiresolution equivariant graph neural network to learn from such representations. In addition, EGGNet leverages the underlying biophysics and makes use of both atom- and residue-level interactions, which improve EGGNet's ability to rank candidate poses from blind docking. EGGNet achieves competitive performance on both a public protein-small-molecule binding affinity prediction task (80.2% top 1 success rate on CASF-2016) and a synthetic protein interface prediction task (88.4% area under the precision-recall curve). We envision that the proposed GDL framework can generalize to many other protein interaction prediction problems, such as binding site prediction and molecular docking, helping accelerate protein engineering and structure-based drug development.
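
As a rough illustration of how a top-1 success rate for pose ranking might be computed, the sketch below assumes a convention common in docking benchmarks: a complex counts as a success when its highest-scored pose lies within 2.0 Å RMSD of the native pose. The function name, the 2.0 Å cutoff, the "higher score is better" convention, and the toy data are all assumptions for illustration; none of them is taken from the EGGNet paper.

```python
from typing import Dict, List, Tuple

# Each pose is (predicted_score, rmsd_to_native_in_angstroms).
# Assumption: a higher score means a more favorable predicted pose.
Pose = Tuple[float, float]


def top1_success_rate(complexes: Dict[str, List[Pose]], rmsd_cutoff: float = 2.0) -> float:
    """Fraction of complexes whose best-scored pose is within rmsd_cutoff of native."""
    if not complexes:
        raise ValueError("no complexes given")
    successes = 0
    for poses in complexes.values():
        if not poses:
            continue  # a complex with no candidate poses counts as a failure
        _, best_rmsd = max(poses, key=lambda pose: pose[0])
        if best_rmsd <= rmsd_cutoff:
            successes += 1
    return successes / len(complexes)


# Toy data (hypothetical): two complexes, three candidate poses each.
toy = {
    "complex_a": [(0.9, 1.2), (0.5, 6.0), (0.1, 8.3)],  # top-scored pose is near-native
    "complex_b": [(0.8, 5.5), (0.7, 1.0), (0.2, 9.1)],  # top-scored pose is a decoy
}
rate = top1_success_rate(toy)
```

In the toy data only `complex_a` has a near-native pose ranked first, so one of the two complexes counts as a success.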

2.
Sci Rep ; 12(1): 6832, 2022 04 27.
Article in English | MEDLINE | ID: mdl-35477726

ABSTRACT

Proteins perform many essential functions in biological systems and can be successfully developed as bio-therapeutics. It is invaluable to be able to predict their properties based on a proposed sequence and structure. In this study, we developed a novel generalizable deep learning framework, LM-GVP, composed of a protein Language Model (LM) and Graph Neural Network (GNN) to leverage information from both 1D amino acid sequences and 3D structures of proteins. Our approach outperformed the state-of-the-art protein LMs on a variety of property prediction tasks including fluorescence, protease stability, and protein functions from Gene Ontology (GO). We also illustrated insights into how a GNN prediction head can inform the fine-tuning of protein LMs to better leverage structural information. We envision that our deep learning framework will be generalizable to many protein property prediction problems to greatly accelerate protein engineering and drug development.


Subject(s)
Deep Learning , Amino Acid Sequence , Language , Neural Networks, Computer , Proteins/chemistry
3.
J Chem Inf Model ; 59(3): 1005-1016, 2019 03 25.
Article in English | MEDLINE | ID: mdl-30586300

ABSTRACT

Deep learning has drawn significant attention in different areas including drug discovery. It has been proposed that it could outperform other machine learning algorithms, especially with large data sets. In the pharmaceutical industry, machine learning models are built to understand quantitative structure-activity relationships (QSARs) and predict molecular activities, including absorption, distribution, metabolism, and excretion (ADME) properties, using only molecular structures. Previous reports have demonstrated the advantages of using deep neural networks (DNNs) for QSAR modeling. One of the challenges in building DNN models is identifying the hyperparameters that lead to better generalization. In this study, we investigated several tunable hyperparameters of deep neural network models on 24 industrial ADME data sets. We analyzed the sensitivity and influence of five hyperparameters: the learning rate, weight decay for L2 regularization, dropout rate, activation function, and the use of batch normalization. This paper focuses on strategies and practices for DNN model building. Further, the optimized model for each data set was built and compared with the benchmark models used in production. Based on our benchmarking results, we propose several practices for building DNN QSAR models.


Subject(s)
Deep Learning , Drug Discovery/methods , Absorption, Physicochemical , Pharmaceutical Preparations/chemistry , Pharmaceutical Preparations/metabolism , Quantitative Structure-Activity Relationship
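
For readers who want a concrete picture of the five hyperparameters named in abstract 3, the sketch below enumerates a small search grid over them. The candidate values, the dictionary keys, and the use of an exhaustive grid are assumptions made for illustration; the abstract does not state which values or search strategy the authors used.

```python
from itertools import product

# Hypothetical search space over the five hyperparameters named in the abstract.
# All candidate values below are illustrative assumptions.
search_space = {
    "learning_rate": [1e-4, 1e-3, 1e-2],
    "weight_decay": [0.0, 1e-5, 1e-4],  # L2 regularization strength
    "dropout_rate": [0.0, 0.25, 0.5],
    "activation": ["relu", "sigmoid"],
    "batch_norm": [False, True],
}


def iter_configs(space):
    """Yield one dict per combination of hyperparameter values."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


configs = list(iter_configs(search_space))
```

Each entry in `configs` could then be passed to a model-building routine of the reader's choice. With the values above the grid holds 3 x 3 x 3 x 2 x 2 = 108 configurations, which shows how quickly an exhaustive search grows.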
4.
J Chem Inf Model ; 58(5): 1021-1036, 2018 05 29.
Article in English | MEDLINE | ID: mdl-29641200

ABSTRACT

Partial covalent interactions (PCIs) in proteins, which include hydrogen bonds, salt bridges, cation-π, and π-π interactions, contribute to thermodynamic stability and facilitate interactions with other biomolecules. Several score functions have been developed within the Rosetta protein modeling framework that identify and evaluate these PCIs by analyzing the geometry between participating atoms. However, we hypothesize that PCIs can be unified through a simplified electron orbital representation. To test this hypothesis, we have introduced orbital-based chemical descriptors for PCIs into Rosetta, called the PCI score function. Optimal geometries for the PCIs are derived from a statistical analysis of high-quality protein structures obtained from the Protein Data Bank (PDB), and the relative orientation of electron-deficient hydrogen atoms and electron-rich lone pair or π orbitals is evaluated. We demonstrate that native-like geometries of hydrogen bonds, salt bridges, cation-π, and π-π interactions are recapitulated during minimization of protein conformation. Relative to the standard score function, the packing density of the tested protein structures increased from 0.62 to 0.64, closer to the native value of 0.70. Overall, rotamer recovery improved slightly when using the PCI score function (75%) as compared to the standard Rosetta score function (74%). The PCI score function represents an improvement over the standard Rosetta score function for protein model scoring; in addition, it provides a platform for future directions in the analysis of small molecule to protein interactions, which depend on partial covalent interactions.


Subject(s)
Models, Molecular , Proteins/chemistry , Crystallography, X-Ray , Databases, Protein , Electrons , Hydrogen Bonding , Protein Conformation , Rotation
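
Abstract 4 reports rotamer recovery as a percentage. A minimal sketch of one way such a metric can be computed follows, assuming a residue counts as recovered when every side-chain chi angle of the model lies within a fixed tolerance of the native value. The 20 degree tolerance, the function names, and the toy angles are assumptions for illustration; the exact criterion used by the authors or by Rosetta may differ.

```python
from typing import List, Sequence


def angle_diff(a: float, b: float) -> float:
    """Smallest absolute difference between two angles in degrees (0 to 180)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def rotamer_recovered(native_chi: Sequence[float], model_chi: Sequence[float],
                      tol: float = 20.0) -> bool:
    """True if every chi angle of the model is within tol degrees of native."""
    if len(native_chi) != len(model_chi):
        raise ValueError("chi angle lists must have the same length")
    return all(angle_diff(n, m) <= tol for n, m in zip(native_chi, model_chi))


def rotamer_recovery(native: List[Sequence[float]], model: List[Sequence[float]],
                     tol: float = 20.0) -> float:
    """Fraction of residues whose rotamer is recovered."""
    if not native or len(native) != len(model):
        raise ValueError("need two non-empty residue lists of equal length")
    hits = sum(rotamer_recovered(n, m, tol) for n, m in zip(native, model))
    return hits / len(native)


# Toy chi angles in degrees (hypothetical), one inner list per residue.
native = [[-65.0, 175.0], [60.0], [-179.0, 70.0], [-60.0]]
model = [[-70.0, 170.0], [65.0], [178.0, 75.0], [180.0]]
recovery = rotamer_recovery(native, model)
```

The wraparound in `angle_diff` matters for angles near ±180 degrees: -179 and 178 differ by 3 degrees, not 357. In the toy data three of the four residues are recovered.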
5.
Nat Protoc ; 8(7): 1277-98, 2013.
Article in English | MEDLINE | ID: mdl-23744289

ABSTRACT

Structure-based drug design is frequently used to accelerate the development of small-molecule therapeutics. Although substantial progress has been made in X-ray crystallography and nuclear magnetic resonance (NMR) spectroscopy, the availability of high-resolution structures is limited owing to the frequent inability to crystallize or obtain sufficient NMR restraints for large or flexible proteins. Computational methods can be used to both predict unknown protein structures and model ligand interactions when experimental data are unavailable. This paper describes a comprehensive and detailed protocol using the Rosetta modeling suite to dock small-molecule ligands into comparative models. In the protocol presented here, we review the comparative modeling process, including sequence alignment, threading and loop building. Next, we cover docking a small-molecule ligand into the protein comparative model. In addition, we discuss criteria that can improve ligand docking into comparative models. Finally, and importantly, we present a strategy for assessing model quality. The entire protocol is presented on a single example selected solely for didactic purposes. The results are therefore not representative and do not replace benchmarks published elsewhere. We also provide an additional tutorial so that the user can gain hands-on experience in using Rosetta. The protocol should take 5-7 h, with additional time allocated for computer generation of models.


Subject(s)
Models, Molecular , Molecular Docking Simulation , Protein Conformation , Drug Design , Ligands , Sequence Alignment/methods , Software , User-Computer Interface