Search | VHL Regional Portal

1.

EquiPNAS: improved protein-nucleic acid binding site prediction using protein-language-model-informed equivariant deep graph neural networks.

Roche, Rahmatullah; Moussad, Bernard; Shuvo, Md Hossain; Tarafder, Sumit; Bhattacharya, Debswapna.

Nucleic Acids Res ; 52(5): e27, 2024 Mar 21.

Article in English | MEDLINE | ID: mdl-38281252

ABSTRACT

Protein language models (pLMs) trained on a large corpus of protein sequences have shown unprecedented scalability and broad generalizability in a wide range of predictive modeling tasks, but their power has not yet been harnessed for predicting protein-nucleic acid binding sites, critical for characterizing the interactions between proteins and nucleic acids. Here, we present EquiPNAS, a new pLM-informed E(3) equivariant deep graph neural network framework for improved protein-nucleic acid binding site prediction. By combining the strengths of pLM and symmetry-aware deep graph learning, EquiPNAS consistently outperforms the state-of-the-art methods for both protein-DNA and protein-RNA binding site prediction on multiple datasets across a diverse set of predictive modeling scenarios ranging from using experimental input to AlphaFold2 predictions. Our ablation study reveals that the pLM embeddings used in EquiPNAS are sufficiently powerful to dramatically reduce the dependence on the availability of evolutionary information without compromising on accuracy, and that the symmetry-aware nature of the E(3) equivariant graph-based neural architecture offers remarkable robustness and performance resilience. EquiPNAS is freely available at https://github.com/Bhattacharya-Lab/EquiPNAS.

Subject(s)

Neural Networks, Computer , Nucleic Acids , Proteins , Amino Acid Sequence , Binding Sites , Nucleic Acids/chemistry , Proteins/chemistry

2.

EquiPNAS: improved protein-nucleic acid binding site prediction using protein-language-model-informed equivariant deep graph neural networks.

Roche, Rahmatullah; Moussad, Bernard; Shuvo, Md Hossain; Tarafder, Sumit; Bhattacharya, Debswapna.

bioRxiv ; 2023 Sep 16.

Article in English | MEDLINE | ID: mdl-37745556

ABSTRACT

Protein language models (pLMs) trained on a large corpus of protein sequences have shown unprecedented scalability and broad generalizability in a wide range of predictive modeling tasks, but their power has not yet been harnessed for predicting protein-nucleic acid binding sites, critical for characterizing the interactions between proteins and nucleic acids. Here we present EquiPNAS, a new pLM-informed E(3) equivariant deep graph neural network framework for improved protein-nucleic acid binding site prediction. By combining the strengths of pLM and symmetry-aware deep graph learning, EquiPNAS consistently outperforms the state-of-the-art methods for both protein-DNA and protein-RNA binding site prediction on multiple datasets across a diverse set of predictive modeling scenarios ranging from using experimental input to AlphaFold2 predictions. Our ablation study reveals that the pLM embeddings used in EquiPNAS are sufficiently powerful to dramatically reduce the dependence on the availability of evolutionary information without compromising on accuracy, and that the symmetry-aware nature of the E(3) equivariant graph-based neural architecture offers remarkable robustness and performance resilience. EquiPNAS is freely available at https://github.com/Bhattacharya-Lab/EquiPNAS.

3.

E(3) equivariant graph neural networks for robust and accurate protein-protein interaction site prediction.

Roche, Rahmatullah; Moussad, Bernard; Shuvo, Md Hossain; Bhattacharya, Debswapna.

PLoS Comput Biol ; 19(8): e1011435, 2023 08.

Article in English | MEDLINE | ID: mdl-37651442

ABSTRACT

Artificial intelligence-powered protein structure prediction methods have led to a paradigm shift in computational structural biology, yet contemporary approaches for predicting the interfacial residues (i.e., sites) of protein-protein interaction (PPI) still rely on experimental structures. Recent studies have demonstrated benefits of employing graph convolution for PPI site prediction, but ignore symmetries naturally occurring in 3-dimensional space and act only on experimental coordinates. Here we present EquiPPIS, an E(3) equivariant graph neural network approach for PPI site prediction. EquiPPIS employs symmetry-aware graph convolutions that transform equivariantly with translation, rotation, and reflection in 3D space, providing richer representations for molecular data compared to invariant convolutions. EquiPPIS substantially outperforms state-of-the-art approaches based on the same experimental input, and exhibits remarkable robustness by attaining better accuracy with predicted structural models from AlphaFold2 than what existing methods can achieve even with experimental structures. Freely available at https://github.com/Bhattacharya-Lab/EquiPPIS, EquiPPIS enables accurate PPI site prediction at scale.

Subject(s)

Artificial Intelligence , Neural Networks, Computer , Computational Biology , Rotation , Software
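
The EquiPPIS abstract above describes convolutions that transform equivariantly under translation, rotation, and reflection. The following is a minimal sketch of that property only, not the EquiPPIS implementation: the function name `coordinate_update`, the scalar weight `w`, and the specific edge-weight formula are hypothetical choices made for illustration. The update moves each node along relative position vectors scaled by weights that depend only on invariant quantities (node features and squared distances), which is what makes the output rotate, reflect, and translate together with the input.

```python
import numpy as np

def coordinate_update(x, h, w=0.1):
    """Toy E(3)-equivariant coordinate update (illustrative assumption, not EquiPPIS).

    x: (n, 3) node coordinates; h: (n,) scalar node features; w: hypothetical weight.
    """
    diff = x[:, None, :] - x[None, :, :]          # (n, n, 3) relative positions
    dist2 = (diff ** 2).sum(axis=-1)              # (n, n) squared distances, invariant
    # Edge weights are built from invariant inputs only.
    weight = np.tanh(w * (h[:, None] + h[None, :]) - dist2)
    np.fill_diagonal(weight, 0.0)                 # no self-interaction
    # Move each node along weighted relative vectors, averaged over neighbours.
    return x + (diff * weight[..., None]).sum(axis=1) / (len(x) - 1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(5, 3))
    h = rng.normal(size=5)
    # QR of a random matrix gives an orthogonal Q (a rotation or a reflection).
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    t = rng.normal(size=3)
    transformed_first = coordinate_update(x @ q.T + t, h)
    transformed_after = coordinate_update(x, h) @ q.T + t
    print(np.allclose(transformed_first, transformed_after))
```

Applying the orthogonal transform and translation before or after the update gives the same coordinates, which is the equivariance condition the abstract refers to.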

4.

PIQLE: protein-protein interface quality estimation by deep graph learning of multimeric interaction geometries.

Shuvo, Md Hossain; Karim, Mohimenul; Roche, Rahmatullah; Bhattacharya, Debswapna.

Bioinform Adv ; 3(1): vbad070, 2023.

Article in English | MEDLINE | ID: mdl-37351310

ABSTRACT

Motivation: Accurate modeling of protein-protein interaction interfaces is essential for high-quality protein complex structure prediction. Existing approaches for estimating the quality of a predicted protein complex structural model utilize only the physicochemical properties or energetic contributions of the interacting atoms, ignoring evolutionary information or inter-atomic multimeric geometries, including interaction distance and orientations. Results: Here, we present PIQLE, a deep graph learning method for protein-protein interface quality estimation. PIQLE leverages multimeric interaction geometries and evolutionary information along with sequence- and structure-derived features to estimate the quality of individual interactions between the interfacial residues using a multi-head graph attention network and then probabilistically combines the estimated quality for scoring the overall interface. Experimental results show that PIQLE consistently outperforms existing state-of-the-art methods including DProQA, TRScore, GNN-DOVE and DOVE on multiple independent test datasets across a wide range of evaluation metrics. Our ablation study and comparison with the self-assessment module of AlphaFold-Multimer repurposed for protein complex scoring reveal that the performance gains are connected to the effectiveness of the multi-head graph attention network in leveraging multimeric interaction geometries and evolutionary information along with other sequence- and structure-derived features adopted in PIQLE. Availability and implementation: An open-source software implementation of PIQLE is freely available at https://github.com/Bhattacharya-Lab/PIQLE. Supplementary information: Supplementary data are available at Bioinformatics Advances online.

5.

iQDeep: an integrated web server for protein scoring using multiscale deep learning models.

Shuvo, Md Hossain; Karim, Mohimenul; Bhattacharya, Debswapna.

J Mol Biol ; 435(14): 168057, 2023 07 15.

Article in English | MEDLINE | ID: mdl-37356909

ABSTRACT

The remarkable recent advances in protein structure prediction have enabled computational modeling of protein structures with considerably higher accuracy than ever before. While state-of-the-art structure prediction methods provide self-assessment confidence scores of their own predictions, an independent and open-access system for protein scoring is still needed that can be applied to a broad range of predictive modeling scenarios. Here, we present iQDeep, an integrated and highly customizable web server for protein scoring, freely available at http://fusion.cs.vt.edu/iQDeep. The underlying method of iQDeep employs multiscale deep residual neural networks (ResNets) to perform residue-level error classifications, and then probabilistically combines the error classifications for protein scoring. By adjusting the error resolutions, our method can reliably estimate the standard- or high-accuracy variants of the Global Distance Test metric for versatile protein scoring. The performance of the method has been extensively tested and compared against the state-of-the-art approaches in multiple rounds of Critical Assessment of Techniques for Protein Structure Prediction (CASP) experiments including benchmark assessment in CASP12 and CASP13 as well as blind evaluation in CASP14. The iQDeep web server offers a number of convenient features, including (i) the choice of individual and batch processing modes; (ii) an interactive and privacy-preserving web interface for automated job submission, tracking, and results retrieval; (iii) web-based quantitative and visual analyses of the results including overall estimated score and its residue-wise breakdown along with agreements between various sequence- and structural-level features; (iv) extensive help information on job submission and results interpretation via web-based tutorial and help tooltips.

Subject(s)

Deep Learning , Protein Conformation , Software , Computational Biology/methods , Proteins/chemistry , Sequence Analysis, Protein/methods

6.

Contact-Assisted Threading in Low-Homology Protein Modeling.

Bhattacharya, Sutanu; Roche, Rahmatullah; Shuvo, Md Hossain; Moussad, Bernard; Bhattacharya, Debswapna.

Methods Mol Biol ; 2627: 41-59, 2023.

Article in English | MEDLINE | ID: mdl-36959441

ABSTRACT

The ability to successfully predict the three-dimensional structure of a protein from its amino acid sequence has advanced considerably in the recent past. The progress is propelled by the improved accuracy of deep learning-based inter-residue contact map predictors coupled with the rising growth of protein sequence databases. A contact map encodes interatomic interaction information that can be exploited for highly accurate prediction of protein structures via contact map threading, even for query proteins that are not amenable to direct homology modeling. As such, contact-assisted threading has garnered considerable research effort. In this chapter, we provide an overview of existing contact-assisted threading methods while highlighting the recent advances and discussing some of the current limitations and future prospects in the application of contact-assisted threading for improving the accuracy of low-homology protein modeling.

Subject(s)

Algorithms , Sequence Analysis, Protein , Sequence Analysis, Protein/methods , Proteins/chemistry , Software , Amino Acid Sequence , Databases, Protein , Protein Conformation , Protein Folding

7.

PIQLE: protein-protein interface quality estimation by deep graph learning of multimeric interaction geometries.

Shuvo, Md Hossain; Karim, Mohimenul; Roche, Rahmatullah; Bhattacharya, Debswapna.

bioRxiv ; 2023 Feb 15.

Article in English | MEDLINE | ID: mdl-36824789

ABSTRACT

Accurate modeling of protein-protein interaction interfaces is essential for high-quality protein complex structure prediction. Existing approaches for estimating the quality of a predicted protein complex structural model utilize only the physicochemical properties or energetic contributions of the interacting atoms, ignoring evolutionary information or inter-atomic multimeric geometries, including interaction distance and orientations. Here we present PIQLE, a deep graph learning method for protein-protein interface quality estimation. PIQLE leverages multimeric interaction geometries and evolutionary information along with sequence- and structure-derived features to estimate the quality of the individual interactions between the interfacial residues using a multihead graph attention network and then probabilistically combines the estimated quality of the interfacial residues for scoring the overall interface. Experimental results show that PIQLE consistently outperforms existing state-of-the-art methods on multiple independent test datasets across a wide range of evaluation metrics. Our ablation study reveals that the performance gains are connected to the effectiveness of the multihead graph attention network in leveraging multimeric interaction geometries and evolutionary information along with other sequence- and structure-derived features adopted in PIQLE. An open-source software implementation of PIQLE, licensed under the GNU General Public License v3, is freely available at https://github.com/Bhattacharya-Lab/PIQLE.

8.

rrQNet: Protein contact map quality estimation by deep evolutionary reconciliation.

Roche, Rahmatullah; Bhattacharya, Sutanu; Shuvo, Md Hossain; Bhattacharya, Debswapna.

Proteins ; 90(12): 2023-2034, 2022 12.

Article in English | MEDLINE | ID: mdl-35751651

ABSTRACT

Protein contact maps have proven to be a valuable tool in the deep learning revolution of protein structure prediction, ushering in the recent breakthrough by AlphaFold2. However, self-assessment of the quality of predicted structures is typically performed at the granularity of three-dimensional coordinates as opposed to directly exploiting the rotation- and translation-invariant two-dimensional (2D) contact maps. Here, we present rrQNet, a deep learning method for self-assessment in 2D by contact map quality estimation. Our approach is based on the intuition that for a contact map to be of high quality, the residue pairs predicted to be in contact should be mutually consistent with the evolutionary context of the protein. The deep neural network architecture of rrQNet implements this intuition by cascading two deep modules, one encoding the evolutionary context and the other performing evolutionary reconciliation. The penultimate stage of rrQNet estimates the quality scores at the interacting residue-pair level, which are then aggregated for estimating the quality of a contact map. This design choice offers versatility at varied resolutions from individual residue pairs to full-fledged contact maps. Trained on multiple complementary sources of contact predictors, rrQNet facilitates generalizability across various contact maps. By rigorously testing using publicly available datasets and comparing against several in-house baseline approaches, we show that rrQNet accurately reproduces the true quality score of a predicted contact map and successfully distinguishes between accurate and inaccurate contact maps predicted by a wide variety of contact predictors. The open-source rrQNet software package is freely available at https://github.com/Bhattacharya-Lab/rrQNet.

Subject(s)

Computational Biology , Proteins , Computational Biology/methods , Proteins/chemistry , Neural Networks, Computer , Software , Biological Evolution

9.

Recent Advances in Protein Homology Detection Propelled by Inter-Residue Interaction Map Threading.

Bhattacharya, Sutanu; Roche, Rahmatullah; Shuvo, Md Hossain; Bhattacharya, Debswapna.

Front Mol Biosci ; 8: 643752, 2021.

Article in English | MEDLINE | ID: mdl-34046429

ABSTRACT

Sequence-based protein homology detection has emerged as one of the most sensitive and accurate approaches to protein structure prediction. Despite the success, homology detection remains very challenging for weakly homologous proteins with divergent evolutionary profiles. Very recently, deep neural network architectures have shown promising progress in mining the coevolutionary signal encoded in multiple sequence alignments, leading to reasonably accurate estimation of inter-residue interaction maps, which serve as a rich source of additional information for improved homology detection. Here, we summarize the latest developments in protein homology detection driven by inter-residue interaction map threading. We highlight the emerging trends in distant-homology protein threading through the alignment of predicted interaction maps at various granularities ranging from binary contact maps to finer-grained distance and orientation maps as well as their combination. We also discuss some of the current limitations and possible future avenues to further enhance the sensitivity of protein homology detection.

10.

DeepRefiner: high-accuracy protein structure refinement by deep network calibration.

Shuvo, Md Hossain; Gulfam, Muhammad; Bhattacharya, Debswapna.

Nucleic Acids Res ; 49(W1): W147-W152, 2021 07 02.

Article in English | MEDLINE | ID: mdl-33999209

ABSTRACT

The DeepRefiner webserver, freely available at http://watson.cse.eng.auburn.edu/DeepRefiner/, is an interactive and fully configurable online system for high-accuracy protein structure refinement. Fuelled by deep learning, DeepRefiner offers the ability to leverage cutting-edge deep neural network architectures which can be calibrated for on-demand selection of adventurous or conservative refinement modes targeted at degree or consistency of refinement. The method has been extensively tested in the Critical Assessment of Techniques for Protein Structure Prediction (CASP) experiments under the group name 'Bhattacharya-Server' and was officially ranked as the No. 2 refinement server in CASP13 (second only to 'Seok-server' and outperforming all other refinement servers) and No. 2 refinement server in CASP14 (second only to 'FEIG-S' and outperforming all other refinement servers including 'Seok-server'). The DeepRefiner web interface offers a number of convenient features, including (i) fully customizable refinement job submission and validation; (ii) automated job status update, tracking, and notifications; (iii) interactive and interpretable web-based results retrieval with quantitative and visual analysis; and (iv) extensive help information on job submission and results interpretation via web-based tutorial and help tooltips.

Subject(s)

Protein Conformation , Software , Deep Learning , Models, Molecular

11.

QDeep: distance-based protein model quality estimation by residue-level ensemble error classifications using stacked deep residual neural networks.

Shuvo, Md Hossain; Bhattacharya, Sutanu; Bhattacharya, Debswapna.

Bioinformatics ; 36(Suppl_1): i285-i291, 2020 07 01.

Article in English | MEDLINE | ID: mdl-32657397

ABSTRACT

MOTIVATION: Protein model quality estimation, in many ways, informs protein structure prediction. Despite their tight coupling, existing model quality estimation methods do not leverage inter-residue distance information or the latest technological breakthrough in deep learning that has recently revolutionized protein structure prediction. RESULTS: We present a new distance-based single-model quality estimation method called QDeep by harnessing the power of stacked deep residual neural networks (ResNets). Our method first employs stacked deep ResNets to perform residue-level ensemble error classifications at multiple predefined error thresholds, and then combines the predictions from the individual error classifiers for estimating the quality of a protein structural model. Experimental results show that our method consistently outperforms existing state-of-the-art methods including ProQ2, ProQ3, ProQ3D, ProQ4, 3DCNN, MESHI, and VoroMQA in multiple independent test datasets across a wide range of accuracy measures, and that predicted distance information significantly contributes to the improved performance of QDeep. AVAILABILITY AND IMPLEMENTATION: https://github.com/Bhattacharya-Lab/QDeep. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Computational Biology , Neural Networks, Computer , Proteins
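
The QDeep abstract above describes residue-level error classifications at multiple predefined thresholds that are then combined into a single model-quality estimate. The sketch below shows one plausible way such a combination could work, under stated assumptions: the abstract does not give the thresholds or the combination rule, so the four thresholds (1, 2, 4, and 8 Å, borrowed from the Global Distance Test convention), the simple averaging, and the name `combine_error_classifications` are all hypothetical and not taken from the QDeep code.

```python
import numpy as np

# Assumed thresholds in angstroms, following the GDT-TS convention (not from the abstract).
ASSUMED_THRESHOLDS = (1.0, 2.0, 4.0, 8.0)

def combine_error_classifications(probs):
    """Combine per-residue, per-threshold likelihoods into one global score.

    probs: array of shape (n_residues, n_thresholds); entry [i, k] is the predicted
    likelihood that residue i has an error below threshold k.
    Returns a score in [0, 1] when all inputs are in [0, 1].
    """
    probs = np.asarray(probs, dtype=float)
    if probs.ndim != 2 or probs.size == 0:
        raise ValueError("expected a non-empty (n_residues, n_thresholds) array")
    # Fraction of residues expected to fall within each threshold.
    per_threshold = probs.mean(axis=0)
    # Average across thresholds, analogous to how GDT-TS averages four cutoffs.
    return float(per_threshold.mean())

if __name__ == "__main__":
    # Two residues, four thresholds: one confidently accurate, one accurate only at 4 and 8 A.
    example = [[1.0, 1.0, 1.0, 1.0],
               [0.0, 0.0, 1.0, 1.0]]
    print(combine_error_classifications(example))  # 0.75
```

Averaging first over residues and then over thresholds mirrors the structure of the Global Distance Test, which is why a threshold set at finer resolutions would yield a stricter, high-accuracy variant of the same score.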

12.

SPECS: Integration of side-chain orientation and global distance-based measures for improved evaluation of protein structural models.

Alapati, Rahul; Shuvo, Md Hossain; Bhattacharya, Debswapna.

PLoS One ; 15(2): e0228245, 2020.

Article in English | MEDLINE | ID: mdl-32053611

ABSTRACT

Significant advancements in the field of protein structure prediction have necessitated objective and robust evaluation of protein structural models by comparing predicted models against the experimentally determined native structures to quantitate their structural similarities. Existing protein model versus native similarity metrics consider the distances between either alpha carbon (Cα) or side-chain atoms for computing the similarity. However, side-chain orientation of a protein plays a critical role in defining its conformation at the atomic level. Despite its importance, inclusion of side-chain orientation in structural similarity evaluation has not yet been addressed. Here, we present SPECS, a side-chain-orientation-included protein model-native similarity metric for improved evaluation of protein structural models. SPECS combines side-chain orientation and global distance-based measures in an integrated framework using the united-residue model of polypeptide conformation for computing model-native similarity. Experimental results demonstrate that SPECS is a reliable measure for evaluating structural similarity at the global level including and beyond the accuracy of Cα positioning. Moreover, SPECS delivers superior performance in capturing local quality aspects compared to popular global Cα positioning-based metrics, ranging from models at near-experimental accuracies to models with correct overall folds, making it a robust measure suitable for both high- and moderate-resolution models. Finally, SPECS is sensitive to minute variations in side-chain χ angles even for models with perfect Cα trace, revealing the power of including side-chain orientation. Collectively, SPECS is a versatile evaluation metric covering a wide spectrum of protein modeling scenarios and simultaneously captures complementary aspects of structural similarities at multiple levels of granularity. SPECS is freely available at http://watson.cse.eng.auburn.edu/SPECS/.

Subject(s)

Models, Molecular , Proteins/chemistry , Benchmarking , Carbon/chemistry
