Search | VHL Regional Portal

Accurate prediction of RNA secondary structure including pseudoknots through solving minimum-cost flow with learned potentials.

Gong, Tiansu; Ju, Fusong; Bu, Dongbo.

Commun Biol ; 7(1): 297, 2024 Mar 09.

Article in English | MEDLINE | ID: mdl-38461362

ABSTRACT

Pseudoknots are key structural motifs of RNA, and pseudoknotted RNAs play important roles in a variety of biological processes. Here, we present KnotFold, an accurate approach to the prediction of RNA secondary structure including pseudoknots. The key elements of KnotFold are a learned potential function and a minimum-cost flow algorithm that finds the secondary structure with the lowest potential. KnotFold learns the potential from RNAs with known structures using an attention-based neural network, thus avoiding the inaccuracy of hand-crafted energy functions. The specially designed minimum-cost flow algorithm used by KnotFold considers all possible combinations of base pairs and selects the optimal combination from them. The algorithm removes the restriction to nested base pairs required by the widely used dynamic programming algorithms, thus enabling the identification of pseudoknots. Using 1,009 pseudoknotted RNAs as representatives, we demonstrate the successful application of KnotFold in predicting RNA secondary structures including pseudoknots with higher accuracy than state-of-the-art approaches. We anticipate that KnotFold, with its superior accuracy, will greatly facilitate the understanding of RNA structures and functionalities.
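The idea of choosing base pairs by minimum-cost flow can be illustrated with a toy sketch. This is not KnotFold's implementation: the graph construction (each base duplicated into a left and a right copy, a pair kept only when both directions are selected), the function names `min_cost_pairing` and `crosses`, and the hand-written potentials are all assumptions made for illustration. The real method learns its potential with a neural network.

```python
from math import inf


def min_cost_pairing(n, potential):
    """Pick base pairs for n bases by min-cost flow on a bipartite graph.

    potential maps (i, j), i < j, to a cost; negative values are favorable.
    Hypothetical toy formulation, not the published algorithm.
    """
    src, snk = 2 * n, 2 * n + 1
    adj = [[] for _ in range(2 * n + 2)]  # node -> ids of outgoing edges
    head, cap, cost = [], [], []          # per-edge target, capacity, cost

    def add_edge(u, v, c):
        # Forward edge gets an even id; its residual twin is id ^ 1.
        for a, b, k, w in ((u, v, 1, c), (v, u, 0, -c)):
            adj[a].append(len(head))
            head.append(b)
            cap.append(k)
            cost.append(w)

    for i in range(n):
        add_edge(src, i, 0.0)      # left copy: base i chooses at most one partner
        add_edge(n + i, snk, 0.0)  # right copy: base i is chosen at most once
    for (i, j), c in potential.items():
        add_edge(i, n + j, c)
        add_edge(j, n + i, c)

    while True:
        # Bellman-Ford copes with negative costs in the residual graph.
        dist = [inf] * (2 * n + 2)
        prev = [-1] * (2 * n + 2)
        dist[src] = 0.0
        for _ in range(2 * n + 1):
            changed = False
            for u in range(2 * n + 2):
                if dist[u] == inf:
                    continue
                for e in adj[u]:
                    if cap[e] > 0 and dist[u] + cost[e] < dist[head[e]] - 1e-12:
                        dist[head[e]] = dist[u] + cost[e]
                        prev[head[e]] = e
                        changed = True
            if not changed:
                break
        if dist[snk] >= 0:  # no remaining path lowers the total potential
            break
        v = snk
        while v != src:     # push one unit of flow along the shortest path
            e = prev[v]
            cap[e] -= 1
            cap[e ^ 1] += 1
            v = head[e ^ 1]

    chosen = set()
    for e in range(0, len(head), 2):
        u, v = head[e ^ 1], head[e]
        if u < n <= v < 2 * n and cap[e] == 0:
            chosen.add((u, v - n))
    # Keep a pair only when both directions were selected.
    return sorted((i, j) for (i, j) in chosen if i < j and (j, i) in chosen)


def crosses(p, q):
    """True if pairs p and q interleave, i.e. form a pseudoknot."""
    (i, j), (k, l) = sorted((p, q))
    return i < k < j < l


# Toy potentials (made up): two crossing pairs and one weaker competitor.
toy = {(0, 4): -3.0, (2, 6): -2.0, (0, 6): -1.0}
pairs = min_cost_pairing(8, toy)
print(pairs)
```

Because the flow formulation places no nesting constraint on the selected pairs, the two crossing pairs in the toy input can be chosen together, which a nested dynamic program would not allow.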

Subject(s)

Algorithms , RNA , RNA/genetics , Nucleic Acid Conformation , Base Pairing , Neural Networks, Computer

SASA-Net: A Spatial-Aware Self-Attention Mechanism for Building Protein 3D Structure Directly From Inter-Residue Distances.

Gong, Tiansu; Ju, Fusong; Sun, Shiwei; Bu, Dongbo.

IEEE/ACM Trans Comput Biol Bioinform ; 20(6): 3482-3488, 2023.

Article in English | MEDLINE | ID: mdl-37022274

ABSTRACT

Protein functions are tightly related to the fine details of their 3D structures. To understand protein structures, computational prediction approaches are highly needed. Recently, protein structure prediction has achieved considerable progress, mainly due to the increased accuracy of inter-residue distance estimation and the application of deep learning techniques. Most distance-based ab initio prediction approaches adopt a two-step procedure: constructing a potential function based on the estimated inter-residue distances, and then building a 3D structure that minimizes the potential function. These approaches have proven very promising; however, they still suffer from several limitations, especially the inaccuracies incurred by the handcrafted potential function. Here, we present SASA-Net, a deep learning-based approach that directly learns protein 3D structure from the estimated inter-residue distances. Unlike existing approaches that simply represent protein structures as coordinates of atoms, SASA-Net represents protein structures using the poses of residues, i.e., the coordinate system of each individual residue in which all backbone atoms of this residue are fixed. The key element of SASA-Net is a spatial-aware self-attention mechanism, which is able to adjust a residue's pose according to all other residues' features and the estimated distances between residues. By iteratively applying the spatial-aware self-attention mechanism, SASA-Net continuously improves the structure and finally acquires a structure with high accuracy. Using the CATH35 proteins as representatives, we demonstrate that SASA-Net is able to accurately and efficiently build structures from the estimated inter-residue distances. The high accuracy and efficiency of SASA-Net enable an end-to-end neural network model for protein structure prediction by combining SASA-Net with a neural network for inter-residue distance prediction.
Source code of SASA-Net is available at https://github.com/gongtiansu/SASA-Net/.
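The notion of a residue pose, a local coordinate system in which the residue's backbone atoms sit at fixed positions, can be made concrete with a small sketch. The construction below (origin at the alpha carbon, axes from Gram-Schmidt on the CA-to-C and CA-to-N vectors) is one common convention and is assumed here for illustration; the abstract does not say which convention SASA-Net uses. The names `residue_pose` and `to_local` and the example coordinates are hypothetical.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def unit(a):
    norm = math.sqrt(dot(a, a))
    return tuple(x / norm for x in a)


def residue_pose(n_atom, ca, c_atom):
    """Return (origin, axes) of a right-handed frame attached to one residue."""
    e1 = unit(sub(c_atom, ca))                 # x-axis points from CA to C
    u = sub(n_atom, ca)
    proj = dot(u, e1)
    # Remove the component along e1 so the y-axis lies in the N-CA-C plane.
    e2 = unit(tuple(ui - proj * ei for ui, ei in zip(u, e1)))
    e3 = cross(e1, e2)                         # z-axis completes the frame
    return ca, (e1, e2, e3)


def to_local(pose, point):
    """Express a global point in the residue's own coordinate system."""
    origin, axes = pose
    d = sub(point, origin)
    return tuple(dot(d, e) for e in axes)


# Illustrative backbone coordinates (in angstroms, made up for the example).
N, CA, C = (-0.525, 1.363, 0.0), (0.0, 0.0, 0.0), (1.526, 0.0, 0.0)


def move(p):
    """Rotate 90 degrees about z, then translate: a rigid motion."""
    x, y, z = p
    return (-y + 3.0, x - 2.0, z + 5.0)


local_before = to_local(residue_pose(N, CA, C), N)
local_after = to_local(residue_pose(move(N), move(CA), move(C)), move(N))
print(local_before, local_after)
```

The local coordinates of the backbone atoms are unchanged by the rigid motion, which is what "fixed" means in this representation: only the pose (origin and axes) changes as the structure is refined.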

Subject(s)

Algorithms , Computational Biology , Computational Biology/methods , Proteins/chemistry , Neural Networks, Computer , Software

Protein Structure Prediction: Challenges, Advances, and the Shift of Research Paradigms.

Huang, Bin; Kong, Lupeng; Wang, Chao; Ju, Fusong; Zhang, Qi; Zhu, Jianwei; Gong, Tiansu; Zhang, Haicang; Yu, Chungong; Zheng, Wei-Mou; Bu, Dongbo.

Genomics Proteomics Bioinformatics ; 21(5): 913-925, 2023 Oct.

Article in English | MEDLINE | ID: mdl-37001856

ABSTRACT

Protein structure prediction is an interdisciplinary research topic that has attracted researchers from multiple fields, including biochemistry, medicine, physics, mathematics, and computer science. These researchers adopt various research paradigms to attack the same structure prediction problem: biochemists and physicists attempt to reveal the principles governing protein folding; mathematicians, especially statisticians, usually start from assuming a probability distribution of protein structures given a target sequence and then find the most likely structure, while computer scientists formulate protein structure prediction as an optimization problem - finding the structural conformation with the lowest energy or minimizing the difference between predicted structure and native structure. These research paradigms fall into the two statistical modeling cultures proposed by Leo Breiman, namely, data modeling and algorithmic modeling. Recently, we have also witnessed the great success of deep learning in protein structure prediction. In this review, we present a survey of the efforts for protein structure prediction. We compare the research paradigms adopted by researchers from different fields, with an emphasis on the shift of research paradigms in the era of deep learning. In short, the algorithmic modeling techniques, especially deep neural networks, have considerably improved the accuracy of protein structure prediction; however, theories interpreting the neural networks and knowledge on protein folding are still highly desired.

Subject(s)

Algorithms , Proteins , Protein Conformation , Proteins/chemistry , Neural Networks, Computer , Protein Folding , Computational Biology/methods
