Results 1 - 7 of 7
1.
Phys Chem Chem Phys ; 26(1): 130-143, 2023 Dec 21.
Article in English | MEDLINE | ID: mdl-38063012

ABSTRACT

Biological processes such as transcription, repair, and regulation require interactions between DNA and proteins. To unravel their functions, it is imperative to determine the high-resolution structures of DNA-protein complexes. However, experimental methods for this purpose are costly and technically demanding. Consequently, there is an urgent need for computational techniques to identify the structures of DNA-protein complexes. Despite technological advancements, accurately identifying DNA-protein complexes through computational methods still poses a challenge. Our team has developed a cutting-edge deep-learning approach called DDPScore that assesses DNA-protein complex structures. DDPScore utilizes a 4D convolutional neural network to overcome limited training data. This approach effectively captures local and global features while comprehensively considering the conformational changes arising from the flexibility during the DNA-protein docking process. DDPScore consistently outperformed the available methods in comprehensive DNA-protein complex docking evaluations, even for the flexible docking challenges. DDPScore has a wide range of applications in predicting and designing structures of DNA-protein complexes.


Subject(s)
Deep Learning , Proteins/chemistry , Neural Networks, Computer , Research Design , DNA/chemistry , Protein Binding
2.
Brief Bioinform ; 25(1), 2023 11 22.
Article in English | MEDLINE | ID: mdl-38145947

ABSTRACT

Determining the RNA binding preferences remains challenging because of the bottleneck of the binding interactions accompanied by subtle RNA flexibility. Typically, designing RNA inhibitors involves screening thousands of potential candidates for binding. Accurate binding site information can increase the number of successful hits even with few candidates. There are two main issues regarding RNA binding preference: binding site prediction and binding dynamical behavior prediction. Here, we propose one interpretable network-based approach, RNet, to acquire precise binding site and binding dynamical behavior information. RNetsite employs a machine learning-based network decomposition algorithm to predict RNA binding sites by analyzing the local and global network properties. Our research focuses on large RNAs with 3D structures without considering smaller regulatory RNAs, which are too small and dynamic. Our study shows that RNetsite outperforms existing methods, achieving precision values as high as 0.701 on TE18 and 0.788 on RB9 tests. In addition, RNetsite demonstrates remarkable robustness regarding perturbations in RNA structures. We also developed RNetdyn, a distance-based dynamical graph algorithm, to characterize the interface dynamical behavior consequences upon inhibitor binding. The simulation testing of competitive inhibitors indicates that RNetdyn outperforms the traditional method by 30%. The benchmark testing results demonstrate that RNet is highly accurate and robust. Our interpretable network algorithms can assist in predicting RNA binding preferences and accelerating RNA inhibitor design, providing valuable insights to the RNA research community.


Subject(s)
Computational Biology , RNA-Binding Proteins , Computational Biology/methods , RNA-Binding Proteins/metabolism , Algorithms , Binding Sites , RNA/metabolism
3.
Nat Commun ; 14(1): 1060, 2023 02 24.
Article in English | MEDLINE | ID: mdl-36828844

ABSTRACT

RNA-protein complexes underlie numerous cellular processes, including basic translation and gene regulation. The high-resolution structure determination of the RNA-protein complexes is essential for elucidating their functions. Therefore, computational methods capable of identifying the native-like RNA-protein structures are needed. To address this challenge, we develop DRPScore, a deep-learning-based approach for identifying native-like RNA-protein structures. DRPScore is tested on representative sets of RNA-protein complexes with various degrees of binding-induced conformation change ranging from fully rigid docking (bound-bound) to fully flexible docking (unbound-unbound). Out of the top 20 predictions, DRPScore selects native-like structures with a success rate of 91.67% on the testing set of bound RNA-protein complexes and 56.14% on the unbound complexes. DRPScore consistently outperforms existing methods with a roughly 10.53-15.79% improvement, even for the most difficult unbound cases. Furthermore, DRPScore significantly improves the accuracy of the native interface interaction predictions. DRPScore should be broadly useful for modeling and designing RNA-protein complexes.
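As a rough illustration of the "top 20" success rate quoted in this abstract, the sketch below counts a complex as a success when at least one native-like decoy appears among its N best-scored candidates. The function name, the data layout, the "higher score is better" convention, and the precomputed native-like flags are all assumptions made for this example; the abstract does not specify how DRPScore ranks decoys or defines native-like structures.

```python
def top_n_success_rate(decoys_by_complex, n=20):
    """Fraction of complexes with a native-like decoy among the n best scores.

    decoys_by_complex maps a complex id to a list of (score, is_native_like)
    pairs. Assumption: a higher score means a better-ranked decoy.
    """
    if not decoys_by_complex:
        raise ValueError("at least one complex is required")
    hits = 0
    for decoys in decoys_by_complex.values():
        # Rank decoys from best to worst score and keep the first n.
        best = sorted(decoys, key=lambda d: d[0], reverse=True)[:n]
        if any(is_native_like for _, is_native_like in best):
            hits += 1
    return hits / len(decoys_by_complex)


# Hypothetical toy data: three complexes, evaluated at n=2.
toy = {
    "c1": [(0.9, False), (0.8, True), (0.1, False)],   # hit within top 2
    "c2": [(0.7, False), (0.6, False), (0.5, True)],   # native-like ranked 3rd
    "c3": [(0.4, True), (0.3, False)],                 # hit within top 2
}
rate = top_n_success_rate(toy, n=2)
print(f"top-2 success rate: {rate:.2%}")
```

With real docking output, the native-like flags would come from comparing each decoy with the experimental structure, using whatever criterion the evaluation adopts.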


Subject(s)
Deep Learning , Protein Binding , Models, Molecular , Proteins/metabolism , RNA/metabolism , Protein Conformation , Molecular Docking Simulation , Algorithms
4.
BMC Bioinformatics ; 20(1): 497, 2019 Oct 15.
Article in English | MEDLINE | ID: mdl-31615418

ABSTRACT

BACKGROUND: It is widely believed that tertiary nucleotide-nucleotide interactions are essential in determining RNA structure and function. Currently, direct coupling analysis (DCA) infers nucleotide contacts in a sequence from its homologous sequence alignment across different species. DCA and similar approaches that use sequence information alone typically yield a low accuracy, especially when the available homologous sequences are limited. Therefore, new methods for RNA structural contact inference are desirable because even a single correctly predicted tertiary contact can potentially make the difference between a correctly and an incorrectly predicted structure. Here we present a new method, DIRECT (Direct Information REweighted by Contact Templates), that incorporates a Restricted Boltzmann Machine (RBM) to augment the information on sequence co-variations with structural features in contact inference. RESULTS: Benchmark tests demonstrate that DIRECT achieves better overall performance than DCA approaches. Compared to mfDCA and plmDCA, DIRECT produces a substantial increase of 41 and 18%, respectively, in accuracy on average for contact prediction. DIRECT improves predictions for long-range contacts and captures more tertiary structural features. CONCLUSIONS: We developed a hybrid approach that incorporates a Restricted Boltzmann Machine (RBM) to augment the information on sequence co-variations with structural templates in contact inference. Our results demonstrate that DIRECT improves RNA contact prediction.


Subject(s)
Algorithms , Models, Molecular , Nucleic Acid Conformation , Sequence Analysis, RNA/methods , Software
5.
Anal Chem ; 91(9): 5768-5776, 2019 05 07.
Article in English | MEDLINE | ID: mdl-30929422

ABSTRACT

Recent developments in high-resolution mass spectrometry (HRMS) technology enabled ultrasensitive detection of proteins, peptides, and metabolites in limited amounts of samples, even single cells. However, extraction of trace-abundance signals from complex data sets (m/z value, separation time, signal abundance) that result from ultrasensitive studies requires improved data processing algorithms. To bridge this gap, we here developed "Trace", a software framework that incorporates machine learning (ML) to automate feature selection and optimization for the extraction of trace-level signals from HRMS data. The method was validated using primary (raw) and manually curated data sets from single-cell metabolomic studies of the South African clawed frog (Xenopus laevis) embryo using capillary electrophoresis electrospray ionization HRMS. We demonstrated that Trace combines sensitivity, accuracy, and robustness with high data processing throughput to recognize signals, including those previously identified as metabolites in single-cell capillary electrophoresis HRMS measurements that we conducted over several months. These performance metrics combined with a compatibility with MS data in open-source (mzML) format make Trace an attractive software resource to facilitate data analysis for studies employing ultrasensitive high-resolution MS.


Subject(s)
Embryo, Nonmammalian/metabolism , Machine Learning , Metabolome , Single-Cell Analysis/methods , Software , Spectrometry, Mass, Electrospray Ionization/methods , Xenopus laevis/metabolism , Animals , Electrophoresis, Capillary
6.
Bioinformatics ; 34(18): 3131-3136, 2018 09 15.
Article in English | MEDLINE | ID: mdl-29718097

ABSTRACT

Motivation: Non-coding RNA molecules play essential roles by interacting with other molecules to perform various biological functions. However, it is difficult to determine RNA structures due to their flexibility. At present, the number of experimentally solved RNA-ligand and RNA-protein structures is still insufficient. Therefore, binding site prediction for non-coding RNAs is required to understand their functions. Results: Current RNA binding site prediction algorithms produce many false positive nucleotides that are distant from the binding sites. Here, we present a network approach, RBind, to predict the RNA binding sites. We benchmarked RBind on RNA-ligand and RNA-protein datasets. The average accuracy of 0.82 in RNA-ligand and 0.63 in RNA-protein testing showed that this network strategy is reliably accurate for binding site prediction. Availability and implementation: The codes and datasets are available at https://zhaolab.com.cn/RBind. Supplementary information: Supplementary data are available at Bioinformatics online.


Subject(s)
Proteins/chemistry , RNA/chemistry , Algorithms , Binding Sites , Computational Biology , Humans , Protein Domains , Proteins/metabolism , RNA/metabolism , Software
7.
Sci Rep ; 7(1): 2876, 2017 06 06.
Article in English | MEDLINE | ID: mdl-28588265

ABSTRACT

The bulb-type lectins are proteins consisting of three sequential beta-sheet subdomains that bind to specific carbohydrates to perform certain biological functions. The active states of most bulb-type lectins are dimeric, and it is thus important to elucidate the short- and long-range recognition mechanism for this dimer formation. To do so, we perform comparative sequence analysis for the single- and double-domain bulb-type lectins abundant in plant genomes. In contrast to the dimer complex of two single-domain lectins formed via protein-protein interactions, the double-domain lectin fuses two single-domain proteins into one protein with a short linker and requires only short-range interactions because its two single domains are always in close proximity. Sequence analysis demonstrates that the highly variable but coevolving polar residues at the interface of dimeric bulb-type lectins are largely absent in the double-domain bulb-type lectins. Moreover, network analysis of bulb-type lectin proteins shows that these same polar residues have high closeness scores and thus serve as hubs with strong connections to all other residues. Taken together, we propose a potential mechanism for this lectin complex formation in which coevolving polar residues of high closeness are responsible for long-range recognition.
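The closeness scores mentioned in this abstract can be illustrated with a minimal sketch. It assumes the standard graph-theoretic definition of closeness centrality (the number of reachable nodes divided by the sum of shortest-path distances to them) on an unweighted residue contact graph. The toy graph, the residue labels, and the function name are hypothetical, and the abstract does not state how the authors built their network or normalized the scores.

```python
from collections import deque


def closeness_centrality(adjacency):
    """Closeness of each node in an unweighted, undirected graph.

    adjacency maps a node to the list of its neighbours. The score is
    (number of reachable nodes) / (sum of shortest-path distances to them),
    so nodes that are few steps away from all others score highest.
    """
    scores = {}
    for source in adjacency:
        # Breadth-first search gives shortest-path lengths from source.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        total = sum(dist.values())
        reachable = len(dist) - 1
        scores[source] = reachable / total if total > 0 else 0.0
    return scores


# Hypothetical contact graph: residue B touches three others and acts as a hub.
contacts = {
    "A": ["B"],
    "B": ["A", "C", "D"],
    "C": ["B"],
    "D": ["B", "E"],
    "E": ["D"],
}
scores = closeness_centrality(contacts)
hub = max(scores, key=scores.get)
print(hub, round(scores[hub], 3))
```

In this toy graph, residue B has the highest closeness because it reaches every other residue in at most two steps, which is the sense in which a high-closeness residue serves as a hub.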


Subject(s)
Models, Molecular , Plant Lectins/chemistry , Plant Lectins/metabolism , Protein Conformation , Protein Multimerization , Algorithms , Binding Sites , Mannose/chemistry , Mannose/metabolism , Protein Binding , Structure-Activity Relationship