Search | Portal Regional da BVS (test)

Evaluation of DNA-protein complex structures using the deep learning method.

Zeng, Chengwei; Jian, Yiren; Zhuo, Chen; Li, Anbang; Zeng, Chen; Zhao, Yunjie.

Phys Chem Chem Phys ; 26(1): 130-143, 2023 Dec 21.

Article in English | MEDLINE | ID: mdl-38063012

ABSTRACT

Biological processes such as transcription, repair, and regulation require interactions between DNA and proteins. To unravel their functions, it is imperative to determine the high-resolution structures of DNA-protein complexes. However, experimental methods for this purpose are costly and technically demanding. Consequently, there is an urgent need for computational techniques to identify the structures of DNA-protein complexes. Despite technological advancements, accurately identifying DNA-protein complexes through computational methods still poses a challenge. Our team has developed a cutting-edge deep-learning approach called DDPScore that assesses DNA-protein complex structures. DDPScore utilizes a 4D convolutional neural network to overcome limited training data. This approach effectively captures local and global features while comprehensively considering the conformational changes arising from the flexibility during the DNA-protein docking process. DDPScore consistently outperformed the available methods in comprehensive DNA-protein complex docking evaluations, even for the flexible docking challenges. DDPScore has a wide range of applications in predicting and designing structures of DNA-protein complexes.

Subjects

Deep Learning; Proteins/chemistry; Neural Networks, Computer; Research Design; DNA/chemistry; Protein Binding

RNet: a network strategy to predict RNA binding preferences.

Liu, Haoquan; Jian, Yiren; Hou, Jinxuan; Zeng, Chen; Zhao, Yunjie.

Brief Bioinform ; 25(1), 2023 Nov 22.

Article in English | MEDLINE | ID: mdl-38145947

ABSTRACT

Determining RNA binding preferences remains challenging because binding interactions are accompanied by subtle RNA flexibility. Typically, designing RNA inhibitors involves screening thousands of potential candidates for binding. Accurate binding site information can increase the number of successful hits even with few candidates. There are two main issues regarding RNA binding preference: binding site prediction and binding dynamical behavior prediction. Here, we propose an interpretable network-based approach, RNet, to acquire precise binding site and binding dynamical behavior information. RNetsite employs a machine learning-based network decomposition algorithm to predict RNA binding sites by analyzing the local and global network properties. Our research focuses on large RNAs with 3D structures and does not consider smaller regulatory RNAs, which are too small and dynamic. Our study shows that RNetsite outperforms existing methods, achieving precision values as high as 0.701 on the TE18 test and 0.788 on the RB9 test. In addition, RNetsite is robust to perturbations in RNA structures. We also developed RNetdyn, a distance-based dynamical graph algorithm, to characterize how the dynamical behavior of the interface changes upon inhibitor binding. Simulation testing of competitive inhibitors indicates that RNetdyn outperforms the traditional method by 30%. The benchmark testing results demonstrate that RNet is highly accurate and robust. Our interpretable network algorithms can assist in predicting RNA binding preferences and accelerating RNA inhibitor design, providing valuable insights to the RNA research community.

Subjects

Computational Biology; RNA-Binding Proteins; Computational Biology/methods; RNA-Binding Proteins/metabolism; Algorithms; Binding Sites; RNA/metabolism
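
The abstract of this record reports binding-site performance as precision. As a minimal sketch of that metric only (not the authors' evaluation code), precision is the fraction of predicted binding nucleotides that are true binding nucleotides. The function name and the toy residue indices are hypothetical.

```python
def binding_site_precision(predicted_sites, true_sites):
    """Precision = true positives / all predicted positives.

    Both arguments are iterables of nucleotide indices (hypothetical toy
    representation; the abstract does not specify the data format).
    """
    predicted = set(predicted_sites)
    true = set(true_sites)
    if not predicted:
        # No predictions were made, so precision is reported as 0.0 here
        # (a convention chosen for this sketch).
        return 0.0
    true_positives = len(predicted & true)
    return true_positives / len(predicted)


# Toy example: 4 nucleotides predicted, 2 of them are real binding sites.
example_precision = binding_site_precision([1, 2, 3, 4], [2, 3, 9])
```

A predictor that flags many nucleotides far from the real site lowers this value, even if it also finds every true binding nucleotide.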

Evaluating native-like structures of RNA-protein complexes through the deep learning method.

Zeng, Chengwei; Jian, Yiren; Vosoughi, Soroush; Zeng, Chen; Zhao, Yunjie.

Nat Commun ; 14(1): 1060, 2023 Feb 24.

Article in English | MEDLINE | ID: mdl-36828844

ABSTRACT

RNA-protein complexes underlie numerous cellular processes, including basic translation and gene regulation. High-resolution structure determination of RNA-protein complexes is essential for elucidating their functions. Therefore, computational methods capable of identifying native-like RNA-protein structures are needed. To address this challenge, we develop DRPScore, a deep-learning-based approach for identifying native-like RNA-protein structures. DRPScore is tested on representative sets of RNA-protein complexes with various degrees of binding-induced conformational change, ranging from fully rigid docking (bound-bound) to fully flexible docking (unbound-unbound). Out of the top 20 predictions, DRPScore selects native-like structures with a success rate of 91.67% on the testing set of bound RNA-protein complexes and 56.14% on the unbound complexes. DRPScore consistently outperforms existing methods with a roughly 10.53-15.79% improvement, even for the most difficult unbound cases. Furthermore, DRPScore significantly improves the accuracy of the native interface interaction predictions. DRPScore should be broadly useful for modeling and designing RNA-protein complexes.

Subjects

Deep Learning; Protein Binding; Models, Molecular; Proteins/metabolism; RNA/metabolism; Protein Conformation; Molecular Docking Simulation; Algorithms
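
The abstract of this record quotes a success rate over the top 20 predictions. A common way to compute such a top-N success rate, sketched here under assumptions, is to count a complex as a success when at least one of its N best-scored docking decoys is native-like. The function name, the boolean input format, and the toy data are hypothetical, and the abstract does not state the criterion used to label a decoy as native-like.

```python
def top_n_success_rate(ranked_labels_per_complex, n=20):
    """Percentage of complexes with a native-like decoy among the top n.

    ranked_labels_per_complex: one list per complex, ordered from best to
    worst score, where each entry is True if that decoy is native-like.
    """
    if not ranked_labels_per_complex:
        raise ValueError("at least one complex is required")
    successes = sum(
        1 for labels in ranked_labels_per_complex if any(labels[:n])
    )
    return 100.0 * successes / len(ranked_labels_per_complex)


# Toy data for three complexes (not taken from the paper).
toy_rankings = [
    [False, True, False],   # native-like decoy ranked second
    [False, False, False],  # no native-like decoy at all
    [True],                 # native-like decoy ranked first
]
rate_top1 = top_n_success_rate(toy_rankings, n=1)
rate_top2 = top_n_success_rate(toy_rankings, n=2)
```

Widening the cutoff from 1 to 2 rescues the first toy complex, which is why success rates are always reported together with the value of N.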

DIRECT: RNA contact predictions by integrating structural patterns.

Jian, Yiren; Wang, Xiaonan; Qiu, Jaidi; Wang, Huiwen; Liu, Zhichao; Zhao, Yunjie; Zeng, Chen.

BMC Bioinformatics ; 20(1): 497, 2019 Oct 15.

Article in English | MEDLINE | ID: mdl-31615418

ABSTRACT

BACKGROUND: It is widely believed that tertiary nucleotide-nucleotide interactions are essential in determining RNA structure and function. Currently, direct coupling analysis (DCA) infers nucleotide contacts in a sequence from its homologous sequence alignment across different species. DCA and similar approaches that use sequence information alone typically yield low accuracy, especially when the available homologous sequences are limited. Therefore, new methods for RNA structural contact inference are desirable because even a single correctly predicted tertiary contact can potentially make the difference between a correctly and an incorrectly predicted structure. Here we present a new method, DIRECT (Direct Information REweighted by Contact Templates), that incorporates a Restricted Boltzmann Machine (RBM) to augment the information on sequence co-variations with structural features in contact inference. RESULTS: Benchmark tests demonstrate that DIRECT achieves better overall performance than DCA approaches. Compared to mfDCA and plmDCA, DIRECT produces a substantial increase of 41% and 18%, respectively, in accuracy on average for contact prediction. DIRECT improves predictions for long-range contacts and captures more tertiary structural features. CONCLUSIONS: We developed a hybrid approach that incorporates a Restricted Boltzmann Machine (RBM) to augment the information on sequence co-variations with structural templates in contact inference. Our results demonstrate that DIRECT is able to improve RNA contact prediction.

Subjects

Algorithms; Models, Molecular; Nucleic Acid Conformation; Sequence Analysis, RNA/methods; Software

Trace, Machine Learning of Signal Images for Trace-Sensitive Mass Spectrometry: A Case Study from Single-Cell Metabolomics.

Liu, Zhichao; Portero, Erika P; Jian, Yiren; Zhao, Yunjie; Onjiko, Rosemary M; Zeng, Chen; Nemes, Peter.

Anal Chem ; 91(9): 5768-5776, 2019 May 07.

Article in English | MEDLINE | ID: mdl-30929422

ABSTRACT

Recent developments in high-resolution mass spectrometry (HRMS) technology enabled ultrasensitive detection of proteins, peptides, and metabolites in limited amounts of samples, even single cells. However, extraction of trace-abundance signals from complex data sets (m/z value, separation time, signal abundance) that result from ultrasensitive studies requires improved data processing algorithms. To bridge this gap, we here developed "Trace", a software framework that incorporates machine learning (ML) to automate feature selection and optimization for the extraction of trace-level signals from HRMS data. The method was validated using primary (raw) and manually curated data sets from single-cell metabolomic studies of the South African clawed frog (Xenopus laevis) embryo using capillary electrophoresis electrospray ionization HRMS. We demonstrated that Trace combines sensitivity, accuracy, and robustness with high data processing throughput to recognize signals, including those previously identified as metabolites in single-cell capillary electrophoresis HRMS measurements that we conducted over several months. These performance metrics combined with a compatibility with MS data in open-source (mzML) format make Trace an attractive software resource to facilitate data analysis for studies employing ultrasensitive high-resolution MS.

Subjects

Embryo, Nonmammalian/metabolism; Machine Learning; Metabolome; Single-Cell Analysis/methods; Software; Spectrometry, Mass, Electrospray Ionization/methods; Xenopus laevis/metabolism; Animals; Electrophoresis, Capillary

RBind: computational network method to predict RNA binding sites.

Wang, Kaili; Jian, Yiren; Wang, Huiwen; Zeng, Chen; Zhao, Yunjie.

Bioinformatics ; 34(18): 3131-3136, 2018 Sep 15.

Article in English | MEDLINE | ID: mdl-29718097

ABSTRACT

Motivation: Non-coding RNA molecules play essential roles by interacting with other molecules to perform various biological functions. However, it is difficult to determine RNA structures due to their flexibility. At present, the number of experimentally solved RNA-ligand and RNA-protein structures is still insufficient. Therefore, binding site prediction for non-coding RNA is required to understand their functions. Results: Current RNA binding site prediction algorithms produce many false positive nucleotides that are distant from the binding sites. Here, we present a network approach, RBind, to predict the RNA binding sites. We benchmarked RBind on RNA-ligand and RNA-protein datasets. The average accuracy of 0.82 in RNA-ligand and 0.63 in RNA-protein testing showed that this network strategy has a reliable accuracy for binding site prediction. Availability and implementation: The codes and datasets are available at https://zhaolab.com.cn/RBind. Supplementary information: Supplementary data are available at Bioinformatics online.

Subjects

Proteins/chemistry; RNA/chemistry; Algorithms; Binding Sites; Computational Biology; Humans; Protein Domains; Proteins/metabolism; RNA/metabolism; Software

Network Analysis Reveals the Recognition Mechanism for Dimer Formation of Bulb-type Lectins.

Zhao, Yunjie; Jian, Yiren; Liu, Zhichao; Liu, Hang; Liu, Qin; Chen, Chanyou; Li, Zhangyong; Wang, Lu; Huang, H Howie; Zeng, Chen.

Sci Rep ; 7(1): 2876, 2017 Jun 06.

Article in English | MEDLINE | ID: mdl-28588265

ABSTRACT

The bulb-type lectins are proteins consisting of three sequential beta-sheet subdomains that bind to specific carbohydrates to perform certain biological functions. The active states of most bulb-type lectins are dimeric, and it is thus important to elucidate the short- and long-range recognition mechanism for this dimer formation. To do so, we perform comparative sequence analysis for the single- and double-domain bulb-type lectins abundant in plant genomes. In contrast to the dimer complex of two single-domain lectins formed via protein-protein interactions, the double-domain lectin fuses two single-domain proteins into one protein with a short linker and requires only short-range interactions because its two single domains are always in close proximity. Sequence analysis demonstrates that the highly variable but coevolving polar residues at the interface of dimeric bulb-type lectins are largely absent in the double-domain bulb-type lectins. Moreover, network analysis on bulb-type lectin proteins shows that these same polar residues have high closeness scores and thus serve as hubs with strong connections to all other residues. Taken together, we propose a potential mechanism for this lectin complex formation where coevolving polar residues of high closeness are responsible for long-range recognition.

Subjects

Models, Molecular; Plant Lectins/chemistry; Plant Lectins/metabolism; Protein Conformation; Protein Multimerization; Algorithms; Binding Sites; Mannose/chemistry; Mannose/metabolism; Protein Binding; Structure-Activity Relationship
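
The abstract of this record identifies hub residues by their closeness scores. As a minimal sketch of the standard closeness centrality on an unweighted residue contact network, the score of a node is the number of other reachable nodes divided by the sum of its shortest-path distances to them. How the authors build the network (contact cutoff, edge weights) is not stated in the abstract, so the adjacency dictionary and residue labels below are hypothetical.

```python
from collections import deque


def closeness_centrality(adjacency):
    """Closeness for every node of an unweighted, undirected graph.

    adjacency: dict mapping each node to an iterable of its neighbours.
    Distances are found by breadth-first search; only nodes reachable
    from the source contribute (an assumption that matters only for
    disconnected graphs).
    """
    scores = {}
    for source in adjacency:
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in distance:
                    distance[neighbour] = distance[node] + 1
                    queue.append(neighbour)
        total = sum(distance.values())
        scores[source] = (len(distance) - 1) / total if total > 0 else 0.0
    return scores


# Toy star-shaped network: residue "A" contacts "B", "C", and "D".
toy_contacts = {
    "A": ["B", "C", "D"],
    "B": ["A"],
    "C": ["A"],
    "D": ["A"],
}
toy_scores = closeness_centrality(toy_contacts)
```

In the toy network the central residue has the highest closeness because every other residue is one step away, which mirrors the sense in which a high-closeness residue acts as a hub.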
