Results 1 - 12 of 12
1.
Comput Struct Biotechnol J ; 19: 3015-3026, 2021.
Article in English | MEDLINE | ID: mdl-34136099

ABSTRACT

RNA modifications, in particular N6-methyladenosine (m6A), participate in every stage of RNA metabolism and play diverse roles in essential biological processes and disease pathogenesis. Thanks to advances in sequencing technology, tens of thousands of RNA modification sites can be identified in a typical high-throughput experiment; however, it remains a major challenge to decipher the functional relevance of these sites, such as their effects on alternative splicing, their roles in the regulatory circuits of essential biological processes, or their association with diseases. As the focus of RNA epigenetics gradually shifts from site discovery to functional studies, we review here recent progress in functional annotation and prediction of RNA modification sites from a bioinformatics perspective. The review covers naïve annotation with associated biological events, e.g., single nucleotide polymorphism (SNP), RNA binding protein (RBP) and alternative splicing, prediction of key sites and their regulatory functions, inference of disease association, and mining the diagnostic and prognostic value of RNA modification regulators. We further discuss the limitations of existing approaches and some future perspectives.

2.
Bioinformatics ; 37(22): 4277-4279, 2021 11 18.
Article in English | MEDLINE | ID: mdl-33974000

ABSTRACT

MOTIVATION: N6-methyladenosine (m6A) is the most abundant mammalian mRNA methylation with versatile functions. To date, although a number of bioinformatics tools have been developed for location discovery of m6A modification, functional understanding is still quite limited. As the focus of RNA epigenetics gradually shifts from site discovery to functional studies, there is an urgent need for user-friendly tools to identify and explore the functional relevance of context-specific m6A methylation to gain insights into the epitranscriptome layer of gene expression regulation. RESULTS: We introduce here Funm6AViewer, a novel platform to identify, prioritize and visualize the functional gene interaction networks mediated by dynamic m6A RNA methylation unveiled from a case-control study. Taking as inputs the differential RNA methylation data and differential gene expression data, both of which can be inferred from the widely used MeRIP-seq data, Funm6AViewer enables a series of analyses, including: (i) examining the distribution of differential m6A sites, (ii) prioritizing the genes mediated by dynamic m6A methylation and (iii) functionally characterizing the gene regulatory networks mediated by condition-specific m6A RNA methylation. Funm6AViewer should effectively facilitate the understanding of the epitranscriptome circuitry mediated by this reversible RNA modification. AVAILABILITY AND IMPLEMENTATION: Funm6AViewer is available both as a convenient web server (https://www.xjtlu.edu.cn/biologicalsciences/funm6aviewer) with graphical interface and as an independent R package (https://github.com/NWPU-903PR/Funm6AViewer) for local usage. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Epigenesis, Genetic , RNA , Animals , Methylation , Case-Control Studies , RNA/metabolism , Gene Regulatory Networks , Adenosine/metabolism , Mammals/genetics
3.
Inorg Chem ; 59(22): 16582-16590, 2020 Nov 16.
Article in English | MEDLINE | ID: mdl-33113329

ABSTRACT

Several types of air-stable N,O-coordinate half-sandwich iridium complexes containing Schiff base ligands with the general formula [Cp*IrClL] were synthesized in good yields. These stable iridium complexes displayed good catalytic efficiency in amide synthesis. A variety of amides with different substituents were obtained in a one-pot procedure with excellent yields and high selectivities through the amidation of aldehydes with NH2OH·HCl and nitrile hydration under the catalysis of complexes 1-4. The excellent and diverse catalytic activity, mild conditions, broad substrate scope, and environmentally friendly solvent make this system potentially applicable in industrial production. Half-sandwich iridium complexes 1-4 were characterized by NMR, elemental analysis, and IR techniques. Molecular structures of complexes 2 and 3 were confirmed by single-crystal X-ray analysis.

4.
Int J Mol Sci ; 21(15), 2020 Jul 23.
Article in English | MEDLINE | ID: mdl-32718000

ABSTRACT

Long non-coding RNAs (lncRNAs) play crucial roles in diverse biological processes and human complex diseases. Distinguishing lncRNAs from protein-coding transcripts is a fundamental step in analyzing the functional mechanisms of lncRNAs. However, the experimental identification of lncRNAs is expensive and time-consuming. In this study, we present an alignment-free multimodal deep learning framework (namely lncRNA_Mdeep) to distinguish lncRNAs from protein-coding transcripts. LncRNA_Mdeep incorporates three different input modalities; a multimodal deep learning framework is then built to learn high-level abstract representations and to predict the probability that a transcript is a lncRNA. LncRNA_Mdeep achieved 98.73% prediction accuracy in a 10-fold cross-validation test on humans. In an independent test on humans, lncRNA_Mdeep achieved 93.12% prediction accuracy, which was 0.94% to 15.41% higher than that of eight other state-of-the-art methods. In addition, the results on 11 cross-species datasets showed that lncRNA_Mdeep is a powerful predictor of lncRNAs.


Subject(s)
Databases, Nucleic Acid , Deep Learning , RNA, Long Noncoding/genetics , Software , Animals , Humans , Mice
5.
Anal Biochem ; 601: 113767, 2020 07 15.
Article in English | MEDLINE | ID: mdl-32454029

ABSTRACT

Long noncoding RNAs (lncRNAs) play critical roles in many pathological and biological processes, such as post-transcription, cell differentiation and gene regulation. A growing number of studies have shown that lncRNAs function mainly through interactions with specific RNA binding proteins (RBPs). However, experimental identification of potential lncRNA-protein interactions is costly and time-consuming. In this work, we propose a novel convolutional neural network-based method with the copy-padding trick (named LPI-CNNCP) to predict lncRNA-protein interactions. The copy-padding trick of LPI-CNNCP converts variable-length protein/RNA sequences into fixed-length sequences, thus enabling the construction of the CNN model. A high-order one-hot encoding is also applied to transform the protein/RNA sequences into image-like inputs for capturing the dependencies among amino acids (or nucleotides). Finally, these encoded protein/RNA sequences are fed into a CNN to predict the lncRNA-protein interactions. Compared with other state-of-the-art methods in a 10-fold cross-validation (10CV) test, LPI-CNNCP shows the best performance. Results in the independent test demonstrate that LPI-CNNCP can effectively predict potential lncRNA-protein interactions. We also compared the copy-padding trick with two other existing tricks (i.e., zero-padding and cropping), and the results show that the copy-padding trick outperforms the zero-padding and cropping tricks in predicting lncRNA-protein interactions. The source code of LPI-CNNCP and the datasets used in this work are available at https://github.com/NWPU-903PR/LPI-CNNCP for academic users.


Subject(s)
Neural Networks, Computer , RNA, Long Noncoding/chemistry , RNA-Binding Proteins/chemistry , Amino Acid Sequence , Humans
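
The two preprocessing ideas named in the abstract above, copy-padding and high-order one-hot encoding, can be illustrated with a minimal sketch. This is not the LPI-CNNCP implementation: the function names `copy_pad` and `high_order_one_hot` are hypothetical, and the sketch assumes that copy-padding means repeating a sequence cyclically up to a target length and that "high-order" means one-hot encoding overlapping k-mers rather than single residues.

```python
from itertools import product


def copy_pad(seq: str, length: int) -> str:
    """Repeat `seq` cyclically until it reaches `length` (assumed meaning of
    copy-padding); sequences longer than `length` are truncated."""
    if not seq:
        raise ValueError("sequence must be non-empty")
    repeats = -(-length // len(seq))  # ceiling division
    return (seq * repeats)[:length]


def high_order_one_hot(seq: str, alphabet: str = "ACGU", order: int = 2):
    """One-hot encode every overlapping k-mer (k = order) of `seq`.

    Returns a list of rows, one per k-mer position, each of width
    len(alphabet) ** order. Characters outside `alphabet` raise KeyError.
    """
    kmers = ["".join(p) for p in product(alphabet, repeat=order)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    rows = []
    for i in range(len(seq) - order + 1):
        row = [0] * len(kmers)
        row[index[seq[i:i + order]]] = 1
        rows.append(row)
    return rows


# Hypothetical usage: pad a short RNA fragment to length 10, then encode it.
padded = copy_pad("ACGU", 10)
encoded = high_order_one_hot(padded)
```

Under these assumptions, every sequence maps to a matrix of the same shape, which is what allows a fixed-input CNN to consume sequences of differing lengths.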
6.
Inorg Chem ; 59(7): 4800-4809, 2020 Apr 06.
Article in English | MEDLINE | ID: mdl-32212643

ABSTRACT

Several N,O-coordinate half-sandwich iridium complexes, 1-5, containing constrained bulky β-enaminoketonato ligands were prepared and clearly characterized. Single-crystal X-ray diffraction characterization of these complexes indicates that the iridium center adopts a distorted octahedral geometry. Complexes 1-5 showed good catalytic efficiency in the oxidative homocoupling of primary amines, dehydrogenation of secondary amines, and the oxidative cross-coupling of amines and alcohols, which furnished various types of imines in good yields and high selectivities using O2 as an oxidant under mild conditions. No distinctive substituent effects of the iridium catalysts were observed in these reactions. The diverse catalytic activity, broad substrate scope, mild reaction conditions, and high yields of the products made this catalytic system attractive in industrial processes.

7.
Bioinformatics ; 35(14): i90-i98, 2019 07 15.
Article in English | MEDLINE | ID: mdl-31510685

ABSTRACT

MOTIVATION: As the most abundant mammalian mRNA methylation, N6-methyladenosine (m6A) exists in >25% of human mRNAs and is involved in regulating many different aspects of mRNA metabolism, stem cell differentiation and diseases like cancer. However, dynamic changes of m6A levels, and how a change in the m6A level of a specific gene contributes to biological processes such as stem cell differentiation and to diseases such as cancer, remain largely elusive. RESULTS: To address this, we propose in this paper FunDMDeep-m6A, a novel pipeline for identifying context-specific (e.g. disease versus normal, differentiated cells versus stem cells or gene knockdown cells versus wild-type cells) m6A-mediated functional genes. FunDMDeep-m6A includes, as its first step, DMDeep-m6A, a novel method based on a deep learning model and a statistical test for identifying differential m6A methylation (DmM) sites from MeRIP-Seq data at single-base resolution. FunDMDeep-m6A then identifies and prioritizes functional DmM genes (FDmMGenes) by combining the DmM genes (DmMGenes) with differential expression analysis using a network-based method. This proposed network method includes a novel m6A-signaling bridge (MSB) score to quantify the functional significance of DmMGenes by assessing the functional interaction of DmMGenes with their signaling pathways using a heat diffusion process in protein-protein interaction (PPI) networks. The test results on 4 context-specific MeRIP-Seq datasets showed that FunDMDeep-m6A can identify more context-specific and functionally significant FDmMGenes than m6A-Driver. The functional enrichment analysis of these genes revealed that m6A targets key genes of many important context-related biological processes including embryonic development, stem cell differentiation, transcription, translation, cell death, cell proliferation and cancer-related pathways. These results demonstrate the power of FunDMDeep-m6A for elucidating m6A regulatory functions and its roles in biological processes and diseases. AVAILABILITY AND IMPLEMENTATION: The R-package for DMDeep-m6A is freely available from https://github.com/NWPU-903PR/DMDeepm6A1.0. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Neoplasms , Protein Interaction Maps , RNA , Animals , Humans , Methylation , Neoplasms/genetics , RNA, Messenger , Software
8.
BMC Bioinformatics ; 20(1): 87, 2019 Feb 19.
Article in English | MEDLINE | ID: mdl-30782113

ABSTRACT

BACKGROUND: Long non-coding RNAs (lncRNAs) play an important role in human complex diseases. Identification of lncRNA-disease associations will provide insight into disease-related lncRNAs and benefit disease diagnosis and treatment. However, using experiments to explore lncRNA-disease associations is expensive and time-consuming. RESULTS: In this study, we developed a novel method to identify potential lncRNA-disease associations by Integrating Diverse Heterogeneous Information sources with positive pointwise Mutual Information and Random Walk with restart algorithm (namely IDHI-MIRW). IDHI-MIRW first constructs multiple lncRNA similarity networks and disease similarity networks from diverse lncRNA-related and disease-related datasets, then runs the random walk with restart algorithm on these similarity networks to extract topological similarities, which are fused with positive pointwise mutual information to build a large-scale lncRNA-disease heterogeneous network. Finally, IDHI-MIRW runs the random walk with restart algorithm on the lncRNA-disease heterogeneous network to infer potential lncRNA-disease associations. CONCLUSIONS: Compared with other state-of-the-art methods, IDHI-MIRW achieves the best prediction performance. In case studies of breast cancer, stomach cancer, and colorectal cancer, 36/45 (80%) novel lncRNA-disease associations predicted by IDHI-MIRW are supported by recent literature. Furthermore, we found that lncRNA LINC01816 is associated with the survival of colorectal cancer patients. IDHI-MIRW is freely available at https://github.com/NWPU-903PR/IDHI-MIRW .


Subject(s)
Algorithms , Computational Biology/methods , Genetic Predisposition to Disease , RNA, Long Noncoding/genetics , Colorectal Neoplasms/genetics , Genetic Association Studies , Humans , Sequence Analysis, RNA
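
The two building blocks named in the abstract above, random walk with restart (RWR) and positive pointwise mutual information (PPMI), can be sketched in a few lines. This is a generic textbook formulation, not the IDHI-MIRW code: the function names, the restart probability of 0.5, and the choice to apply PPMI directly to a matrix of RWR profiles are all assumptions made for illustration.

```python
import math


def column_normalize(adj):
    """Scale each column of a square non-negative matrix to sum to 1."""
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    return [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0
             for j in range(n)] for i in range(n)]


def random_walk_with_restart(adj, seed, restart=0.5, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - r) * W p + r * p0 until the L1 change is below tol."""
    n = len(adj)
    w = column_normalize(adj)
    p0 = [1.0 if i == seed else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        nxt = [(1 - restart) * sum(w[i][j] * p[j] for j in range(n))
               + restart * p0[i] for i in range(n)]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


def ppmi(matrix):
    """PPMI of a non-negative matrix: max(0, log(v * total / (row * col)))."""
    total = sum(map(sum, matrix))
    row = [sum(r) for r in matrix]
    col = [sum(c) for c in zip(*matrix)]
    return [[max(0.0, math.log(v * total / (row[i] * col[j]))) if v > 0 else 0.0
             for j, v in enumerate(r)] for i, r in enumerate(matrix)]


# Hypothetical toy network: a 3-node path graph 0 - 1 - 2.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
profiles = [random_walk_with_restart(adj, seed) for seed in range(3)]
similarity = ppmi(profiles)
```

In this sketch each node's RWR profile captures its topological neighbourhood, and PPMI then keeps only the node pairs that co-occur more often than chance would predict.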
9.
PLoS Comput Biol ; 15(1): e1006663, 2019 01.
Article in English | MEDLINE | ID: mdl-30601803

ABSTRACT

N6-methyladenosine (m6A) is the most abundant methylation, existing in >25% of human mRNAs. Exciting recent discoveries indicate the close involvement of m6A in regulating many different aspects of mRNA metabolism and diseases like cancer. However, our current knowledge about how m6A levels are controlled and whether and how regulation of m6A levels of a specific gene can play a role in cancer and other diseases is mostly elusive. We propose in this paper a computational scheme for predicting m6A-regulated genes and m6A-associated diseases, which includes Deep-m6A, the first model for detecting condition-specific m6A sites from MeRIP-Seq data with single-base resolution using deep learning, and Hot-m6A, a new network-based pipeline that prioritizes functionally significant m6A genes and their associated diseases using the Protein-Protein Interaction (PPI) and gene-disease heterogeneous networks. We applied Deep-m6A and this pipeline to 75 MeRIP-seq human samples, which produced a compact set of 709 functionally significant m6A-regulated genes and nine functionally enriched subnetworks. The functional enrichment analysis of these genes and networks reveals that m6A targets key genes of many critical biological processes including transcription, cell organization and transport, and cell proliferation and cancer-related pathways such as the Wnt pathway. The m6A-associated disease analysis prioritized five significantly associated diseases including leukemia and renal cell carcinoma. These results demonstrate the power of our proposed computational scheme and provide new leads for understanding m6A regulatory functions and its roles in diseases.


Subject(s)
Adenosine/analogs & derivatives , Computational Biology/methods , Genetic Markers/genetics , Neoplasms/genetics , Software , Adenosine/genetics , Algorithms , Deep Learning , Humans , Neoplasms/metabolism , Protein Interaction Maps/genetics
10.
Med Chem ; 13(6): 515-525, 2017.
Article in English | MEDLINE | ID: mdl-28494725

ABSTRACT

BACKGROUND: RNA-protein interactions (RPIs) play an important role in many cellular processes. In particular, noncoding RNA-protein interactions (ncRPIs) are involved in various gene regulations and human complex diseases. High-throughput experiments have provided a large amount of valuable information about ncRPIs, but these experiments are expensive and time-consuming. Therefore, some computational approaches have been developed to predict ncRPIs efficiently and effectively. METHODS: In this work, we describe recent advances in predicting ncRPIs from the following aspects: i) the dataset construction; ii) the sequence and structural feature representation, and iii) the machine learning algorithm. RESULTS: The current methods have successfully predicted ncRPIs, but most of them were trained and tested on small benchmark datasets derived from ncRNA-protein complexes in the PDB database. The generalization performance and robustness of these existing methods need to be further improved. CONCLUSION: Given the large numbers of ncRPIs generated by high-throughput technologies, three future directions for predicting ncRPIs with machine learning deserve attention. One is how to effectively construct the negative sample set. Another is the selection of novel and effective features from the sequences and structures of ncRNAs and proteins. The third is the design of powerful predictors.


Subject(s)
Computational Biology/methods , Proteins/metabolism , RNA, Untranslated/metabolism , Humans , Internet , Machine Learning , Protein Binding
11.
Mol Biosyst ; 11(3): 892-7, 2015 Mar.
Article in English | MEDLINE | ID: mdl-25588719

ABSTRACT

Long noncoding RNAs (lncRNAs) are emerging as a novel class of noncoding RNAs and potent gene regulators, which play an important and varied role in cellular functions. lncRNAs are closely related to the occurrence and development of some diseases. High-throughput RNA-sequencing techniques combined with de novo assembly have identified a large number of novel transcripts. The discovery of large and 'hidden' transcriptomes urgently requires the development of effective computational methods that can rapidly distinguish between coding and long noncoding RNAs. In this study, we developed a powerful predictor (named lncRNA-MFDL) to identify lncRNAs by fusing multiple features of the open reading frame, k-mer, the secondary structure and the most-like coding domain sequence and using deep learning classification algorithms. Using the same human training dataset and a 10-fold cross-validation test, lncRNA-MFDL can achieve 97.1% prediction accuracy, which is 5.7, 3.7, and 3.4% higher than that of the CPC, CNCI and lncRNA-FMFSVM predictors, respectively. Compared with the CPC and CNCI predictors on testing datasets from other species (e.g., anole lizard, zebrafish, chicken, gorilla, macaque, mouse, lamprey, orangutan, xenopus and C. elegans), the new lncRNA-MFDL predictor is also much more effective and robust. These results show that lncRNA-MFDL is a powerful tool for identifying lncRNAs. The lncRNA-MFDL software package is freely available at for academic users.


Subject(s)
Computational Biology/methods , RNA, Long Noncoding , Software , Algorithms , Humans , RNA, Long Noncoding/chemistry , RNA, Long Noncoding/genetics , Reproducibility of Results
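
Two of the feature families mentioned in the abstract above, k-mer composition and the open reading frame, can be illustrated with a short sketch. This is not the lncRNA-MFDL feature extractor: the function names, the choice of k = 3, the DNA alphabet, and the definition of an ORF as the first ATG up to the next in-frame stop codon on the forward strand are all assumptions for illustration.

```python
from itertools import product


def kmer_frequencies(seq, k=3, alphabet="ACGT"):
    """Relative frequency of every possible k-mer, in lexicographic order.

    Windows containing characters outside `alphabet` are skipped.
    """
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:
            counts[kmer] += 1
            total += 1
    return [counts[km] / total if total else 0.0 for km in kmers]


def longest_orf_length(seq):
    """Length in nucleotides (stop codon included) of the longest forward-strand
    ORF, scanning all three reading frames."""
    stops = {"TAA", "TAG", "TGA"}
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in stops:
                best = max(best, i + 3 - start)
                start = None
    return best


# Hypothetical usage: build a small feature vector for one transcript.
transcript = "CCATGAAATAGCC"
features = kmer_frequencies(transcript, k=3) + [longest_orf_length(transcript)]
```

A classifier would consume such vectors alongside other features; the secondary-structure and coding-domain features named in the abstract are not sketched here.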
12.
Anal Biochem ; 449: 164-71, 2014 Mar 15.
Article in English | MEDLINE | ID: mdl-24361712

ABSTRACT

Revealing the subcellular location of newly discovered protein sequences can bring insight to their function and guide research at the cellular level. The rapidly increasing number of sequences entering the genome databanks has called for the development of automated analysis methods. Currently, most existing methods used to predict protein subcellular locations cover only one, or a very limited number of species. Therefore, it is necessary to develop reliable and effective computational approaches to further improve the performance of protein subcellular prediction and, at the same time, cover more species. The current study reports the development of a novel predictor called MSLoc-DT to predict the protein subcellular locations of human, animal, plant, bacteria, virus, fungi, and archaea by introducing a novel feature extraction approach termed Amino Acid Index Distribution (AAID) and then fusing gene ontology information, sequential evolutionary information, and sequence statistical information through four different modes of pseudo amino acid composition (PseAAC) with a decision template rule. Using the jackknife test, MSLoc-DT can achieve 86.5, 98.3, 90.3, 98.5, 95.9, 98.1, and 99.3% overall accuracy for human, animal, plant, bacteria, virus, fungi, and archaea, respectively, on seven stringent benchmark datasets. Compared with other predictors (e.g., Gpos-PLoc, Gneg-PLoc, Virus-PLoc, Plant-PLoc, Plant-mPLoc, ProLoc-Go, Hum-PLoc, GOASVM) on the gram-positive, gram-negative, virus, plant, eukaryotic, and human datasets, the new MSLoc-DT predictor is much more effective and robust. Although the MSLoc-DT predictor is designed to predict the single location of proteins, our method can be extended to multiple locations of proteins by introducing multilabel machine learning approaches, such as the support vector machine and deep learning, as substitutes for the K-nearest neighbor (KNN) method. As a user-friendly web server, MSLoc-DT is freely accessible at http://bioinfo.ibp.ac.cn/MSLOC_DT/index.html.


Subject(s)
Artificial Intelligence , Computational Biology/methods , Proteins/analysis , Subcellular Fractions/chemistry , Amino Acid Sequence , Animals , Databases, Protein , Gene Ontology , Humans , Molecular Sequence Data