1.
2.
Sci Data ; 7(1): 178, 2020 06 16.
Article in English | MEDLINE | ID: mdl-32546682

ABSTRACT

A vast number of public RNA-sequencing datasets have been generated and widely used to study transcriptome mechanisms. These data offer a valuable opportunity for advancing transcriptome research, such as studies of alternative splicing. We report the first large-scale integrated analysis of RNA-Seq data of splicing factors for systematically identifying key factors in diseases and biological processes. We analyzed 1,321 RNA-Seq libraries of various mouse tissues and cell lines, comprising more than 6.6 TB of sequences from 75 independent studies that experimentally manipulated 56 splicing factors. Using these data, we computed RNA splicing signatures and gene expression signatures, and signature comparison analysis identified a list of key splicing factors in Rett syndrome and cold-induced thermogenesis. Using neuronal morphology analysis, we show that cold-induced RNA-binding proteins rescue the neurite outgrowth defects in Rett syndrome, and using metabolic flux analysis, we reveal that SRSF1 and PTBP1 are required for energy expenditure in adipocytes. Our study provides an integrated analysis for identifying key factors in diseases and biological processes and highlights the importance of public data resources for generating hypotheses for experimental testing.


Subjects
RNA Splicing Factors , RNA-Seq , Adipocytes/metabolism , Alternative Splicing , Animals , Cell Line , Cold Temperature , Datasets as Topic , Heterogeneous-Nuclear Ribonucleoproteins/genetics , Mice , Polypyrimidine Tract-Binding Protein/genetics , Rett Syndrome/genetics , Serine-Arginine Splicing Factors/genetics , Thermogenesis/genetics , Transcriptome
3.
IEEE/ACM Trans Comput Biol Bioinform ; 17(5): 1483-1492, 2020.
Article in English | MEDLINE | ID: mdl-31562101

ABSTRACT

RNA-protein binding is involved in many different biological processes. As technology progresses, more and more data are becoming available for research. Based on these data, many methods have been proposed to predict RNA-protein binding preferences. Some of these methods use only RNA sequence features for prediction, and others use multiple features. However, the performance of these methods is not satisfactory. In this study, we propose an improved capsule network to predict RNA-protein binding preferences, which can use both RNA sequence features and structure features. Experimental results show that our proposed method, iCapsule, performs better than three baseline methods in this field. Because the model uses both RNA sequence features and structure features, we tested the effect of changes to the primary capsule layer on model performance. In addition, we studied the impact of model structure on model performance by running our proposed method with different numbers of convolution layers and different kernel sizes.


Subjects
Computational Biology/methods , Neural Networks, Computer , RNA-Binding Proteins , RNA , Algorithms , RNA/chemistry , RNA/metabolism , RNA-Binding Proteins/chemistry , RNA-Binding Proteins/metabolism
4.
IEEE/ACM Trans Comput Biol Bioinform ; 17(5): 1741-1750, 2020.
Article in English | MEDLINE | ID: mdl-30990191

ABSTRACT

RNA-protein binding plays important roles in gene expression. With the development of high-throughput sequencing, several conventional methods and deep learning-based methods have been proposed to predict RNA-protein binding preferences. These methods can hardly account for both the dependencies between subsequences and the varying motif lengths of different translation factors (TFs). To overcome these limitations, we propose a predictive model that combines multi-scale convolutional layers with a bidirectional gated recurrent unit (GRU) layer. The multi-scale convolution layer captures motif features of different lengths, and the bidirectional GRU layer captures the dependencies among subsequences. Experimental results show that the proposed method performs better than four state-of-the-art methods in this field. In addition, we investigate the effect of model structure on model performance by running our proposed method with different convolution layers and different kernel sizes. We also demonstrate the effectiveness of the bidirectional GRU in improving model performance through comparative experiments.


Subjects
Computational Biology/methods , Neural Networks, Computer , RNA-Binding Proteins , RNA , Algorithms , Binding Sites/genetics , Protein Binding , RNA/chemistry , RNA/genetics , RNA/metabolism , RNA-Binding Proteins/chemistry , RNA-Binding Proteins/genetics , RNA-Binding Proteins/metabolism
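Note: the multi-scale convolution idea in the abstract above can be illustrated with a minimal forward-pass sketch in plain Python. It is not the authors' model: the filters here are hand-built from hypothetical motif strings instead of being learned, the helper names (`one_hot`, `conv1d_max`, `multi_scale_features`) are made up for illustration, and the bidirectional GRU stage is omitted entirely. It only shows how filters of different widths, each followed by global max pooling, respond to motifs of different lengths in one RNA sequence.

```python
from typing import Dict, List, Sequence

ALPHABET = "ACGU"


def one_hot(seq: str) -> List[List[float]]:
    """Encode an RNA string as one 4-dimensional one-hot row per base."""
    return [[1.0 if base == letter else 0.0 for letter in ALPHABET] for base in seq]


def conv1d_max(x: Sequence[Sequence[float]], kernel: Sequence[Sequence[float]]) -> float:
    """Slide the kernel over x (valid positions only) and return the maximum response."""
    width = len(kernel)
    if len(x) < width:
        return 0.0
    best = float("-inf")
    for start in range(len(x) - width + 1):
        score = sum(
            x[start + i][c] * kernel[i][c]
            for i in range(width)
            for c in range(len(ALPHABET))
        )
        best = max(best, score)
    return best


def multi_scale_features(seq: str, motifs: Sequence[str]) -> Dict[str, float]:
    """Apply one filter per motif; filter widths differ, which is the 'multi-scale' part."""
    x = one_hot(seq)
    # Hypothetical filters: the one-hot encoding of each motif, so a full match
    # scores exactly len(motif). A trained network would learn these weights.
    return {motif: conv1d_max(x, one_hot(motif)) for motif in motifs}


features = multi_scale_features("GGAUGCAUGUUUU", ["UGCAUG", "UUUU", "GAC"])
for motif, score in features.items():
    print(motif, score)
```

In a full model, the pooled responses (or the unpooled feature maps) would feed a recurrent layer that models dependencies between positions; that step needs a deep learning library and is left out of this sketch.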
5.
Article in English | MEDLINE | ID: mdl-30296239

ABSTRACT

Liver cancer is one of the deadliest cancers in the world. To find effective therapies for this cancer, it is essential to identify key genes that may play critical roles in its incidence. To identify key genes of liver cancer with high accuracy, we integrated multiple microarray gene expression data sets to compute common differentially expressed genes, which are more accurate than those obtained from any individual data set. To find the main functions or pathways in which these genes are involved, we performed enrichment analyses, including functional enrichment analysis, pathway enrichment analysis, and a disease association study. Based on these genes, a protein-protein interaction network was constructed and analyzed to identify key genes of liver cancer by combining the local and global influence of nodes in the network. The identified key genes, such as TOP2A, ESR1, and KMO, have been demonstrated to be key biomarkers of liver cancer in many publications. All the results suggest that our method can effectively identify key genes of liver cancer. Moreover, our method can be applied to other types of data sets to select key genes of other complex diseases.
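Note: the abstract does not say which measures stand for the "local" and "global" influence of a node. The sketch below assumes degree centrality for local influence and closeness centrality for global influence, combined with equal weights. The measures, the weights, the function names, and the toy network with placeholder gene names `G1` to `G5` are all assumptions for illustration, not the method or data of the paper.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple


def build_graph(edges: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map from an edge list."""
    graph: Dict[str, Set[str]] = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degree_centrality(graph: Dict[str, Set[str]]) -> Dict[str, float]:
    """Local influence (assumed): fraction of the other nodes that are direct neighbours."""
    n = len(graph)
    return {node: len(nbrs) / (n - 1) for node, nbrs in graph.items()}


def closeness_centrality(graph: Dict[str, Set[str]]) -> Dict[str, float]:
    """Global influence (assumed): inverse mean shortest-path distance, via BFS."""
    scores: Dict[str, float] = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in graph[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores


def rank_nodes(graph: Dict[str, Set[str]], weight: float = 0.5) -> List[Tuple[str, float]]:
    """Rank nodes by a weighted sum of local and global scores (weight is an assumption)."""
    local = degree_centrality(graph)
    global_ = closeness_centrality(graph)
    combined = {v: weight * local[v] + (1 - weight) * global_[v] for v in graph}
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))


# Toy connected network with placeholder gene names (not real interaction data).
toy_graph = build_graph([("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G4", "G5")])
ranking = rank_nodes(toy_graph)
for gene, score in ranking:
    print(f"{gene}\t{score:.3f}")
```

The closeness computation assumes a connected network; on a real protein-protein interaction network one would typically restrict the analysis to the largest connected component first.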

6.
Database (Oxford) ; 2018, 2018 01 01.
Article in English | MEDLINE | ID: mdl-29931156

ABSTRACT

RNA-binding proteins (RBPs) may play a critical role in gene regulation in various diseases or biological processes by binding RNA molecules and thereby controlling post-transcriptional events such as polyadenylation, splicing and mRNA stabilization. Owing to the importance of RBPs in gene regulation, a great number of studies have been conducted, resulting in a large number of RNA-Seq datasets. However, these datasets usually lack structured metadata, which limits their potentially wide use. To bridge this gap, the metadata of a comprehensive set of publicly available mouse RNA-Seq datasets with perturbed RBPs were collected and integrated into a database called RBPMetaDB. This database contains 292 mouse RNA-Seq datasets for a comprehensive list of 187 RBPs. These RBPs account for only ∼10% of all known RBPs annotated in Gene Ontology, indicating that most are still unexplored using high-throughput sequencing. This negative information provides a large pool of candidate RBPs for biologists to investigate in future experimental studies. In addition, we found that DNA-binding activities are significantly enriched among RBPs in RBPMetaDB, suggesting that prior studies of these DNA- and RNA-binding factors focused more on DNA-binding activities than on RNA-binding activities. This result reveals an opportunity to efficiently reuse these data to investigate the roles of their RNA-binding activities. A web application has also been implemented to enable easy access and wide use of RBPMetaDB. It is expected that RBPMetaDB will be a valuable resource for improving understanding of the biological roles of RBPs. Database URL: http://rbpmetadb.yubiolab.org.


Subjects
Databases, Genetic , Molecular Sequence Annotation , RNA-Binding Proteins/metabolism , Sequence Analysis, RNA , Animals , Internet , Mice , Protein Domains , PubMed , Publications , Statistics as Topic , User-Computer Interface
7.
Database (Oxford) ; 2018, 2018 01 01.
Article in English | MEDLINE | ID: mdl-29688375

ABSTRACT

Abstract: Cold-induced thermogenesis increases energy expenditure and can reduce body weight in mammals, so the genes involved in it are thought to be potential therapeutic targets for treating obesity and diabetes. In the quest for more effective therapies, a great deal of research has been conducted to elucidate the regulatory mechanism of cold-induced thermogenesis. Over the last decade, a large number of genes that can enhance or suppress cold-induced thermogenesis have been discovered, but a comprehensive list of these genes is lacking. To fill this gap, we examined all of the annotated human and mouse genes and curated those demonstrated to enhance or suppress cold-induced thermogenesis by in vivo or ex vivo experiments in mice. The results of this highly accurate and comprehensive annotation are hosted on a database called CITGeneDB, which includes a searchable web interface to facilitate broad public use. The database will be updated as new genes are found to enhance or suppress cold-induced thermogenesis. It is expected that CITGeneDB will be a valuable resource in future explorations of the molecular mechanism of cold-induced thermogenesis, helping pave the way for new obesity and diabetes treatments. Database URL: http://citgenedb.yubiolab.org.


Subjects
Cold Temperature , Databases, Nucleic Acid , Gene Expression Regulation/physiology , Molecular Sequence Annotation , Thermogenesis/physiology , Animals , Humans , Mice
8.
Article in English | MEDLINE | ID: mdl-28368812

ABSTRACT

In recent years, a remarkable amount of protein-protein interaction (PPI) data has become available owing to advances in experimental high-throughput technologies. However, experimentally detected PPI data usually contain a large number of spurious links, which can contaminate the analysis of the biological significance of protein links and lead to incorrect biological discoveries, thereby posing new challenges to both computational and biological scientists. In this paper, we develop a new embedding algorithm called local similarity preserving embedding (LSPE) to rank the interaction possibility of protein links. By going beyond the limitations of current geometric embedding methods for network denoising and emphasizing the local information of PPI networks, LSPE avoids the instability of previous methods. We present experimental results on benchmark PPI networks and show that LSPE was the overall leader, outperforming state-of-the-art methods in topological false-link elimination problems.


Subjects
Computational Biology/methods , Protein Interaction Mapping/methods , Protein Interaction Maps/physiology , Algorithms , Computer Simulation , Databases, Protein , Saccharomyces cerevisiae Proteins
9.
IEEE/ACM Trans Comput Biol Bioinform ; 14(5): 1147-1153, 2017.
Article in English | MEDLINE | ID: mdl-28113675

ABSTRACT

In this study, to take advantage of complementary information from different types of data for better diagnosis of disease status, we combined gene expression data with DNA methylation data and generated a fused network, based on which the stages of Kidney Renal Cell Carcinoma (KIRC) can be better identified. It is well recognized that a network is important for investigating the connectivity of disease groups. We exploited the potential of the network's features to identify the KIRC stage. We first constructed a patient network from each type of data. We then built a fused network using a network fusion method. Based on the link weights of patients, we used a generalized linear model to predict the group of KIRC subjects. Finally, the group prediction method was applied to test the power of network-based features. The performance (e.g., the accuracy of identifying cancer stages) when using the fused network from two types of data is shown to be superior to that when using either patient network built from a single data type. The work provides a good example of using network-based features from multiple data types for a more comprehensive diagnosis.


Subjects
Biomarkers, Tumor/genetics , Carcinoma, Renal Cell , Computational Biology/methods , Gene Expression Profiling/methods , Kidney Neoplasms , Biomarkers, Tumor/analysis , Carcinoma, Renal Cell/classification , Carcinoma, Renal Cell/diagnosis , Carcinoma, Renal Cell/genetics , DNA Methylation/genetics , Databases, Genetic , Humans , Kidney Neoplasms/classification , Kidney Neoplasms/diagnosis , Kidney Neoplasms/genetics
10.
Article in English | MEDLINE | ID: mdl-26886732

ABSTRACT

In recent years, thanks to the efforts of individual scientists and research consortiums, a huge amount of chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) experimental data has been accumulated. Several recent studies have convincingly demonstrated that a wealth of scientific insights can be gained by analyzing these ChIP-seq data integratively instead of independently. However, for the purpose of integrative analysis, a serious drawback of the current ChIP-seq technique is that it is still expensive and time-consuming to generate high-quality ChIP-seq datasets. Most researchers are therefore unable to obtain complete ChIP-seq data for several TFs in a wide variety of cell lines, which considerably limits the understanding of transcriptional regulation patterns. In this paper, we propose a novel method called ChIP-PIT to overcome this limitation. In ChIP-PIT, ChIP-seq data corresponding to a diverse collection of cell types, TFs and genes are fused together using the three-mode pair-wise interaction tensor (PIT) model, and the prediction of unperformed ChIP-seq experimental results is formulated as a tensor completion problem. Computationally, we propose an efficient first-order method based on extensions of the coordinate descent method to learn the optimal solution of ChIP-PIT, which makes it particularly suitable for the analysis of massive-scale ChIP-seq data. Experimental evaluation on the ENCODE data illustrates the usefulness of the proposed model.


Subjects
Algorithms , Chromatin Immunoprecipitation/methods , Computational Biology/methods , Machine Learning , Oligonucleotide Array Sequence Analysis/methods , Cell Line , Databases, Genetic , Humans , Models, Theoretical , Reproducibility of Results , Transcription Factors/genetics
11.
Article in English | MEDLINE | ID: mdl-26415208

ABSTRACT

Cervical cancer is the third most common malignancy in women worldwide. It remains a leading cause of cancer-related death for women in developing countries. To contribute to the treatment of cervical cancer, this work seeks a few key genes involved in the disease. Using several bioinformatics tools, we selected 143 differentially expressed genes (DEGs) associated with cervical cancer. The results of bioinformatics analysis show that these DEGs play important roles in the development of cervical cancer. By comparing two differential co-expression networks (DCNs) at two different states, we found a common sub-network and two differential sub-networks, as well as some hub genes in the three sub-networks. Moreover, some of the hub genes have been reported to be related to cervical cancer. Those hub genes were analyzed from three aspects: Gene Ontology function enrichment, pathway enrichment and protein binding. The results can help us understand the development of cervical cancer and guide further experiments on it.


Subjects
Gene Expression Regulation, Neoplastic/genetics , Gene Regulatory Networks/genetics , Uterine Cervical Neoplasms/genetics , Computational Biology , Female , Gene Expression Profiling , Humans
12.
Int J Data Min Bioinform ; 13(1): 63-74, 2015.
Article in English | MEDLINE | ID: mdl-26529908

ABSTRACT

To analyse the similarity among microbial communities at the functional level, we first assign the 16S rRNA sequences from all microbial communities to species. This analysis is an important addition to the species-level relationship between two compared communities and can quantify their differences in function. We downloaded all functional annotation data of several microbiotas. The method is developed to identify the functional distribution and the significantly enriched functional categories of microbial communities. We analysed the similarity between two microbial communities at the functional level. The experimental results show that semantic similarity can quantify the difference between two compared species at the functional level. The method can analyse the function of microbial communities through gene ontology based on the 16S rRNA gene. Exploration of the functional relationship between two sets of species assemblages will be a key result of microbiome studies and may provide new insights into the assembly of a wide range of ecosystems.


Subjects
DNA, Ribosomal/genetics , Gene Ontology , Microbial Consortia/genetics , RNA, Ribosomal, 16S/genetics , Sequence Analysis, RNA/methods
13.
IEEE Trans Nanobioscience ; 14(5): 528-34, 2015 Jul.
Article in English | MEDLINE | ID: mdl-25861086

ABSTRACT

Protein-protein interactions (PPIs) play essential roles in determining the outcomes of most cellular functions. Although experimentally detected high-throughput PPI data promise new opportunities for the study of many biological mechanisms, including cellular metabolism and protein functions, experimentally detected PPIs have high false-positive rates. Therefore, it is of high practical value to develop novel computational tools for pruning low-confidence PPIs. In this paper, we propose a new geometric approach called Leave-One-Out Logistic Metric Embedding (LOO-LME) for assessing the reliability of interactions. Unlike previous approaches, which mainly seek to preserve the noisy topological information of the PPI networks in the embedding space, LOO-LME first transforms the learning task into an equivalent discriminant form, then directly deals with the uncertainty in PPI networks using a leave-one-out-style approach. The experimental results show that LOO-LME substantially outperforms previous methods on PPI assessment problems. LOO-LME could thus facilitate further graph-based studies of PPIs and may help infer their hidden underlying biological knowledge.


Subjects
Computational Biology/methods , Protein Interaction Mapping/methods , Protein Interaction Maps , Algorithms , Databases, Protein , Molecular Sequence Annotation
14.
BMC Genomics ; 16 Suppl 3: S4, 2015.
Article in English | MEDLINE | ID: mdl-25707808

ABSTRACT

BACKGROUND: Bladder cancer is the most common malignant tumor of the urinary system, and it is a heterogeneous disease with both superficial and invasive growth. However, its aetiological agent is still unclear, and it is essential to find the key genes or modules that cause bladder cancer. Constructing differential co-expression networks (DCNs) from gene expression microarray datasets is an important method for investigating diseases, and good tools are available for it, such as the R packages 'WGCNA' and 'DCGL'. RESULTS: Employing an integrated strategy, 36 up-regulated differentially expressed genes (DEGs) and 356 down-regulated DEGs were selected. The main functions of those DEGs are cellular physiological process (24 up-regulated DEGs; 167 down-regulated DEGs) and cellular metabolism (19 up-regulated DEGs; 104 down-regulated DEGs). The up-regulated DEGs are mainly involved in pathways related to "metabolism". By comparing two DCNs between the normal and cancer states, we found large changes in hub genes and topological structure, which suggest that the modules of the two DCNs differ substantially. In particular, we screened some hub genes of a differential subnetwork between the normal and cancer states and then performed bioinformatics analysis on them. CONCLUSIONS: Through constructing and analyzing two differential co-expression networks at different states using the screened DEGs, we found some hub genes associated with bladder cancer. The results of the bioinformatics analysis for those hub genes will support biological experiments and the further treatment of bladder cancer.


Subjects
Gene Regulatory Networks , Transcriptome , Urinary Bladder Neoplasms/genetics , Databases, Genetic , Down-Regulation , Humans , Metabolic Networks and Pathways , Up-Regulation
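Note: comparing co-expression networks between two states, as described in the abstract above, can be sketched as follows. This is a simplified illustration and not the WGCNA or DCGL procedure: it assumes an edge exists whenever the absolute Pearson correlation between two genes reaches a fixed threshold (0.8 here, an arbitrary choice), and the expression values and gene names are invented toy data.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, List, Sequence, Set, Tuple

Edge = Tuple[str, str]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def coexpression_edges(expr: Dict[str, List[float]], threshold: float = 0.8) -> Set[Edge]:
    """Link two genes when |r| >= threshold (the threshold is an assumption)."""
    edges: Set[Edge] = set()
    for g1, g2 in combinations(sorted(expr), 2):
        if abs(pearson(expr[g1], expr[g2])) >= threshold:
            edges.add((g1, g2))
    return edges


# Invented toy expression profiles: one value per sample, per state.
normal = {"geneA": [1, 2, 3, 4], "geneB": [2, 4, 6, 8], "geneC": [4, 1, 3, 2]}
cancer = {"geneA": [1, 2, 3, 4], "geneB": [8, 1, 5, 2], "geneC": [2, 4, 6, 9]}

normal_edges = coexpression_edges(normal)
cancer_edges = coexpression_edges(cancer)

shared = normal_edges & cancer_edges   # common sub-network
lost = normal_edges - cancer_edges     # present only in the normal state
gained = cancer_edges - normal_edges   # present only in the cancer state
print("shared:", shared, "lost:", lost, "gained:", gained)
```

Genes that appear in many `lost` or `gained` edges are natural hub candidates for follow-up; real analyses usually add significance testing and multiple-testing correction instead of relying on a hard cutoff.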
15.
BMC Bioinformatics ; 15 Suppl 15: S9, 2014.
Article in English | MEDLINE | ID: mdl-25474679

ABSTRACT

BACKGROUND: Identifying protein-protein interactions (PPIs) is essential for elucidating protein functions and understanding the molecular mechanisms inside the cell. However, the experimental methods for detecting PPIs are both time-consuming and expensive. Therefore, computational prediction of protein interactions is becoming increasingly popular: it provides an inexpensive way of predicting the most likely set of interactions at the entire proteome scale and can be used to complement experimental approaches. Although much progress has already been achieved in this direction, the problem is still far from being solved, and new approaches are still required to overcome the limitations of the current prediction models. RESULTS: In this work, a sequence-based approach is developed by combining a novel Multi-scale Continuous and Discontinuous (MCD) feature representation with a Support Vector Machine (SVM). The MCD representation gives adequate consideration to the interactions between sequentially distant but spatially close amino acid residues, and can therefore capture multiple overlapping continuous and discontinuous binding patterns within a protein sequence. An effective feature selection method, mRMR, was employed to construct an optimized and more discriminative feature set by excluding redundant features. Finally, a prediction model is trained and tested with the SVM algorithm to predict the interaction probability of protein pairs. CONCLUSIONS: On the yeast PPI data set, the proposed approach achieved 91.36% prediction accuracy with 91.94% precision at a sensitivity of 90.67%. Extensive experiments were conducted to compare our method with existing sequence-based methods. Experimental results show that the performance of our predictor is better than that of several other state-of-the-art predictors, whose average prediction accuracy is 84.91%, sensitivity is 83.24%, and precision is 86.12%. The results show that the proposed approach is very promising for predicting PPIs, so it can be a useful supplementary tool for future proteomics studies. The source code and the datasets are freely available at http://csse.szu.edu.cn/staff/youzh/MCDPPI.zip for academic use.


Subjects
Protein Interaction Mapping/methods , Sequence Analysis, Protein/methods , Saccharomyces cerevisiae Proteins/metabolism , Support Vector Machine
16.
Methods ; 69(3): 207-12, 2014 Oct 01.
Article in English | MEDLINE | ID: mdl-25132640

ABSTRACT

The R package SFAPS has been developed for structure/function analysis of protein sequences based on the informational spectrum method. The informational spectrum method employs the electron-ion interaction potential parameter as the numerical representation of the protein sequence, and obtains the characteristic frequency of a particular protein interaction by computing the Discrete Fourier Transform of protein sequences. Because the informational spectrum method is often used to analyze protein sequences, we developed this software tool, which is implemented as an add-on package for the freely available and widely used statistical language R. Our package is distributed as open source code for Linux, Unix and Microsoft Windows. It is released under the GNU General Public License. The R package, along with its source code and additional material, is freely available at http://mlsbl.tongji.edu.cn/DBdownload.asp.


Subjects
Amino Acid Sequence/genetics , Computational Biology/methods , Software , Sequence Analysis, Protein
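Note: the two steps named in the abstract above (numerical encoding by electron-ion interaction potential, then a Discrete Fourier Transform) can be sketched in a few lines. This is an independent illustration in Python, not the SFAPS package or its API. The EIIP values below are the ones commonly tabulated in the informational spectrum literature and should be checked against a primary source before any real use; the function names and the toy sequence are made up for this example.

```python
import cmath
from typing import Dict, List, Tuple

# EIIP values per amino acid (in Rydberg), as commonly tabulated; an assumption
# of this sketch that should be verified against the literature.
EIIP: Dict[str, float] = {
    "L": 0.0000, "I": 0.0000, "N": 0.0036, "G": 0.0050, "V": 0.0057,
    "E": 0.0058, "P": 0.0198, "H": 0.0242, "K": 0.0371, "A": 0.0373,
    "Y": 0.0516, "W": 0.0548, "Q": 0.0761, "M": 0.0823, "S": 0.0829,
    "C": 0.0829, "T": 0.0941, "F": 0.0946, "R": 0.0959, "D": 0.1263,
}


def to_signal(sequence: str) -> List[float]:
    """Replace each residue by its EIIP value."""
    return [EIIP[residue] for residue in sequence.upper()]


def power_spectrum(signal: List[float]) -> List[Tuple[float, float]]:
    """Return (frequency, |X(k)|^2) for k = 1..N/2, skipping the DC term k = 0."""
    n = len(signal)
    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * m / n) for m, x in enumerate(signal))
        spectrum.append((k / n, abs(coeff) ** 2))
    return spectrum


def peak_frequency(sequence: str) -> Tuple[float, float]:
    """Frequency (in cycles per residue, 0 to 0.5) with the largest spectral power."""
    return max(power_spectrum(to_signal(sequence)), key=lambda point: point[1])


# Toy sequence with period 2, so its power concentrates at frequency 0.5.
toy_frequency, toy_power = peak_frequency("DL" * 8)
print(toy_frequency, toy_power)
```

To look for a frequency shared by several interacting proteins, the usual approach is to multiply their spectra point by point (a cross-spectrum) after bringing them to a common length; that step is not shown here.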
17.
Yao Xue Xue Bao ; 46(6): 613-21, 2011 Jun.
Article in Chinese | MEDLINE | ID: mdl-21882519

ABSTRACT

In recent years, antibiotic resistance in bacteria has become a global health crisis. In particular, a new class of "superbug" found in South Asia, which is resistant to almost all known antibiotics, has caused worldwide alarm. Among the underlying mechanisms of bacterial pathogenicity, the expression of many pathogen virulence factors is regulated by quorum sensing. Screening for efficient quorum sensing inhibitors is therefore an especially compelling approach to the future treatment of bacterial infections and antibiotic resistance. This article reviews bacterial quorum sensing systems, in vitro screening models for quorum sensing, in vivo evaluation in animal models, and recent research on quorum sensing inhibitors.


Subjects
Anti-Bacterial Agents/pharmacology , Bacterial Infections , Drug Resistance, Bacterial , Pseudomonas aeruginosa/physiology , Quorum Sensing/drug effects , Animals , Anti-Bacterial Agents/therapeutic use , Bacterial Infections/drug therapy , Disease Models, Animal , Drugs, Chinese Herbal/pharmacology , Humans , Pseudomonas aeruginosa/drug effects , Pseudomonas aeruginosa/pathogenicity , Quorum Sensing/physiology , Virulence/drug effects , Virulence Factors/metabolism