Results 1 - 11 of 11
1.
Bioinformatics ; 36(8): 2628-2629, 2020 Apr 15.
Article in English | MEDLINE | ID: mdl-31882993

ABSTRACT

MOTIVATION: Gene lists are routinely produced from various omic studies. Enrichment analysis can link these gene lists with underlying molecular pathways and functional categories such as gene ontology (GO) and other databases. RESULTS: To complement existing tools, we developed ShinyGO based on a large annotation database derived from Ensembl and STRING-db for 59 plant, 256 animal, 115 archaeal and 1678 bacterial species. ShinyGO's novel features include graphical visualization of enrichment results and gene characteristics, and application program interface access to KEGG and STRING for the retrieval of pathway diagrams and protein-protein interaction networks. ShinyGO is an intuitive, graphical web application that can help researchers gain actionable insights from gene-sets. AVAILABILITY AND IMPLEMENTATION: http://ge-lab.org/go/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Genetic Databases , Software , Animals , Computational Biology , Factual Databases , Gene Ontology , Internet , Probability
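
The enrichment analysis described in this abstract can be illustrated with a small sketch. The abstract does not say which statistical test ShinyGO applies, so the hypergeometric (one-sided over-representation) test below is an assumption, chosen because it is a common way to score the overlap between a gene list and a pathway. The function name and the example numbers are hypothetical.

```python
from math import comb


def enrichment_p_value(total_genes, pathway_size, list_size, overlap):
    """Probability of seeing at least `overlap` pathway genes in a random
    gene list of `list_size` drawn from `total_genes` background genes,
    of which `pathway_size` belong to the pathway (hypergeometric tail)."""
    upper = min(pathway_size, list_size)
    favorable = sum(
        comb(pathway_size, k) * comb(total_genes - pathway_size, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return favorable / comb(total_genes, list_size)


# Hypothetical numbers: 20,000 background genes, a 100-gene pathway,
# a 200-gene query list, 10 of which fall in the pathway.
p = enrichment_p_value(20000, 100, 200, 10)
print(f"enrichment p-value: {p:.3g}")
```

In practice one such p-value is computed per pathway and the results are then corrected for multiple testing; that step is omitted here to keep the sketch short.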
2.
BMC Bioinformatics ; 19(1): 534, 2018 Dec 19.
Article in English | MEDLINE | ID: mdl-30567491

ABSTRACT

BACKGROUND: RNA-seq is widely used for transcriptomic profiling, but the bioinformatics analysis of resultant data can be time-consuming and challenging, especially for biologists. We aim to streamline the bioinformatic analyses of gene-level data by developing a user-friendly, interactive web application for exploratory data analysis, differential expression, and pathway analysis. RESULTS: iDEP (integrated Differential Expression and Pathway analysis) seamlessly connects 63 R/Bioconductor packages, 2 web services, and comprehensive annotation and pathway databases for 220 plant and animal species. The workflow can be reproduced by downloading customized R code and related pathway files. As an example, we analyzed an RNA-Seq dataset of lung fibroblasts with Hoxa1 knockdown and revealed the possible roles of SP1 and E2F1 and their target genes, including microRNAs, in blocking G1/S transition. In another example, our analysis shows that in mouse B cells without functional p53, ionizing radiation activates the MYC pathway and its downstream genes involved in cell proliferation, ribosome biogenesis, and non-coding RNA metabolism. In wildtype B cells, radiation induces p53-mediated apoptosis and DNA repair while suppressing the target genes of MYC and E2F1, and leads to growth and cell cycle arrest. iDEP helps unveil the multifaceted functions of p53 and the possible involvement of several microRNAs such as miR-92a, miR-504, and miR-30a. In both examples, we validated known molecular pathways and generated novel, testable hypotheses. CONCLUSIONS: Combining comprehensive analytic functionalities with massive annotation databases, iDEP ( http://ge-lab.org/idep/ ) enables biologists to easily translate transcriptomic and proteomic data into actionable insights.


Subjects
Computational Biology/methods , Gene Expression Profiling , Gene Expression Regulation , High-Throughput Nucleotide Sequencing/methods , RNA Sequence Analysis/methods , Software , Animals , B-Lymphocytes/cytology , B-Lymphocytes/metabolism , Cell Proliferation , Cultured Cells , Fibroblasts/cytology , Fibroblasts/metabolism , Homeodomain Proteins/antagonists & inhibitors , Humans , Lung/cytology , Lung/metabolism , Mice , Small Interfering RNA/genetics , Transcription Factors/antagonists & inhibitors , Transcriptome , Tumor Suppressor Protein p53/genetics , Tumor Suppressor Protein p53/metabolism
3.
BMC Genomics ; 18(1): 200, 2017 Feb 23.
Article in English | MEDLINE | ID: mdl-28231763

ABSTRACT

BACKGROUND: Instead of testing predefined hypotheses, the goal of exploratory data analysis (EDA) is to find what data can tell us. Following this strategy, we re-analyzed a large body of genomic data to study the complex gene regulation in mouse pre-implantation development (PD). RESULTS: Starting with a single-cell RNA-seq dataset consisting of 259 mouse embryonic cells derived from zygote to blastocyst stages, we reconstructed the temporal and spatial gene expression pattern during PD. The dynamics of gene expression can be partially explained by the enrichment of transposable elements in gene promoters and the similarity of expression profiles with those of corresponding transposons. Long Terminal Repeats (LTRs) are associated with transient, strong induction of many nearby genes at the 2-4 cell stages, probably by providing binding sites for Obox and other homeobox factors. B1 and B2 SINEs (Short Interspersed Nuclear Elements) are correlated with the upregulation of thousands of nearby genes during zygotic genome activation. Such enhancer-like effects are also found for human Alu and bovine tRNA SINEs. SINEs also seem to be predictive of gene expression in embryonic stem cells (ESCs), raising the possibility that they may also be involved in regulating pluripotency. We also identified many potential transcription factors underlying PD and discussed the evolutionary necessity of transposons in enhancing genetic diversity, especially for species with longer generation time. CONCLUSIONS: Together with other recent studies, our results provide further evidence that many transposable elements may play a role in establishing the expression landscape in early embryos. It also demonstrates that exploratory bioinformatics investigation can pinpoint developmental pathways for further study, and serve as a strategy to generate novel insights from big genomic data.


Subjects
Computational Biology , Intergenic DNA , Embryonic Development/genetics , Developmental Gene Expression Regulation , Animals , Base Sequence , Cluster Analysis , Computational Biology/methods , CpG Islands , DNA Methylation , DNA Transposable Elements , Embryonic Stem Cells/metabolism , Gene Expression Profiling , Genome , Genomics/methods , Mice , Nucleotide Motifs , Organ Specificity/genetics , Genetic Promoter Regions , Retroelements , Transcriptome , Zygote/metabolism
4.
PLoS One ; 9(11): e108567, 2014.
Article in English | MEDLINE | ID: mdl-25398003

ABSTRACT

The model plant Arabidopsis has been well-studied using high-throughput genomics technologies, which usually generate lists of differentially expressed genes under various conditions. Our group recently collected 1065 gene lists from 397 gene expression studies as a knowledgebase for pathway analysis. Here we systematically analyzed these gene lists by computing overlaps in all-vs.-all comparisons. We identified 16,261 statistically significant overlaps, represented by an undirected network in which nodes correspond to gene lists and edges indicate significant overlaps. The network highlights the correlation across the gene expression signatures of the diverse biological processes. We also partitioned the main network into 20 sub-networks, representing groups of highly similar expression signatures. These are common sets of genes that were co-regulated under different treatments or conditions and are often related to specific biological themes. Overall, our result suggests that diverse gene expression signatures are highly interconnected in a modular fashion.


Subjects
Arabidopsis/genetics , Biological Phenomena/genetics , Gene Expression Profiling , Plant Gene Expression Regulation , Gene Regulatory Networks , Plant Genes
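
The all-vs.-all comparison described in this abstract (nodes are gene lists, edges are significant overlaps) can be sketched as follows. The abstract does not specify the significance test or threshold, so the hypergeometric tail test, the `alpha` cutoff, and the toy gene lists below are all assumptions made for illustration; multiple-testing correction is left out.

```python
from itertools import combinations
from math import comb


def overlap_p_value(universe_size, size_a, size_b, shared):
    """Hypergeometric probability of at least `shared` common genes between
    two random lists of sizes `size_a` and `size_b` from the same universe."""
    upper = min(size_a, size_b)
    favorable = sum(
        comb(size_a, k) * comb(universe_size - size_a, size_b - k)
        for k in range(shared, upper + 1)
    )
    return favorable / comb(universe_size, size_b)


def overlap_network(gene_lists, universe_size, alpha=0.05):
    """Return edges (name_a, name_b, shared_count, p_value) for every pair
    of gene lists whose overlap is non-empty and has p_value < alpha."""
    edges = []
    for (name_a, a), (name_b, b) in combinations(gene_lists.items(), 2):
        shared = len(a & b)
        if shared == 0:
            continue
        p = overlap_p_value(universe_size, len(a), len(b), shared)
        if p < alpha:
            edges.append((name_a, name_b, shared, p))
    return edges


# Toy data: gene identifiers are plain integers, list names are hypothetical.
toy_lists = {
    "cold": set(range(0, 10)),
    "drought": set(range(5, 15)),
    "heat": set(range(50, 60)),
}
for edge in overlap_network(toy_lists, universe_size=100):
    print(edge)
```

The resulting edge list can be passed to any graph library to partition the network into sub-networks, as the abstract describes.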
5.
BMC Bioinformatics ; 14 Suppl 14: S5, 2013.
Article in English | MEDLINE | ID: mdl-24266891

ABSTRACT

BACKGROUND: The dynamic, decentralized world-wide-web has become an essential part of scientific research and communication. Researchers create thousands of web sites every year to share software, data and services. These valuable resources tend to disappear over time. The problem has been documented in many subject areas. Our goal is to conduct a cross-disciplinary investigation of the problem and test the effectiveness of existing remedies. RESULTS: We accessed 14,489 unique web pages found in the abstracts within Thomson Reuters' Web of Science citation index that were published between 1996 and 2010 and found that the median lifespan of these web pages was 9.3 years with 62% of them being archived. Survival analysis and logistic regression were used to find significant predictors of URL lifespan. The availability of a web page is most dependent on the time it is published and the top-level domain names. Similar statistical analysis revealed biases in current solutions: the Internet Archive favors web pages with fewer layers in the Uniform Resource Locator (URL) while WebCite is significantly influenced by the source of publication. We also created a prototype for a process to submit web pages to the archives and increased coverage of our list of scientific webpages in the Internet Archive and WebCite by 22% and 255%, respectively. CONCLUSION: Our results show that link decay continues to be a problem across different disciplines and that current solutions for static web pages are helping and can be improved.


Subjects
Factual Databases , Internet , Publishing , Archives , Bibliographies as Topic , Humans , Software Design , Time Factors
6.
BMC Genomics ; 14: 243, 2013 Apr 12.
Article in English | MEDLINE | ID: mdl-23577827

ABSTRACT

BACKGROUND: Recent studies had found thousands of natural antisense transcripts originating from the same genomic loci of protein coding genes but from the opposite strand. It is unclear whether the majority of antisense transcripts are functional or merely transcriptional noise. RESULTS: Using the Affymetrix Exon array with a modified cDNA synthesis protocol that enables genome-wide detection of antisense transcription, we conducted large-scale expression analysis of antisense transcripts in nine corresponding tissues from human, mouse and rat. We detected thousands of antisense transcripts, some of which show tissue-specific expression that could be subjected to further study for their potential function in the corresponding tissues/organs. The expression patterns of many antisense transcripts are conserved across species, suggesting selective pressure on these transcripts. When compared to protein-coding genes, antisense transcripts show a lesser degree of expression conservation. We also found a positive correlation between the sense and antisense expression across tissues. CONCLUSION: Our results suggest that natural antisense transcripts are subjected to selective pressure but to a lesser degree compared to sense transcripts in mammals.


Subjects
Antisense RNA/genetics , Genetic Transcription , Animals , Complementary DNA/genetics , Exons/genetics , Gene Expression Profiling , Humans , Mice , Oligonucleotide Array Sequence Analysis/methods , Organ Specificity , Rats , Reverse Transcriptase Polymerase Chain Reaction
7.
Dataset Pap Biol ; 2013, 2013.
Article in English | MEDLINE | ID: mdl-23457664

ABSTRACT

Microarrays are a large-scale expression profiling method which has been used to study the transcriptome of plants under various environmental conditions. However, manual inspection of microarray data is difficult at the genome level because of the large number of genes (normally at least 30,000) and the many different processes that occur within any given plant. MapMan software, which was initially developed to visualize microarray data for Arabidopsis, has been adapted to other plant species by mapping other species onto MapMan ontology. This paper provides a detailed procedure and the relevant computing codes to generate a MapMan ontology mapping file for tobacco (Nicotiana tabacum L.) using potato and Arabidopsis as intermediates. The mapping file can be used directly with our custom made NimbleGen oligoarray, that contains gene sequences from both the tobacco gene space sequence and the tobacco gene index 4 (NTGI4) collection of ESTs. The generated data set will be informative for scientists working on tobacco as their model plant by providing a MapMan ontology mapping file to tobacco, homology between tobacco coding sequences and that of potato and Arabidopsis, as well as adapting our procedure and codes for other plant species where the complete genome is not yet available.

8.
Bioinformatics ; 28(17): 2291-2, 2012 Sep 01.
Article in English | MEDLINE | ID: mdl-22760305

ABSTRACT

UNLABELLED: Studying plants using high-throughput genomics technologies is becoming routine, but interpretation of genome-wide expression data in terms of biological pathways remains a challenge, partly due to the lack of pathway databases. To create a knowledgebase for plant pathway analysis, we collected 1683 lists of differentially expressed genes from 397 gene-expression studies, which constitute a molecular signature database of various genetic and environmental perturbations of Arabidopsis. In addition, we extracted 1909 gene sets from various sources such as Gene Ontology, KEGG, AraCyc, Plant Ontology, predicted target genes of microRNAs and transcription factors, and computational gene clusters defined by meta-analysis. With this knowledgebase, we applied Gene Set Enrichment Analysis to an expression profile of cold acclimation and identified expected functional categories and pathways. Our results suggest that the AraPath database can be used to generate specific, testable hypotheses regarding plant molecular pathways from gene expression data. AVAILABILITY: http://bioinformatics.sdstate.edu/arapath/.


Subjects
Arabidopsis/genetics , Genetic Databases , Knowledge Bases , Gene Expression , Gene Expression Profiling/methods , Plant Genome , Genomics/methods , Multigene Family
9.
BMC Genomics ; 13: 237, 2012 Jun 13.
Article in English | MEDLINE | ID: mdl-22694750

ABSTRACT

BACKGROUND: Many plant genes have been identified through whole genome and deep transcriptome sequencing and other methods; yet our knowledge on the function of many of these genes remains limited. The integration and analysis of large gene-expression datasets gives researchers the ability to formalize hypotheses concerning the functionality and interaction between different groups of correlated genes. RESULTS: We applied the non-negative matrix factorization (NMF) algorithm to the AtGenExpress dataset which consists of 783 microarray samples (29 separate experimental series) conducted on the model plant Arabidopsis thaliana. We identified 15 metagenes, which are groups of genes with correlated expression. Functional roles of these metagenes are established by observing the enriched gene ontology (GO) categories using gene set enrichment analyses (GSEA). Activity levels of these metagenes in various experimental conditions are also analyzed to associate metagenes with stimuli/conditions. A metagene correlation network, constructed based on the results of NMF analysis, revealed many new interactions between the metagenes. Comparison of these metagenes with an earlier large-scale clustering analysis indicates many statistically significant overlaps. CONCLUSIONS: This study identifies a network of correlated metagenes composed of Arabidopsis genes acting in a highly correlated fashion across a broad spectrum of experimental stimuli, which may shed some light on the function of many of the un-annotated genes.


Subjects
Arabidopsis Proteins/genetics , Arabidopsis/genetics , Plant Gene Expression Regulation , Plant Genes , Metagenome , Transcriptome , Algorithms , Gene Expression Profiling , Gene Regulatory Networks , Multigene Family , Oligonucleotide Array Sequence Analysis
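
The non-negative matrix factorization step named in this abstract can be illustrated with a small pure-Python sketch. The abstract does not say which NMF variant or software the authors used; the code below assumes the classic multiplicative-update rules of Lee and Seung for the squared-error objective, and the 4 x 4 toy matrix, function names, and parameters are hypothetical. Each column of W plays the role of a metagene (gene weights) and each row of H gives that metagene's activity across samples.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Squared reconstruction error ||V - WH||^2."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor a non-negative genes-by-samples matrix V into W (genes x rank)
    and H (rank x samples) using multiplicative updates."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


# Toy expression matrix: genes 0-1 are active in samples 0-1,
# genes 2-3 are active in samples 2-3.
toy = [[1, 1, 0, 0],
       [1, 1, 0, 0],
       [0, 0, 1, 1],
       [0, 0, 1, 1]]
w, h = nmf(toy, rank=2)
print("reconstruction error:", frobenius_error(toy, w, h))
```

Because the updates only multiply by non-negative ratios, W and H stay non-negative throughout, which is what makes the resulting metagenes interpretable as additive groups of co-expressed genes.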
10.
Comp Funct Genomics ; 2012: 650842, 2012.
Article in English | MEDLINE | ID: mdl-22474412

ABSTRACT

Recent advances in microarray technologies have resulted in a flood of genomics data. This large body of accumulated data could be used as a knowledge base to help researchers interpret new experimental data. ArraySearch finds statistical correlations between newly observed gene expression profiles and the huge source of well-characterized expression signatures deposited in the public domain. A search query of a list of genes will return experiments on which the genes are significantly up- or downregulated collectively. Searches can also be conducted using gene expression signatures from new experiments. This resource will empower biological researchers with a statistical method to explore expression data from their own research by comparing it with expression signatures from a large public archive.

11.
BMC Syst Biol ; 5: 87, 2011 May 29.
Article in English | MEDLINE | ID: mdl-21619705

ABSTRACT

BACKGROUND: Cells must respond to various perturbations using their limited available gene repertoires. In order to study how cells coordinate various responses, we conducted a comprehensive comparison of 1,186 gene expression signatures (gene lists) associated with various genetic and chemical perturbations. RESULTS: We identified 7,419 statistically significant overlaps between various published gene lists. Most (80%) of the overlaps can be represented by a highly connected network, a "molecular signature map," that highlights the correlation of various expression signatures. By dissecting this network, we identified sub-networks that define clusters of gene sets related to common biological processes (cell cycle, immune response, etc). Examination of these sub-networks has confirmed relationships among various pathways and also generated new hypotheses. For example, our result suggests that glutamine deficiency might suppress cellular growth by inhibiting the MYC pathway. Interestingly, we also observed 1,369 significant overlaps between a set of genes upregulated by factor X and a set of genes downregulated by factor Y, suggesting a repressive interaction between X and Y factors. CONCLUSIONS: Our results suggest that molecular-level responses to diverse chemical and genetic perturbations are heavily interconnected in a modular fashion. Also, shared molecular pathways can be identified by comparing newly defined gene expression signatures with databases of previously published gene expression signatures.


Subjects
Gene Expression Profiling/methods , Systems Biology/methods , Animals , Cell Cycle , Computational Biology/methods , Genetic Databases , Gene Expression Regulation , Glutamine/metabolism , Humans , Immune System , Mice , Biological Models , Genetic Models , Proto-Oncogene Proteins c-myc/metabolism