1.
BioData Min ; 7: 18, 2014.
Article in English | MEDLINE | ID: mdl-25228922

ABSTRACT

Reference datasets are often used to compare, interpret or validate experimental data and analytical methods. In the field of gene expression, several reference datasets have been published. Typically, they consist of individual baseline or spike-in experiments carried out in a single laboratory and representing a particular set of conditions. Here, we describe a new type of standardized dataset that is representative of the spatial and temporal dimensions of gene expression. These datasets result from integrating expression data from a large number of globally normalized and quality-controlled public experiments. Expression data are aggregated by anatomical part or stage of development to yield a representative transcriptome for each category. For example, we created a genome-wide expression dataset representing the FDA tissue panel across 35 tissue types. The proposed datasets were created for human and several model organisms and are publicly available at http://www.expressiondata.org.
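
The aggregation step described in this abstract can be illustrated with a minimal sketch. It assumes that samples are already normalized, that each sample carries a single anatomical category label, and that the representative value is the arithmetic mean; the abstract does not state which summary statistic was used, and the function name, gene names and values below are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def aggregate_by_category(samples):
    """Collapse per-sample profiles into one mean profile per category.

    samples: list of (category, {gene: expression_value}) tuples.
    Returns {category: {gene: mean_expression}}.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for category, profile in samples:
        for gene, value in profile.items():
            grouped[category][gene].append(value)
    # The mean is an assumption; a median or trimmed mean would fit the same shape.
    return {
        category: {gene: mean(values) for gene, values in genes.items()}
        for category, genes in grouped.items()
    }


# Hypothetical toy input: two liver samples and one brain sample.
samples = [
    ("liver", {"GENE_A": 10.0, "GENE_B": 2.0}),
    ("liver", {"GENE_A": 12.0, "GENE_B": 4.0}),
    ("brain", {"GENE_A": 1.0, "GENE_B": 8.0}),
]
reference = aggregate_by_category(samples)
print(reference["liver"])
```

Each category ends up with exactly one profile, which is what makes the result usable as a reference transcriptome for that tissue or developmental stage.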

2.
BMC Genomics ; 14: 716, 2013 Oct 20.
Article in English | MEDLINE | ID: mdl-24138449

ABSTRACT

BACKGROUND: Predicting molecular responses in human by extrapolating results from model organisms requires a precise understanding of the architecture and regulation of biological mechanisms across species. RESULTS: Here, we present a large-scale comparative analysis of organ and tissue transcriptomes involving the three mammalian species human, mouse and rat. To this end, we created a unique, highly standardized compendium of tissue expression. Representative tissue specific datasets were aggregated from more than 33,900 Affymetrix expression microarrays. For each organism, we created two expression datasets covering over 55 distinct tissue types with curated data from two independent microarray platforms. Principal component analysis (PCA) revealed that the tissue-specific architecture of transcriptomes is highly conserved between human, mouse and rat. Moreover, tissues with related biological function clustered tightly together, even if the underlying data originated from different labs and experimental settings. Overall, the expression variance caused by tissue type was approximately 10 times higher than the variance caused by perturbations or diseases, except for a subset of cancers and chemicals. Pairs of gene orthologs exhibited higher expression correlation between mouse and rat than with human. Finally, we show evidence that tissue expression profiles, if combined with sequence similarity, can improve the correct assignment of functionally related homologs across species. CONCLUSION: The results demonstrate that tissue-specific regulation is the main determinant of transcriptome composition and is highly conserved across mammalian species.
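
The ortholog comparison mentioned in the results can be sketched as a correlation of tissue profiles. This is an illustration only: it assumes Pearson correlation over a shared, identically ordered list of tissues, which the abstract does not specify, and the numbers are invented toy values that do not reproduce the study's data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical expression of one ortholog group across the same four tissues.
mouse = [8.0, 2.0, 5.0, 1.0]
rat = [7.5, 2.5, 5.5, 1.5]
human = [6.0, 4.0, 2.0, 3.0]

r_mouse_rat = pearson(mouse, rat)
r_mouse_human = pearson(mouse, human)
print(round(r_mouse_rat, 3), round(r_mouse_human, 3))
```

Repeating this for every ortholog pair and comparing the resulting distributions between species pairs is one plausible way to arrive at a statement like the one in the abstract.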


Subject(s)
Transcriptome , Animals , Cluster Analysis , Genome , Genome, Human , Humans , Mice , Multigene Family , Oligonucleotide Array Sequence Analysis , Principal Component Analysis , Rats , Species Specificity
3.
Adv Bioinformatics ; 2013: 920325, 2013.
Article in English | MEDLINE | ID: mdl-23864855

ABSTRACT

Visualization of large complex networks has become an indispensable part of systems biology, where organisms need to be considered as one complex system. The visualization of the corresponding network is challenging due to the size and density of edges. In many cases, the use of standard visualization algorithms can lead to high running times and poorly readable visualizations due to many edge crossings. We suggest an approach that analyzes the structure of the graph first and then generates a new graph which contains specific semantic symbols for regular substructures like dense clusters. We propose a multilevel gamma-clustering layout visualization algorithm (MLGA) which proceeds in three successive steps: (i) a multilevel γ-clustering is used to identify the structure of the underlying network, (ii) the network is transformed to a tree, and (iii) finally, the resulting tree which shows the network structure is drawn using a variation of a force-directed algorithm. The algorithm has the potential to visualize very large networks because it uses modern clustering heuristics which are optimized for large graphs. Moreover, most of the edges are removed from the visual representation, which helps the viewer keep an overview of complex graphs with dense subgraphs.

4.
BMC Genomics ; 12: 156, 2011 Mar 21.
Article in English | MEDLINE | ID: mdl-21418615

ABSTRACT

BACKGROUND: RT-qPCR is a sensitive and increasingly used method for gene expression quantification. To normalize RT-qPCR measurements between samples, most laboratories use endogenous reference genes as internal controls. There is increasing evidence, however, that the expression of commonly used reference genes can vary significantly in certain contexts. RESULTS: Using the Genevestigator database of normalized and well-annotated microarray experiments, we describe the expression stability characteristics of the transcriptomes of several organisms. The results show that a) no genes are universally stable, b) most commonly used reference genes yield very high transcript abundances as compared to the entire transcriptome, and c) for each biological context a subset of stable genes exists that has smaller variance than commonly used reference genes or genes that were selected for their stability across all conditions. CONCLUSION: We therefore propose the normalization of RT-qPCR data using reference genes that are specifically chosen for the conditions under study. RefGenes is a community tool developed for that purpose. Validation RT-qPCR experiments across several organisms showed that the candidates proposed by RefGenes generally outperformed commonly used reference genes. RefGenes is available within Genevestigator at http://www.genevestigator.com.
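
The idea of choosing reference genes for the conditions under study can be sketched as follows. This is not the RefGenes implementation: it simply assumes that stability is measured as the standard deviation of expression over a user-chosen subset of samples, and the function name, gene names and values are hypothetical.

```python
from statistics import pstdev


def rank_stable_genes(expression, sample_ids, top_n=2):
    """Return the top_n genes with the smallest spread over the chosen samples.

    expression: {gene: {sample_id: value}}
    sample_ids: samples that represent the biological context of interest.
    """
    spread = {
        gene: pstdev([values[s] for s in sample_ids])
        for gene, values in expression.items()
    }
    return sorted(spread, key=spread.get)[:top_n]


# Hypothetical toy data: GENE_A is stable in s1 and s2 but not across all samples.
expression = {
    "GENE_A": {"s1": 10.0, "s2": 10.1, "s3": 14.0},
    "GENE_B": {"s1": 8.0, "s2": 9.5, "s3": 8.1},
    "GENE_C": {"s1": 5.0, "s2": 5.0, "s3": 5.1},
}
context_specific = rank_stable_genes(expression, ["s1", "s2"])
across_all = rank_stable_genes(expression, ["s1", "s2", "s3"])
print(context_specific, across_all)
```

In this toy case the ranking changes with the sample subset, which mirrors point c) of the results: a gene that looks unstable across all conditions can still be a good reference within one context.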


Subject(s)
Gene Expression Profiling/methods , Reverse Transcriptase Polymerase Chain Reaction/standards , Software , Algorithms , Animals , Arabidopsis/genetics , Cattle , Computational Biology/methods , Databases, Genetic , Female , Gene Expression Profiling/standards , Humans , Mice , Oligonucleotide Array Sequence Analysis , Reference Standards , Reverse Transcriptase Polymerase Chain Reaction/methods , Swine , User-Computer Interface
5.
Mol Plant ; 1(5): 851-7, 2008 Sep.
Article in English | MEDLINE | ID: mdl-19825587

ABSTRACT

The wide-spread use of microarray technologies to study plant transcriptomes has led to important discoveries and to an accumulation of profiling data covering a wide range of different tissues, developmental stages, perturbations, and genotypes. Querying a large number of microarray experiments can provide insights that cannot be gained by analyzing single experiments. However, such a meta-analysis poses significant challenges with respect to data comparability and normalization, systematic sample annotation, and analysis tools. Genevestigator addresses these issues using a large curated expression database and a set of specifically developed analysis tools that are accessible over the internet. This combination has already proven to be useful in the area of plant research based on a large set of Arabidopsis data (Grennan, 2006). Here, we present the release of the Genevestigator rice and barley gene expression databases that contain quality-controlled and well annotated microarray experiments using ontologies. The databases currently comprise experiments from pathology, plant nutrition, abiotic stress, hormone treatment, genotype, and spatial or temporal analysis, but are expected to cover a broad variety of research areas as more experimental data become available. The transcriptome meta-analysis of the model species rice and barley is expected to deliver results that can be used for functional genomics and biotechnological applications in cereals.


Subject(s)
Biomarkers/analysis , Computational Biology/methods , Databases, Genetic , Gene Expression Profiling , Hordeum/genetics , Oryza/genetics , Software , Arabidopsis/genetics , Gene Expression Regulation, Plant , Oligonucleotide Array Sequence Analysis
6.
Adv Bioinformatics ; 2008: 420747, 2008.
Article in English | MEDLINE | ID: mdl-19956698

ABSTRACT

The Web-based software tool Genevestigator provides powerful tools for biologists to explore gene expression across a wide variety of biological contexts. Its first releases, however, were limited by the scaling ability of the system architecture, multiorganism data storage and analysis capability, and availability of computationally intensive analysis methods. Genevestigator V3 is a novel meta-analysis system resulting from new algorithmic and software development using a client/server architecture, large-scale manual curation and quality control of microarray data for several organisms, and curation of pathway data for mouse and Arabidopsis. In addition to improved querying features, Genevestigator V3 provides new tools to analyze the expression of genes in many different contexts, to identify biomarker genes, to cluster genes into expression modules, and to model expression responses in the context of metabolic and regulatory networks. Being a reference expression database with user-friendly tools, Genevestigator V3 facilitates discovery research and hypothesis validation.

7.
EXS ; 97: 331-51, 2007.
Article in English | MEDLINE | ID: mdl-17432274

ABSTRACT

A central goal of postgenomic research is to assign a function to every predicted gene. Because genes often cooperate in order to establish and regulate cellular events the examination of a gene has also included the search for at least a few interacting genes. This requires a strong hypothesis about possible interaction partners, which has often been derived from what was known about the gene or protein beforehand. Many times, though, this prior knowledge has either been completely lacking, biased towards favored concepts, or only partial due to the theoretically vast interaction space. With the advent of high-throughput technology and robotics in biological research, it has become possible to study gene function on a global scale, monitoring entire genomes and proteomes at once. These systematic approaches aim at considering all possible dependencies between genes or their products, thereby exploring the interaction space at a systems scale. This chapter provides an introduction to network analysis and illustrates the corresponding concepts on the basis of gene expression data. First, an overview of existing methods for the identification of co-regulated genes is given. Second, the issue of topology inference is discussed and as an example a specific inference method is presented. And lastly, the application of these techniques is demonstrated for the Arabidopsis thaliana isoprenoid pathway.


Subject(s)
Gene Regulatory Networks , Arabidopsis/metabolism , Cluster Analysis , Terpenes/metabolism , Yeasts/metabolism
8.
Bioinformatics ; 22(10): 1282-3, 2006 May 15.
Article in English | MEDLINE | ID: mdl-16551664

ABSTRACT

SUMMARY: Besides classical clustering methods such as hierarchical clustering, in recent years biclustering has become a popular approach to analyze biological data sets, e.g. gene expression data. The Biclustering Analysis Toolbox (BicAT) is a software platform for clustering-based data analysis that integrates various biclustering and clustering techniques within a common graphical user interface. Furthermore, BicAT provides different facilities for data preparation, inspection and postprocessing such as discretization, filtering of biclusters according to specific criteria or gene pair analysis for constructing gene interconnection graphs. The possibility to use different biclustering algorithms inside a single graphical tool allows the user to compare clustering results and choose the algorithm that best fits a specific biological scenario. The toolbox is described in the context of gene expression analysis, but is also applicable to other types of data, e.g. data from proteomics or synthetic lethal experiments. AVAILABILITY: The BicAT toolbox is freely available at http://www.tik.ee.ethz.ch/sop/bicat and runs on all operating systems. The Java source code of the program and a developer's guide are provided on the website as well. Therefore, users may modify the program and add further algorithms or extensions.


Subject(s)
Cluster Analysis , Gene Expression Profiling/methods , Information Storage and Retrieval/methods , Oligonucleotide Array Sequence Analysis/methods , Software , User-Computer Interface , Algorithms , Artificial Intelligence , Database Management Systems , Databases, Protein , Pattern Recognition, Automated/methods
9.
Bioinformatics ; 22(9): 1122-9, 2006 May 01.
Article in English | MEDLINE | ID: mdl-16500941

ABSTRACT

MOTIVATION: In recent years, there have been various efforts to overcome the limitations of standard clustering approaches for the analysis of gene expression data by grouping genes and samples simultaneously. The underlying concept, which is often referred to as biclustering, makes it possible to identify sets of genes sharing compatible expression patterns across subsets of samples, and its usefulness has been demonstrated for different organisms and datasets. Several biclustering methods have been proposed in the literature; however, it is not clear how the different techniques compare with each other with respect to the biological relevance of the clusters as well as with other characteristics such as robustness and sensitivity to noise. Accordingly, no guidelines concerning the choice of the biclustering method are currently available. RESULTS: First, this paper provides a methodology for comparing and validating biclustering methods that includes a simple binary reference model. Although this model captures the essential features of most biclustering approaches, it is still simple enough to exactly determine all optimal groupings; to this end, we propose a fast divide-and-conquer algorithm (Bimax). Second, we evaluate the performance of five salient biclustering algorithms together with the reference model and a hierarchical clustering method on various synthetic and real datasets for Saccharomyces cerevisiae and Arabidopsis thaliana. The comparison reveals that (1) biclustering in general has advantages over a conventional hierarchical clustering approach, (2) there are considerable performance differences between the tested methods and (3) even the simple reference model delivers relevant patterns within all considered settings.
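
A binary reference model of this kind can be illustrated with a small brute-force sketch. It assumes that expression has been discretized to 0/1 and that a bicluster is a set of rows (genes) and columns (samples) whose entries are all 1 and that cannot be enlarged. The code enumerates every column subset, so it is exponential in the number of columns; it is not the divide-and-conquer Bimax algorithm itself, only a way to see what "all optimal groupings" means on a toy matrix. The function name and matrix are hypothetical.

```python
from itertools import combinations


def maximal_biclusters(matrix):
    """Enumerate inclusion-maximal all-ones biclusters of a binary matrix.

    Returns a set of (row_indices, column_indices) pairs as frozensets.
    Brute force over column subsets: suitable for tiny examples only.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    found = set()
    for size in range(1, n_cols + 1):
        for cols in combinations(range(n_cols), size):
            # Rows that contain a 1 in every selected column.
            rows = frozenset(
                r for r in range(n_rows) if all(matrix[r][c] for c in cols)
            )
            if not rows:
                continue
            # Extend to every column that is all ones on those rows.
            closed_cols = frozenset(
                c for c in range(n_cols) if all(matrix[r][c] for r in rows)
            )
            found.add((rows, closed_cols))
    return found


# Hypothetical discretized matrix: rows are genes, columns are samples.
matrix = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 1, 1],
]
biclusters = maximal_biclusters(matrix)
for rows, cols in sorted(biclusters, key=lambda b: (sorted(b[0]), sorted(b[1]))):
    print(sorted(rows), sorted(cols))
```

Because the model is this simple, the complete set of maximal groupings is well defined, which is what allows it to serve as a baseline against heuristic biclustering methods.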


Subject(s)
Algorithms , Artificial Intelligence , Cluster Analysis , Databases, Genetic , Gene Expression Profiling/methods , Gene Expression/physiology , Oligonucleotide Array Sequence Analysis/methods , Pattern Recognition, Automated/methods
10.
Genome Biol ; 5(11): R92, 2004.
Article in English | MEDLINE | ID: mdl-15535868

ABSTRACT

We present a novel graphical Gaussian modeling approach for reverse engineering of genetic regulatory networks with many genes and few observations. When applying our approach to infer a gene network for isoprenoid biosynthesis in Arabidopsis thaliana, we detect modules of closely connected genes and candidate genes for possible cross-talk between the isoprenoid pathways. Genes of downstream pathways also fit well into the network. We evaluate our approach in a simulation study and using the yeast galactose network.
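
For orientation, the standard graphical Gaussian model (which is not necessarily the modified approach this abstract refers to) links two genes i and j by an edge when their partial correlation given all other genes is nonzero. The symbols below are notation introduced here: Σ is the covariance matrix of the expression values and Ω is its inverse.

```latex
\rho_{ij \mid \text{rest}} \;=\; -\,\frac{\omega_{ij}}{\sqrt{\omega_{ii}\,\omega_{jj}}},
\qquad
\Omega = (\omega_{ij}) = \Sigma^{-1}
```

When the number of genes exceeds the number of observations, the sample covariance matrix is singular and cannot be inverted directly, which is the difficulty behind the "many genes and few observations" setting.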


Subject(s)
Arabidopsis/genetics , Computer Graphics/statistics & numerical data , Genes, Plant/genetics , Models, Genetic , Terpenes/metabolism , Computer Simulation/statistics & numerical data , Galactose/metabolism , Genes, Fungal/genetics , Genes, Plant/physiology , Normal Distribution , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism