Results 1 - 6 of 6
1.
Plant J ; 52(3): 561-9, 2007 Nov.
Article in English | MEDLINE | ID: mdl-17680783

ABSTRACT

Accurately identifying differentially expressed genes from microarray data is not a trivial task, partly because of poor variance estimates of gene expression signals. Here, after analyzing 380 replicated microarray experiments, we found that probesets have typical, distinct variances that can be estimated based on a large number of microarray experiments. These probeset-specific variances depend at least in part on the function of the probed gene: genes for ribosomal or structural proteins often have a small variance, while genes implicated in stress responses often have large variances. We used these variance estimates to develop a statistical test for differentially expressed genes called EVE (external variance estimation). The EVE algorithm performs better than the t-test and LIMMA on some real-world data for which external information from appropriate databases is available. Thus, EVE helps to maximize the information gained from a typical microarray experiment. Nonetheless, only a large number of replicates can guarantee the identification of nearly all truly differentially expressed genes. However, our simulation studies suggest that even limited numbers of replicates will usually result in good coverage of strongly differentially expressed genes.


Subject(s)
Arabidopsis/genetics , Gene Expression Profiling/methods , Oligonucleotide Array Sequence Analysis/methods , Algorithms , Data Interpretation, Statistical
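
The idea of testing with an externally estimated variance can be illustrated with a minimal sketch. This is not the published EVE procedure, whose details the abstract does not give; it only assumes that each probeset comes with a variance estimated from a large external collection of experiments and that this variance is treated as known. The function names and the `external_var` parameter are hypothetical.

```python
import math
from statistics import mean


def external_variance_z(group_a, group_b, external_var):
    """z-statistic for a difference in group means.

    Assumption: external_var is a per-probeset variance estimated from
    external data and treated as known, so it replaces the noisy
    within-experiment variance estimate a t-test would use.
    """
    n_a, n_b = len(group_a), len(group_b)
    diff = mean(group_a) - mean(group_b)
    # Standard error of the difference of two means with a shared, known variance.
    se = math.sqrt(external_var * (1.0 / n_a + 1.0 / n_b))
    return diff / se


def two_sided_p(z):
    """Two-sided p-value under a standard normal reference distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))
```

With only two or three replicates per condition, a within-experiment variance estimate has very few degrees of freedom; borrowing the variance from external data is one way such a test could gain power, provided the external experiments are comparable.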
2.
EXS ; 97: 331-51, 2007.
Article in English | MEDLINE | ID: mdl-17432274

ABSTRACT

A central goal of postgenomic research is to assign a function to every predicted gene. Because genes often cooperate in order to establish and regulate cellular events, the examination of a gene has also included the search for at least a few interacting genes. This requires a strong hypothesis about possible interaction partners, which has often been derived from what was known about the gene or protein beforehand. Many times, though, this prior knowledge has either been completely lacking, biased towards favored concepts, or only partial due to the theoretically vast interaction space. With the advent of high-throughput technology and robotics in biological research, it has become possible to study gene function on a global scale, monitoring entire genomes and proteomes at once. These systematic approaches aim at considering all possible dependencies between genes or their products, thereby exploring the interaction space at a systems scale. This chapter provides an introduction to network analysis and illustrates the corresponding concepts on the basis of gene expression data. First, an overview of existing methods for the identification of co-regulated genes is given. Second, the issue of topology inference is discussed and, as an example, a specific inference method is presented. Finally, the application of these techniques is demonstrated for the Arabidopsis thaliana isoprenoid pathway.


Subject(s)
Gene Regulatory Networks , Arabidopsis/metabolism , Cluster Analysis , Terpenes/metabolism , Yeasts/metabolism
3.
Stat Appl Genet Mol Biol ; 5: Article1, 2006.
Article in English | MEDLINE | ID: mdl-16646863

ABSTRACT

As a powerful tool for analyzing full conditional (in-)dependencies between random variables, graphical models have become increasingly popular to infer genetic networks based on gene expression data. However, full (unconstrained) conditional relationships between random variables can only be estimated accurately if the number of observations is relatively large in comparison to the number of variables, a condition that is usually not met for high-throughput genomic data. Recently, simplified graphical modeling approaches have been proposed to determine dependencies between gene expression profiles. For sparse graphical models such as genetic networks, it is assumed that the zero- and first-order conditional independencies still reflect reasonably well the full conditional independence structure between variables. Moreover, low-order conditional independencies have the advantage that they can be accurately estimated even from a small number of observations. Therefore, using only zero- and first-order conditional dependencies to infer the complete graphical model can be very useful. Here, we analyze the statistical and probabilistic properties of these low-order conditional independence graphs (called 0-1 graphs). We find that for faithful graphical models, the 0-1 graph contains at least all edges of the full conditional independence graph (concentration graph). For simple structures such as Markov trees, the 0-1 graph even coincides with the concentration graph. Furthermore, we present some asymptotic results and we demonstrate in a simulation study that despite their simplicity, 0-1 graphs are generally good estimators of sparse graphical models. Finally, the biological relevance of some applications is summarized.


Subject(s)
Gene Expression Regulation , Models, Statistical , Algorithms , Gene Expression Profiling , Models, Biological , Normal Distribution , Oligonucleotide Array Sequence Analysis , Probability
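
The construction of a 0-1 graph can be sketched as follows for the Gaussian case. This is a minimal illustration, not the authors' implementation: it assumes that an edge is kept only when the marginal (zero-order) correlation and every first-order partial correlation of the pair are non-negligible, and it replaces a proper significance test with a fixed, hypothetical `threshold` on the absolute value.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Sample Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(r_ij, r_ik, r_jk):
    """First-order partial correlation of i and j given a single variable k."""
    return (r_ij - r_ik * r_jk) / math.sqrt((1 - r_ik ** 2) * (1 - r_jk ** 2))


def zero_one_graph(columns, threshold=0.3):
    """Return the edge set {(i, j)} of a thresholded 0-1 graph.

    columns: one sequence of observations per variable.
    threshold: hypothetical cut-off standing in for a statistical test.
    """
    p = len(columns)
    r = [[1.0] * p for _ in range(p)]
    for i, j in combinations(range(p), 2):
        r[i][j] = r[j][i] = pearson(columns[i], columns[j])

    edges = set()
    for i, j in combinations(range(p), 2):
        # Zero-order check: the marginal correlation must be non-negligible.
        if abs(r[i][j]) <= threshold:
            continue
        # First-order check: conditioning on any single other variable
        # must not explain the dependence away.
        if all(abs(partial_corr(r[i][j], r[i][k], r[j][k])) > threshold
               for k in range(p) if k not in (i, j)):
            edges.add((i, j))
    return edges
```

Only pairwise correlations and correlations conditioned on one variable are needed, so no covariance matrix has to be inverted; this is why such estimates remain usable when there are far more genes than samples.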
4.
Bioinformatics ; 22(9): 1122-9, 2006 May 01.
Article in English | MEDLINE | ID: mdl-16500941

ABSTRACT

MOTIVATION: In recent years, there have been various efforts to overcome the limitations of standard clustering approaches for the analysis of gene expression data by grouping genes and samples simultaneously. The underlying concept, which is often referred to as biclustering, makes it possible to identify sets of genes sharing compatible expression patterns across subsets of samples, and its usefulness has been demonstrated for different organisms and datasets. Several biclustering methods have been proposed in the literature; however, it is not clear how the different techniques compare with each other with respect to the biological relevance of the clusters as well as with other characteristics such as robustness and sensitivity to noise. Accordingly, no guidelines concerning the choice of the biclustering method are currently available. RESULTS: First, this paper provides a methodology for comparing and validating biclustering methods that includes a simple binary reference model. Although this model captures the essential features of most biclustering approaches, it is still simple enough to exactly determine all optimal groupings; to this end, we propose a fast divide-and-conquer algorithm (Bimax). Second, we evaluate the performance of five salient biclustering algorithms together with the reference model and a hierarchical clustering method on various synthetic and real datasets for Saccharomyces cerevisiae and Arabidopsis thaliana. The comparison reveals that (1) biclustering in general has advantages over a conventional hierarchical clustering approach, (2) there are considerable performance differences between the tested methods and (3) even the simple reference model delivers relevant patterns within all considered settings.


Subject(s)
Algorithms , Artificial Intelligence , Cluster Analysis , Databases, Genetic , Gene Expression Profiling/methods , Gene Expression/physiology , Oligonucleotide Array Sequence Analysis/methods , Pattern Recognition, Automated/methods
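
The binary reference model can be illustrated with a small sketch. It assumes that expression values have already been thresholded to 0/1 and that a bicluster is a set of rows and columns whose submatrix contains only ones and cannot be extended by any further row or column. The code below is a brute-force enumerator for tiny matrices, not the divide-and-conquer Bimax algorithm; the function name is hypothetical.

```python
from itertools import combinations


def maximal_biclusters(matrix):
    """Enumerate all inclusion-maximal all-ones submatrices of a 0/1 matrix.

    Returns a set of (rows, cols) pairs of frozensets. The running time is
    exponential in the number of columns, so this is for illustration only.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0]) if matrix else 0
    found = set()
    for size in range(1, n_cols + 1):
        for cols in combinations(range(n_cols), size):
            # All rows that have a 1 in every selected column.
            rows = frozenset(i for i in range(n_rows)
                             if all(matrix[i][j] for j in cols))
            if not rows:
                continue
            # Extend the column set to every column shared by those rows,
            # which makes the pair maximal in both directions.
            closed_cols = frozenset(j for j in range(n_cols)
                                    if all(matrix[i][j] for i in rows))
            found.add((rows, closed_cols))
    return found
```

For the matrix `[[1, 1, 0], [1, 1, 1], [0, 1, 1]]` this yields four biclusters, for example rows {0, 1} with columns {0, 1}. Biclusters may overlap, which a conventional hierarchical clustering of rows cannot express.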
5.
Genome Biol ; 5(11): R92, 2004.
Article in English | MEDLINE | ID: mdl-15535868

ABSTRACT

We present a novel graphical Gaussian modeling approach for reverse engineering of genetic regulatory networks with many genes and few observations. When applying our approach to infer a gene network for isoprenoid biosynthesis in Arabidopsis thaliana, we detect modules of closely connected genes and candidate genes for possible cross-talk between the isoprenoid pathways. Genes of downstream pathways also fit well into the network. We evaluate our approach in a simulation study and using the yeast galactose network.


Subject(s)
Arabidopsis/genetics , Computer Graphics/statistics & numerical data , Genes, Plant/genetics , Models, Genetic , Terpenes/metabolism , Computer Simulation/statistics & numerical data , Galactose/metabolism , Genes, Fungal/genetics , Genes, Plant/physiology , Normal Distribution , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism
6.
Genet Epidemiol ; 25(4): 350-9, 2003 Dec.
Article in English | MEDLINE | ID: mdl-14639704

ABSTRACT

In complex traits, multiple disease loci presumably interact to produce the disease. For this reason, even with high-resolution single nucleotide polymorphism (SNP) marker maps, it has been difficult to map susceptibility loci by conventional locus-by-locus methods. Fine mapping strategies are needed that allow for the simultaneous detection of interacting disease loci while handling large numbers of densely spaced markers. For this purpose, sum statistics were recently proposed as a first-stage analysis method for case-control association studies with SNPs. Via sums of single-marker statistics, information over multiple disease-associated markers is combined and, with a global significance value alpha, a small set of "interesting" markers is selected for further analysis. Here, the statistical properties of such approaches are examined by computer simulation. It is shown that sum statistics can often be successfully applied when marker-by-marker approaches fail to detect association. Compared with Bonferroni or False Discovery Rate (FDR) procedures, sum statistics have greater power, and more disease loci can be detected. However, in studies with tightly linked markers, simple sum statistics can be suboptimal, since the intermarker correlation is ignored. A method is presented that takes the correlation structure among marker loci into account when marker statistics are combined.


Subject(s)
Genetic Predisposition to Disease/genetics , Models, Genetic , Polymorphism, Single Nucleotide/genetics , Algorithms , Case-Control Studies , Genetic Linkage , Genetic Markers , Humans , Models, Statistical
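
A sum statistic of the kind described above can be sketched as follows. This is a hedged illustration rather than the authors' method: it assumes the sum is taken over the k largest single-marker statistics and that global significance is assessed by permuting case-control labels. The single-marker statistic used here (squared difference in mean allele count) is a hypothetical stand-in, and the sketch ignores intermarker correlation, which the abstract notes can make simple sums suboptimal.

```python
import random


def marker_stats(genotypes, labels):
    """One statistic per marker: squared case-control difference in mean allele count.

    genotypes: one list of allele counts (0, 1 or 2) per individual.
    labels: 1 for case, 0 for control, in the same order.
    """
    cases = [g for g, lab in zip(genotypes, labels) if lab == 1]
    controls = [g for g, lab in zip(genotypes, labels) if lab == 0]
    stats = []
    for m in range(len(genotypes[0])):
        case_mean = sum(g[m] for g in cases) / len(cases)
        control_mean = sum(g[m] for g in controls) / len(controls)
        stats.append((case_mean - control_mean) ** 2)
    return stats


def sum_statistic(stats, k):
    """Sum of the k largest single-marker statistics."""
    return sum(sorted(stats, reverse=True)[:k])


def permutation_p_value(genotypes, labels, k, n_perm=1000, seed=0):
    """Global p-value of the sum statistic under label permutation."""
    rng = random.Random(seed)
    observed = sum_statistic(marker_stats(genotypes, labels), k)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # keeps the numbers of cases and controls fixed
        if sum_statistic(marker_stats(genotypes, shuffled), k) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

Because a single global test is performed on the combined statistic, no per-marker multiple-testing correction such as Bonferroni is applied at this stage; the markers contributing to a significant sum form the candidate set for further analysis.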