ABSTRACT
Powerful specialized software is essential for managing, quantifying, and ultimately deriving scientific insight from results of a microarray experiment. We have developed a suite of software applications, known as TM4, to support such gene expression studies. The suite consists of open-source tools for data management and reporting, image analysis, normalization and pipeline control, and data mining and visualization. An integrated MIAME-compliant MySQL database is included. This chapter describes each component of the suite and includes a sample analysis walk-through.
Subject(s)
Oligonucleotide Array Sequence Analysis/methods, Software, Algorithms, Animals, Gene Expression Profiling/methods, Gene Expression Profiling/statistics & numerical data, Humans, Oligonucleotide Array Sequence Analysis/statistics & numerical data
ABSTRACT
BACKGROUND: DNA microarray assays typically compare two biological samples and present the results of those comparisons gene-by-gene as the logarithm base two of the ratio of the measured expression levels for the two samples. RESULTS: Because of the fixed dynamic range of fluorescence and other detection systems, there is a limit to the range of comparisons that can be made using any array technology, and this must be taken into account when interpreting the results of any such analysis. CONCLUSIONS: The dynamic range of microarray data collection systems results in limits in the comparative analyses that can be derived from such measurements and suggests that optimal results can be obtained by making measurements that avoid the boundaries of that dynamic range.
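The bound that a fixed dynamic range places on log2 ratios can be illustrated with a short sketch. The detector limits below (a 16-bit ceiling of 65535 and a background floor of 64) are assumed values chosen for illustration, not figures from the abstract, and the function names are hypothetical.

```python
import math

# Assumed detector limits for illustration only (not taken from the abstract).
FLOOR = 64.0        # assumed lowest intensity distinguishable from background
CEILING = 65535.0   # saturation value of a 16-bit detector


def measured(true_intensity):
    """Clip a true signal to the detector's dynamic range."""
    return min(max(true_intensity, FLOOR), CEILING)


def log2_ratio(query, reference):
    """Log base two of the measured query/reference intensity ratio."""
    return math.log2(measured(query) / measured(reference))


def max_observable_log2_ratio():
    """Largest log2 ratio the detector can report, whatever the true ratio."""
    return math.log2(CEILING / FLOOR)
```

Under these assumptions no comparison can report a log2 ratio much above 10, and a true 32-fold difference (log2 ratio of 5) is under-reported whenever the brighter channel saturates, for example with a reference intensity of 4000 and a true query intensity of 128000.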
Subject(s)
Gene Expression Profiling/standards, Oligonucleotide Array Sequence Analysis/standards, Algorithms, Gene Expression Profiling/methods, Nucleic Acid Hybridization/methods, Oligonucleotide Array Sequence Analysis/methods, Reproducibility of Results, Software
ABSTRACT
BACKGROUND: 'Fold-change' cutoffs have been widely used in microarray assays to identify genes that are differentially expressed between query and reference samples. More accurate measures of differential expression and effective data-normalization strategies are required to identify high-confidence sets of genes with biologically meaningful changes in transcription. Further, the analysis of a large number of expression profiles is facilitated by a common reference sample, the construction of which must be carefully addressed. RESULTS: We carried out a series of 'self-self' hybridizations in which aliquots of the same RNA sample were labeled separately with Cy3 and Cy5 fluorescent dyes and co-hybridized to the same microarray. From this, we can analyze the intensity-dependent behavior of microarray data, define a statistically significant measure of differential expression that exploits the structure of the fluorescent signals, and measure the inherent reproducibility of the technique. We also devised a simple procedure for identifying and eliminating low-quality data for replicates within and between slides. We examine the properties required of a universal reference RNA sample and show how pooling a small number of samples with a diverse representation of expressed genes can outperform more complex mixtures as a reference sample. CONCLUSION: Analysis of cell-line samples can identify systematic structure in measured gene-expression levels. A general procedure for analyzing cDNA microarray data is proposed and validated. We show that pooled reference samples should be based not only on the expression of individual genes in each cell line but also on the expression levels of genes within cell lines.
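One way an intensity-dependent measure of differential expression might be computed is sketched below. This is a minimal illustration, not the authors' published procedure: the sliding-window approach, the window size of 50 spots, the choice of 0.5 * log2(Cy3 * Cy5) as the intensity axis, and the function name are all assumptions. Each spot's log2 ratio is compared with the mean and standard deviation of the log2 ratios of spots with similar overall intensity, so a fixed fold-change cutoff is replaced by a local Z-score.

```python
import math
import statistics


def intensity_dependent_z(cy3, cy5, window=50):
    """Return a local Z-score for each spot's log2(Cy5/Cy3) ratio.

    Spots are ranked by overall intensity, and each ratio is standardized
    against the ratios of its nearest neighbors in that ranking.
    The window size is an assumed tuning parameter.
    """
    ratios = [math.log2(r / g) for g, r in zip(cy3, cy5)]
    intensity = [0.5 * math.log2(r * g) for g, r in zip(cy3, cy5)]

    # Spot indices ordered from dimmest to brightest.
    order = sorted(range(len(ratios)), key=lambda i: intensity[i])

    z = [0.0] * len(ratios)
    half = window // 2
    for rank, spot in enumerate(order):
        lo = max(0, rank - half)
        hi = min(len(order), rank + half + 1)
        local = [ratios[j] for j in order[lo:hi]]
        if len(local) < 2:
            continue  # too few neighbors to estimate a spread
        mu = statistics.mean(local)
        sd = statistics.stdev(local)
        z[spot] = (ratios[spot] - mu) / sd if sd > 0 else 0.0
    return z
```

In a self-self hybridization no gene is truly differentially expressed, so the spread of Z-scores from such data gives a sense of the technique's inherent reproducibility; in a query-versus-reference experiment, spots whose absolute Z-score exceeds a chosen threshold would be flagged as candidates.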