1.
Proc Natl Acad Sci U S A ; 104(47): 18371-6, 2007 Nov 20.
Article in English | MEDLINE | ID: mdl-18003902

ABSTRACT

We describe the use of a higher-order singular value decomposition (HOSVD) in transforming a data tensor of genes × "x-settings," that is, different settings of the experimental variable x × "y-settings," which tabulates DNA microarray data from different studies, to a "core tensor" of "eigenarrays" × "x-eigengenes" × "y-eigengenes." Reformulating this multilinear HOSVD such that it decomposes the data tensor into a linear superposition of all outer products of an eigenarray, an x- and a y-eigengene, that is, rank-1 "subtensors," we define the significance of each subtensor in terms of the fraction of the overall information in the data tensor that it captures. We illustrate this HOSVD with an integration of genome-scale mRNA expression data from three yeast cell cycle time courses, two of which are under exposure to either hydrogen peroxide or menadione. We find that significant subtensors represent independent biological programs or experimental phenomena. The picture that emerges suggests that the conserved genes YKU70, MRE11, AIF1, and ZWF1, and the processes of retrotransposition, apoptosis, and the oxidative pentose phosphate pathway that these genes are involved in, may play significant, yet previously unrecognized, roles in the differential effects of hydrogen peroxide and menadione on cell cycle progression. A genome-scale correlation between DNA replication initiation and RNA transcription, which is equivalent to a recently discovered correlation and might be due to a previously unknown mechanism of regulation, is independently uncovered.


Subject(s)
DNA/genetics , Oligonucleotide Array Sequence Analysis/methods , Cell Cycle , DNA Replication/genetics , Gene Expression Profiling , Gene Expression Regulation, Fungal , Models, Genetic , Oxidative Stress , RNA, Messenger/genetics , Saccharomyces cerevisiae/cytology , Saccharomyces cerevisiae/genetics , Time Factors
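
The HOSVD described in the abstract above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `unfold`, `hosvd`, and `reconstruct` are hypothetical, the input is random data standing in for a genes × x-settings × y-settings tensor, and the "fraction" computed at the end (squared core entry over the sum of all squared core entries) is one plausible reading of the significance measure the abstract mentions.

```python
import numpy as np

def unfold(tensor, mode):
    """Matricize: move `mode` to the front and flatten the remaining axes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def hosvd(tensor):
    """Return (core, factors); each factor has orthonormal columns."""
    factors = []
    for mode in range(tensor.ndim):
        # Left singular vectors of each unfolding span that mode's space.
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u)
    core = tensor
    for mode, u in enumerate(factors):
        # Mode product with u.T: contract axis `mode` of the core with u.
        core = np.moveaxis(np.tensordot(u.T, core, axes=(1, mode)), 0, mode)
    return core, factors

def reconstruct(core, factors):
    """Superpose all rank-1 subtensors: core x_0 U0 x_1 U1 x_2 U2 ..."""
    out = core
    for mode, u in enumerate(factors):
        out = np.moveaxis(np.tensordot(u, out, axes=(1, mode)), 0, mode)
    return out

# Illustrative data only: 50 "genes", 4 "x-settings", 3 "y-settings".
rng = np.random.default_rng(0)
tensor = rng.standard_normal((50, 4, 3))
core, factors = hosvd(tensor)
# Assumed significance measure: share of the total squared core weight.
fractions = core**2 / np.sum(core**2)
```

Each entry of `core` weights one outer product of an eigenarray (a column of `factors[0]`), an x-eigengene, and a y-eigengene, so ranking the entries of `fractions` ranks the subtensors.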
2.
Proc Natl Acad Sci U S A ; 103(32): 11828-33, 2006 Aug 08.
Article in English | MEDLINE | ID: mdl-16877539

ABSTRACT

We describe the singular value decomposition (SVD) of yeast genome-scale mRNA lengths distribution data measured by DNA microarrays. SVD uncovers in the mRNA abundance levels data matrix of genes × arrays, i.e., electrophoretic gel migration lengths or mRNA lengths, mathematically unique decorrelated and decoupled "eigengenes." The eigengenes are the eigenvectors of the arrays × arrays correlation matrix, with the corresponding series of eigenvalues proportional to the series of the "fractions of eigen abundance." Each fraction of eigen abundance indicates the significance of the corresponding eigengene relative to all others. We show that the eigengenes fit "asymmetric Hermite functions," a generalization of the eigenfunctions of the quantum harmonic oscillator and the integral transform whose kernel is a generalized coherent state. The fractions of eigen abundance fit a geometric series, as do the eigenvalues of the integral transform whose kernel is a generalized coherent state. The "asymmetric generalized coherent state" models the measured data, where the profiles of mRNA abundance levels of most genes as well as the distribution of the peaks of these profiles fit asymmetric Gaussians. We hypothesize that the asymmetry in the distribution of the peaks of the profiles is due to two competing evolutionary forces. We show that the asymmetry in the profiles of the genes might be due to a previously unknown asymmetry in the gel electrophoresis thermal broadening of a moving, rather than a stationary, band of RNA molecules.


Subject(s)
Gene Expression Profiling , Genome, Fungal , Oligonucleotide Array Sequence Analysis/methods , RNA, Messenger/metabolism , Evolution, Molecular , Fungal Proteins/chemistry , Genes, Fungal , Models, Theoretical , Normal Distribution , RNA/chemistry , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism
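
The relation stated in the abstract above, that the eigengenes are eigenvectors of the arrays × arrays matrix with eigenvalues proportional to the fractions of eigen abundance, can be checked numerically. This is a sketch under assumptions: the function name `eigengenes` is hypothetical, the data are random, and the fraction is taken to be each squared singular value divided by the sum of all squared singular values.

```python
import numpy as np

def eigengenes(data):
    """SVD of a genes x arrays matrix; the rows of `vt` are the eigengenes."""
    u, s, vt = np.linalg.svd(data, full_matrices=False)
    # Assumed definition: normalized squared singular values.
    fractions = s**2 / np.sum(s**2)
    return u, s, vt, fractions

# Illustrative data only: 30 "genes" measured on 5 "arrays".
rng = np.random.default_rng(0)
data = rng.standard_normal((30, 5))
u, s, vt, fractions = eigengenes(data)

# Each eigengene is an eigenvector of the arrays x arrays matrix data.T @ data,
# with eigenvalue equal to the squared singular value.
arrays_by_arrays = data.T @ data
residual = arrays_by_arrays @ vt[0] - s[0]**2 * vt[0]
```

Here `residual` is zero up to floating-point error, which is the eigenvector relation in question.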
3.
Proc Natl Acad Sci U S A ; 102(49): 17559-64, 2005 Dec 06.
Article in English | MEDLINE | ID: mdl-16314560

ABSTRACT

We describe the use of the matrix eigenvalue decomposition (EVD) and pseudoinverse projection and a tensor higher-order EVD (HOEVD) in reconstructing the pathways that compose a cellular system from genome-scale nondirectional networks of correlations among the genes of the system. The EVD formulates a genes × genes network as a linear superposition of genes × genes decorrelated and decoupled rank-1 subnetworks, which can be associated with functionally independent pathways. The integrative pseudoinverse projection of a network computed from a "data" signal onto a designated "basis" signal approximates the network as a linear superposition of only the subnetworks that are common to both signals and simulates observation of only the pathways that are manifest in both experiments. We define a comparative HOEVD that formulates a series of networks as linear superpositions of decorrelated rank-1 subnetworks and the rank-2 couplings among these subnetworks, which can be associated with independent pathways and the transitions among them common to all networks in the series or exclusive to a subset of the networks. Boolean functions of the discretized subnetworks and couplings highlight differential, i.e., pathway-dependent, relations among genes. We illustrate the EVD, pseudoinverse projection, and HOEVD of genome-scale networks with analyses of yeast DNA microarray data.


Subject(s)
Computational Biology/methods , Genome, Fungal/genetics , Saccharomyces cerevisiae/cytology , Saccharomyces cerevisiae/genetics , Computer Simulation , Gene Expression Regulation, Fungal , Oligonucleotide Array Sequence Analysis , RNA, Messenger/genetics , RNA, Messenger/metabolism , Signal Transduction
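
The matrix EVD step in the abstract above, which writes a genes × genes network as a superposition of rank-1 subnetworks, can be sketched as follows. The function name `rank1_subnetworks` is hypothetical, and building the network as `data @ data.T` from random data is an assumption made only so the example has a symmetric matrix to decompose. The pseudoinverse projection and HOEVD steps are not covered here.

```python
import numpy as np

def rank1_subnetworks(network):
    """Eigendecompose a symmetric genes x genes network into rank-1 terms."""
    eigenvalues, eigenvectors = np.linalg.eigh(network)
    order = np.argsort(eigenvalues)[::-1]  # largest eigenvalue first
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]
    # Each subnetwork is an eigenvalue times the outer product of its eigenvector.
    subnetworks = [lam * np.outer(v, v)
                   for lam, v in zip(eigenvalues, eigenvectors.T)]
    return eigenvalues, subnetworks

# Illustrative data only: 6 "genes" measured under 4 conditions.
rng = np.random.default_rng(0)
data = rng.standard_normal((6, 4))
network = data @ data.T  # assumed symmetric genes x genes network
eigenvalues, subnetworks = rank1_subnetworks(network)
```

Summing all entries of `subnetworks` recovers `network`, and the eigenvectors are mutually orthogonal, which is the sense in which the subnetworks are decorrelated.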
4.
Bioinformatics ; 21(2): 187-98, 2005 Jan 15.
Article in English | MEDLINE | ID: mdl-15333461

ABSTRACT

MOTIVATION: Gene expression data often contain missing expression values. Effective missing value estimation methods are needed since many algorithms for gene expression data analysis require a complete matrix of gene array values. In this paper, imputation methods based on the least squares formulation are proposed to estimate missing values in the gene expression data; these methods exploit local similarity structures in the data as well as a least squares optimization process. RESULTS: The proposed local least squares imputation method (LLSimpute) represents a target gene that has missing values as a linear combination of similar genes. The similar genes are chosen by k-nearest neighbors or k coherent genes that have large absolute values of Pearson correlation coefficients. A non-parametric missing value estimation method for LLSimpute is designed by introducing an automatic k-value estimator. In our experiments, the proposed LLSimpute method shows competitive results when compared with other imputation methods for missing value estimation on various datasets and percentages of missing values in the data. AVAILABILITY: The software is available at http://www.cs.umn.edu/~hskim/tools.html CONTACT: hpark@cs.umn.edu


Subject(s)
Algorithms , Gene Expression Profiling/methods , Models, Genetic , Models, Statistical , Oligonucleotide Array Sequence Analysis/methods , Software , Least-Squares Analysis , Sample Size
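
The local least squares idea in the abstract above, representing a target gene as a linear combination of similar genes, can be sketched in a few lines. This is not the LLSimpute software: the function name `lls_impute_gene` is hypothetical, neighbors are chosen by Euclidean distance (the abstract also mentions Pearson correlation), only rows without missing values are used as candidates, and the automatic k-value estimator is left out.

```python
import numpy as np

def lls_impute_gene(data, target, k):
    """Fill NaNs in row `target` from the k most similar complete rows."""
    row = data[target]
    missing = np.isnan(row)
    # Candidate neighbors: every other row with no missing values.
    complete = [i for i in range(data.shape[0])
                if i != target and not np.isnan(data[i]).any()]
    candidates = data[complete]
    # Similarity (assumed): Euclidean distance over the observed columns.
    dist = np.linalg.norm(candidates[:, ~missing] - row[~missing], axis=1)
    nearest = candidates[np.argsort(dist)[:k]]
    a = nearest[:, ~missing]  # neighbors at the target's observed columns
    b = nearest[:, missing]   # neighbors at the target's missing columns
    # Least squares: weights x such that a.T @ x approximates the observed part.
    x, *_ = np.linalg.lstsq(a.T, row[~missing], rcond=None)
    filled = row.copy()
    filled[missing] = b.T @ x  # the same weights predict the missing entries
    return filled

# Illustrative data only: the last row equals 2 * row 0 - row 1, one value hidden.
data = np.array([[1.0, 2.0, 3.0, 4.0, 5.0],
                 [2.0, 0.0, 1.0, 3.0, 1.0],
                 [0.0, 4.0, np.nan, 5.0, 9.0]])
filled = lls_impute_gene(data, target=2, k=2)
```

Because the hidden entry lies exactly in the span of the two neighbor rows, the weights come out as 2 and -1 and the estimate matches the true value; on real data the fit is only approximate.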
5.
Proc Natl Acad Sci U S A ; 101(47): 16577-82, 2004 Nov 23.
Article in English | MEDLINE | ID: mdl-15545604

ABSTRACT

We describe an integrative data-driven mathematical framework that formulates any number of genome-scale molecular biological data sets in terms of one chosen set of data samples, or of profiles extracted mathematically from data samples, designated the "basis" set. By using pseudoinverse projection, the molecular biological profiles of the data samples are least-squares-approximated as superpositions of the basis profiles. Reconstruction of the data in the basis simulates experimental observation of only the cellular states manifest in the data that correspond to those of the basis. Classification of the data samples according to their reconstruction in the basis, rather than their overall measured profiles, maps the cellular states of the data onto those of the basis and gives a global picture of the correlations and possibly also causal coordination of these two sets of states. We illustrate this framework with an integration of yeast genome-scale proteins' DNA-binding data with cell cycle mRNA expression time course data. A novel correlation between DNA replication initiation and RNA transcription during the yeast cell cycle, which might be due to a previously unknown mechanism of regulation, is predicted.


Subject(s)
DNA Replication/genetics , Genome, Fungal , Models, Genetic , RNA, Fungal/genetics , Saccharomyces cerevisiae/genetics , Cell Cycle , DNA, Fungal/biosynthesis , DNA, Fungal/genetics , Databases, Genetic , Least-Squares Analysis , Mathematics , Protein Binding , RNA, Messenger/genetics , Saccharomyces cerevisiae/cytology , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/metabolism , Transcription, Genetic
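
The pseudoinverse projection in the abstract above, a least-squares approximation of the data profiles as superpositions of the basis profiles, can be sketched as follows. The function name `project_onto_basis` is hypothetical, and both matrices are random stand-ins with genes as rows and samples as columns.

```python
import numpy as np

def project_onto_basis(data, basis):
    """Least-squares coefficients and reconstruction of `data` in `basis`."""
    # Columns of `basis` are basis profiles; columns of `data` are data samples.
    coeffs = np.linalg.pinv(basis) @ data
    reconstruction = basis @ coeffs
    return coeffs, reconstruction

# Illustrative data only: 20 "genes", 3 basis profiles, 4 data samples.
rng = np.random.default_rng(0)
basis = rng.standard_normal((20, 3))
data = rng.standard_normal((20, 4))
coeffs, reconstruction = project_onto_basis(data, basis)

# What the basis cannot express is orthogonal to every basis profile.
leftover = basis.T @ (data - reconstruction)
```

Each column of `coeffs` gives the weights of the basis profiles in one data sample, and `leftover` is zero up to floating-point error, which is the defining property of the least-squares fit.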
6.
IEEE Trans Med Imaging ; 22(11): 1427-35, 2003 Nov.
Article in English | MEDLINE | ID: mdl-14606676

ABSTRACT

Registration using the least-squares cost function is sensitive to the intensity fluctuations caused by the blood oxygen level dependent (BOLD) signal in functional MRI (fMRI) experiments, resulting in stimulus-correlated motion errors. These errors are severe enough to cause false-positive clusters in the activation maps of datasets acquired from 3T scanners. This paper presents a new approach to resolving the coupling between registration and activation. Instead of treating the two problems as individual steps in a sequence, they are combined into a single least-squares problem and are solved simultaneously. Robustness tests on a variety of simulated three-dimensional EPI datasets show that the stimulus-correlated motion errors are removed, resulting in a substantial decrease in false-positive and false-negative activation rates. The new method is also shown to decorrelate the motion estimates from the stimulus by testing it on different in vivo fMRI datasets acquired from two different 3T scanners.


Subject(s)
Algorithms , Brain Mapping/methods , Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Imaging, Three-Dimensional/methods , Magnetic Resonance Imaging/methods , Motion , Subtraction Technique , Artifacts , Brain/anatomy & histology , Brain/physiology , Head Movements , Humans , Oxygen/metabolism , Reproducibility of Results , Sensitivity and Specificity