Mining subspace clusters from DNA microarray data using large itemset techniques.

Chang, Ye-In; Chen, Jiun-Rung; Tsai, Yueh-Chi

Affiliation

Chang YI; Department of Computer Science and Engineering, National Sun Yat-Sen University, Taiwan, R.O.C. changyi@cse.nsysu.edu.tw

J Comput Biol ; 16(5): 745-68, 2009 May.

Article in English | MEDLINE | ID: mdl-19432542

ABSTRACT

Mining subspace clusters from DNA microarrays can help researchers identify genes that commonly contribute to a disease. Here, a subspace cluster is a subset of genes whose expression levels are similar under a subset of conditions. Because a DNA microarray contains far more genes than conditions, previously proposed algorithms that compute the maximum dimension sets (MDSs) for every pair of genes take a long time to mine subspace clusters. In this article, we propose the Large Itemset-Based Clustering (LISC) algorithm for mining subspace clusters. Instead of constructing MDSs for every pair of genes, we construct MDSs only for every pair of conditions. We then transform the task of finding the maximal possible gene sets into the problem of mining large itemsets from the condition-pair MDSs. Since we are interested only in subspace clusters whose gene sets are as large as possible, we focus on gene sets that have reasonably large support values in the condition-pair MDSs. Our simulation results show that the proposed algorithm requires less processing time than previously proposed algorithms that must construct gene-pair MDSs.
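
The two steps described in the abstract (building condition-pair MDSs, then mining large itemsets from them) can be illustrated with a small sketch. This is not the authors' LISC implementation: the abstract does not define how an MDS is computed, so the sketch assumes a sliding-window definition in which a condition-pair MDS is a maximal set of genes whose expression differences between the two conditions span at most a threshold `delta`. The function names, the `delta` and `min_genes` parameters, and the brute-force level-wise itemset search are all hypothetical choices made for illustration.

```python
from itertools import combinations


def condition_pair_mds(data, delta, min_genes=2):
    """Return gene sets (one list entry per MDS) for every pair of conditions.

    Assumption: an MDS for conditions (a, b) is a maximal set of genes whose
    differences row[a] - row[b] all lie within a window of width delta.
    """
    n_conditions = len(next(iter(data.values())))
    transactions = []
    for a, b in combinations(range(n_conditions), 2):
        # Sort genes by their expression difference between the two conditions.
        diffs = sorted((row[a] - row[b], gene) for gene, row in data.items())
        start = 0
        for end in range(len(diffs)):
            # Shrink the window from the left until it fits within delta.
            while diffs[end][0] - diffs[start][0] > delta:
                start += 1
            # The window is maximal if it cannot be extended to the right.
            is_last = end == len(diffs) - 1
            if is_last or diffs[end + 1][0] - diffs[start][0] > delta:
                if end - start + 1 >= min_genes:
                    genes = frozenset(g for _, g in diffs[start:end + 1])
                    transactions.append(genes)
    return transactions


def large_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search for gene sets that are contained
    in at least min_support of the condition-pair MDSs."""
    def support(itemset):
        return sum(1 for t in transactions if itemset <= t)

    items = sorted({g for t in transactions for g in t})
    current = [frozenset([g]) for g in items
               if support(frozenset([g])) >= min_support]
    result = {}
    while current:
        for itemset in current:
            result[itemset] = support(itemset)
        # Join itemsets of size k that differ in one item to get size k + 1.
        candidates = {x | y for x in current for y in current
                      if len(x | y) == len(x) + 1}
        current = [c for c in candidates if support(c) >= min_support]
    return result


# Toy data (made up): genes g1..g3 are shifted copies of one another.
data = {
    "g1": [1, 2, 4],
    "g2": [2, 3, 5],
    "g3": [5, 6, 8],
    "g4": [9, 1, 3],
}
transactions = condition_pair_mds(data, delta=0)
itemsets = large_itemsets(transactions, min_support=3)
largest = max(itemsets, key=len)
```

In this toy input each condition-pair MDS acts as one transaction and each gene as one item, so a gene set with high support is one that stays coherent across many condition pairs. The number of transactions grows with the number of condition pairs rather than gene pairs, which is the source of the speed-up the abstract reports.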

Subject(s)

Algorithms; Cluster Analysis; Gene Expression Profiling/methods; Oligonucleotide Array Sequence Analysis/methods; Computational Biology/methods; Databases, Genetic

Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Algorithms / Cluster Analysis / Oligonucleotide Array Sequence Analysis / Gene Expression Profiling Study type: Prognostic_studies Language: En Journal: J Comput Biol Journal subject: MOLECULAR BIOLOGY / MEDICAL INFORMATICS Year: 2009 Document type: Article Country of publication: United States
