Mining ORESTES no-match database: can we still contribute to cancer transcriptome?
Fonseca, R. da S; Carraro, D. M; Brentani, H.
Affiliation
  • Fonseca, R. da S; Hospital do Câncer. Laboratório de Bioinformática. São Paulo. BR
  • Carraro, D. M; Instituto Ludwig para Pesquisa sobre o Câncer. Laboratório de Análise de Expressão Gênica. São Paulo. BR
  • Brentani, H; Hospital do Câncer. Laboratório de Bioinformática. São Paulo. BR
Genet. mol. res. (Online); 5(1): 24-32, Mar. 31, 2006.
Article in En | LILACS | ID: lil-449149
Responsible library: BR1.1
ABSTRACT
The Human Cancer Genome Project generated about 1 million expressed sequence tags by the ORESTES method, principally with the aim of obtaining data from cancer. Of this total, 341,680 showed no similarity with sequences in the public transcript databases and are referred to as "no-match". Some of them represent low-abundance or difficult-to-detect human transcripts, but part of these sequences represent genomic contamination or immature mRNA. We ran a bioinformatics pipeline to determine the novelty of ORESTES "no-match" datasets from prostate or breast tissues. We started with 14,908 clusters mapped on the human genome. A total of 2226 clusters, either originating from more than two libraries or singletons with gaps upon genome alignment, were selected. Ninety-four clusters with canonical splice sites, the most stringent criterion for being considered a gene, were subjected to manual inspection of their genomic hits. Of the manually inspected clusters, 49.6% contained new sequences: 42.2% were probable low-expression alternative forms of characterized genes and 7.4% were unpredicted genes. RT-PCR followed by sequencing was performed to validate the largest spliced sequence from 8 clusters, confirming five sequences as true human transcript fragments. Some of them were differentially expressed between tumor and normal tissue in an in silico analysis. We conclude that, after cleanup of the no-match dataset, about 939 new exons and 165 unpredicted genes remain that could complete the prostate or breast transcriptome.
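The abstract does not state how the final estimates were obtained. One reading that reproduces both figures is that the percentages observed in the manually inspected clusters were applied to the 2226 selected clusters. The short sketch below checks that arithmetic; the variable and function names are illustrative, and the extrapolation rule itself is an assumption rather than a method described by the authors.

```python
# Sketch under an assumption: the final estimates come from applying the
# fractions seen in the manually inspected clusters to the 2226 selected
# clusters. The abstract does not confirm this rule.

SELECTED_CLUSTERS = 2226          # clusters retained after filtering (from the abstract)
FRACTION_NEW = 0.496              # inspected clusters containing new sequences
FRACTION_ALT_FORMS = 0.422        # probable low-expression alternative forms
FRACTION_UNPREDICTED = 0.074      # unpredicted genes


def extrapolate(total_clusters: int, fraction: float) -> int:
    """Scale a fraction observed in the inspected sample to all selected clusters."""
    return round(total_clusters * fraction)


# The two sub-categories should add up to the reported share of new sequences.
fractions_consistent = abs(
    (FRACTION_ALT_FORMS + FRACTION_UNPREDICTED) - FRACTION_NEW
) < 1e-9

new_exons = extrapolate(SELECTED_CLUSTERS, FRACTION_ALT_FORMS)
unpredicted_genes = extrapolate(SELECTED_CLUSTERS, FRACTION_UNPREDICTED)

print(f"fractions consistent: {fractions_consistent}")
print(f"estimated new exons: {new_exons}")
print(f"estimated unpredicted genes: {unpredicted_genes}")
```

Under this assumption the rounded products match the 939 new exons and 165 unpredicted genes quoted in the conclusion, and the two sub-percentages sum to the reported 49.6%.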
Subject(s)
Full text: 1 Index: LILACS Main subject: Prostatic Neoplasms / Transcription, Genetic / Breast Neoplasms / Open Reading Frames / Expressed Sequence Tags Type of study: Prognostic_studies Limits: Female / Humans / Male Language: En Journal: Genet. mol. res. (Online) Journal subject: MOLECULAR BIOLOGY / GENETICS Year: 2006 Document type: Article