ABSTRACT
Whereas genome sequencing defines the genetic potential of an organism, transcript sequencing defines the utilization of this potential and links the genome with most areas of biology. To exploit the information within the human genome in the fight against cancer, we have deposited some two million expressed sequence tags (ESTs) from human tumors and their corresponding normal tissues in the public databases. The data currently define approximately 23,500 genes, of which approximately 1,250 are still represented only by ESTs. Examination of the EST coverage of known cancer-related (CR) genes reveals that <1% do not have corresponding ESTs, indicating that the representation of genes associated with commonly studied tumors is high. The careful recording of the origin of all ESTs we have produced has enabled detailed definition of where the genes they represent are expressed in the human body. More than 100,000 ESTs are available for each of seven tissues, indicating a surprising variability of gene usage that has led to the discovery of a significant number of genes with restricted expression, which may thus be therapeutically useful. The ESTs also reveal novel nonsynonymous germline variants (although the one-pass nature of the data necessitates careful validation) and many alternatively spliced transcripts. Although widely exploited by the scientific community, vindicating our totally open source policy, the EST data generated still provide extensive information that remains to be systematically explored, and that may further facilitate progress toward both the understanding and treatment of human cancers.
Subject(s)
Expressed Sequence Tags , Gene Expression Regulation, Neoplastic , Neoplasms/genetics , Proteome , RNA, Messenger/metabolism , Chromosome Mapping , Databases, Genetic , Genetic Variation , Humans , Neoplasms/metabolism , Polymorphism, Single Nucleotide , Tissue Distribution
ABSTRACT
We applied a systematic bioinformatics approach, followed by careful manual inspection and experimental validation, to identify additional expressed sequences located in the Hereditary Prostate Cancer Region (HPC1) between D1S2818 and D1S1642 on chromosome 1q25. All transcripts already described for the 1q25 region were identified, and we were able to define 11 additional expressed sequences within this region (three full-length cDNA clone sequences and eight ESTs), increasing the total gene count in this region by 38%. Five of the 11 expressed sequences identified were shown to be expressed in prostate tissue and thus represent novel disease gene candidates for the HPC1 region. Here, we report a detailed characterization of these five novel disease gene candidates, their expression patterns in various tissues, their genomic organization, and their functional annotation. Two candidates (RGSL1 and RGSL2) correspond to novel members of the RGS family, which is involved in the regulation of G-protein signaling. RGSL1 and RGSL2 expression was detected by real-time polymerase chain reaction in normal prostate tissue, but could not be detected in prostate tumor cell lines, suggesting they might have a role in prostate cancer.