2.
Genome Res ; 14(3): 463-71, 2004 Mar.
Article in English | MEDLINE | ID: mdl-14962985

ABSTRACT

A collection of 90,000 human cDNA clones generated to increase the fraction of "full-length" cDNAs available was analyzed by sequence alignment on the human genome assembly. Five hundred fifty-two gene models not found in LocusLink, with coding regions of at least 300 bp, were defined by using this collection. Exon composition proposed for novel genes showed an average of 4.7 exons per gene. In 20% of the cases, at least half of the exons predicted for new genes coincided with evolutionarily conserved regions defined by sequence comparisons with the pufferfish Tetraodon nigroviridis. Among this subset, CpG islands were observed at the 5' end of 75%. In-frame stop codons upstream of the initiator ATG were present in 49% of the new genes, and 16% contained a coding region comprising at least 50% of the cDNA sequence. This cDNA resource also provided candidate small protein-coding genes, usually not included in genome annotations. In addition, analysis of a sample from this cDNA collection indicates that approximately 380 gene models described in LocusLink could be extended at their 5' end by at least one new exon. Finally, this cDNA resource provided experimental support for annotations based exclusively on predictions, thus representing a resource substantially improving the human genome annotation.


Subject(s)
5' Untranslated Regions/genetics; DNA, Complementary/genetics; Genome, Human; Adult; Amino Acid Sequence/genetics; Animals; Cell Line, Tumor; DNA, Complementary/classification; DNA, Neoplasm/classification; DNA, Neoplasm/genetics; HeLa Cells/chemistry; HeLa Cells/metabolism; Humans; Jurkat Cells/chemistry; Jurkat Cells/metabolism; Mice; Models, Genetic; Molecular Sequence Data; Open Reading Frames/genetics; Organ Specificity/genetics; Proteins/chemistry; Proteins/genetics; Sequence Alignment/classification; Sequence Alignment/methods; Sequence Homology, Nucleic Acid; Tetraodontiformes/genetics
3.
Nature ; 421(6923): 601-7, 2003 Feb 06.
Article in English | MEDLINE | ID: mdl-12508121

ABSTRACT

Chromosome 14 is one of five acrocentric chromosomes in the human genome. These chromosomes are characterized by a heterochromatic short arm that contains essentially ribosomal RNA genes, and a euchromatic long arm in which most, if not all, of the protein-coding genes are located. The finished sequence of human chromosome 14 comprises 87,410,661 base pairs, representing 100% of its euchromatic portion, in a single continuous segment covering the entire long arm with no gaps. Two loci of crucial importance for the immune system, as well as more than 60 disease genes, have been localized so far on chromosome 14. We identified 1,050 genes and gene fragments, and 393 pseudogenes. On the basis of comparisons with other vertebrate genomes, we estimate that more than 96% of the chromosome 14 genes have been annotated. From an analysis of the CpG island occurrences, we estimate that 70% of these annotated genes are complete at their 5' end.


Subject(s)
Chromosomes, Human, Pair 14/genetics; Physical Chromosome Mapping; Sequence Analysis, DNA; 5' Untranslated Regions/genetics; Animals; Base Composition; Chromosomes, Artificial/genetics; CpG Islands/genetics; DNA, Mitochondrial/genetics; DNA, Ribosomal/genetics; Genes/genetics; Genomics; Humans; Immunity/genetics; Mice; Microsatellite Repeats/genetics; Molecular Sequence Data; Open Reading Frames/genetics; Pseudogenes/genetics; Reproducibility of Results; Synteny/genetics
...