1.
Genome Res ; 16(1): 55-65, 2006 Jan.
Article in English | MEDLINE | ID: mdl-16344560

ABSTRACT

By analyzing 1,780,295 5'-end sequences of human full-length cDNAs derived from 164 kinds of oligo-cap cDNA libraries, we identified 269,774 independent positions of transcriptional start sites (TSSs) for 14,628 human RefSeq genes. These TSSs were clustered into 30,964 clusters that were separated from each other by more than 500 bp and thus are very likely to constitute mutually distinct alternative promoters. To our surprise, at least 7,674 (52%) human RefSeq genes were subject to regulation by putative alternative promoters (PAPs). On average, there were 3.1 PAPs per gene, with the composition of one CpG-island-containing promoter per 2.6 CpG-less promoters. In 17% of the PAP-containing loci, tissue-specific use of the PAPs was observed. The richest tissue sources of the tissue-specific PAPs were testis and brain. It was also intriguing that the PAP-containing promoters were enriched in the genes encoding signal transduction-related proteins and were rarer in the genes encoding extracellular proteins, possibly reflecting the varied functional requirement for and the restricted expression of those categories of genes, respectively. The patterns of the first exons were highly diverse as well. On average, there were 7.7 different splicing types of first exons per locus partly produced by the PAPs, suggesting that a wide variety of transcripts can be achieved by this mechanism. Our findings suggest that use of alternate promoters and consequent alternative use of first exons should play a pivotal role in generating the complexity required for the highly elaborated molecular systems in humans.
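
The clustering criterion in the abstract above (clusters separated from each other by more than 500 bp) can be illustrated with a minimal sketch. The abstract does not describe the authors' actual procedure, so this assumes a simple gap-based grouping of sorted coordinates at a single locus; the function name `cluster_tss`, the parameter `min_gap`, and the example coordinates are hypothetical.

```python
from typing import Iterable, List


def cluster_tss(positions: Iterable[int], min_gap: int = 500) -> List[List[int]]:
    """Group TSS coordinates so that adjacent clusters are more than min_gap bp apart.

    Assumption: all positions belong to one locus on one strand.
    """
    clusters: List[List[int]] = []
    for pos in sorted(set(positions)):
        # Extend the current cluster while the gap to its last member is small.
        if clusters and pos - clusters[-1][-1] <= min_gap:
            clusters[-1].append(pos)
        else:
            # A gap larger than min_gap starts a new putative promoter cluster.
            clusters.append([pos])
    return clusters


# Hypothetical coordinates for a single locus (not data from the study).
example = [100, 130, 620, 1500, 1510, 2100]
print(cluster_tss(example))
```

With this grouping, a chain of closely spaced start sites stays in one cluster even when its total span exceeds 500 bp; only the gap between neighbouring sites decides where a new cluster begins.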


Subject(s)
CpG Islands/genetics , Gene Library , Multigene Family/genetics , Promoter Regions, Genetic/genetics , Quantitative Trait Loci/genetics , Transcription, Genetic/genetics , Base Sequence , Exons/genetics , Humans , Molecular Sequence Data , Organ Specificity , Signal Transduction/genetics
2.
DNA Res ; 12(2): 117-26, 2005.
Article in English | MEDLINE | ID: mdl-16303743

ABSTRACT

We have developed an in silico method of selection of human full-length cDNAs encoding secretion or membrane proteins from oligo-capped cDNA libraries. Fullness rates were increased to about 80% by combination of the oligo-capping method and ATGpr, software for prediction of translation start point and the coding potential. Then, using 5'-end single-pass sequences, cDNAs having the signal sequence were selected by PSORT ('signal sequence trap'). We also applied 'secretion or membrane protein-related keyword trap' based on the result of BLAST search against the SWISS-PROT database for the cDNAs which could not be selected by PSORT. Using the above procedures, 789 cDNAs were primarily selected and subjected to full-length sequencing, and 334 of these cDNAs were finally selected as novel. Most of the cDNAs (295 cDNAs: 88.3%) were predicted to encode secretion or membrane proteins. In particular, 165 (80.5%) of the 205 cDNAs selected by PSORT were predicted to have signal sequences, while 70 (54.2%) of the 129 cDNAs selected by 'keyword trap' preserved the secretion or membrane protein-related keywords. Many important cDNAs were obtained, including transporters, receptors, and ligands, involved in significant cellular functions. Thus, an efficient method of selecting secretion or membrane protein-encoding cDNAs was developed by combining the above four procedures.


Subject(s)
Gene Library , Membrane Proteins/genetics , Protein Sorting Signals , 5' Flanking Region , Cell Line, Tumor , Cloning, Molecular , Humans , Oligonucleotides/genetics
3.
Nat Genet ; 36(1): 40-5, 2004 Jan.
Article in English | MEDLINE | ID: mdl-14702039

ABSTRACT

As a base for human transcriptome and functional genomics, we created the "full-length long Japan" (FLJ) collection of sequenced human cDNAs. We determined the entire sequence of 21,243 selected clones and found that 14,490 cDNAs (10,897 clusters) were unique to the FLJ collection. About half of them (5,416) seemed to be protein-coding. Of those, 1,999 clusters had not been predicted by computational methods. The distribution of GC content of nonpredicted cDNAs had a peak at approximately 58% compared with a peak at approximately 42% for predicted cDNAs. Thus, there seems to be a slight bias against GC-rich transcripts in current gene prediction procedures. The rest of the cDNAs unique to the FLJ collection (5,481) contained no obvious open reading frames (ORFs) and thus are candidate noncoding RNAs. About one-fourth of them (1,378) showed a clear pattern of splicing. The distribution of GC content of noncoding cDNAs was narrow and had a peak at approximately 42%, relatively low compared with that of protein-coding cDNAs.
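
The GC-content comparison above rests on a simple per-sequence statistic. A minimal sketch of how it could be computed follows; it is not the authors' code, and it assumes that ambiguous bases such as N are excluded from the denominator, which the abstract does not state. The function name `gc_content` and the example sequence are hypothetical.

```python
def gc_content(seq: str) -> float:
    """Return the GC percentage of a nucleotide sequence.

    Assumption: only unambiguous bases (A, C, G, T) are counted;
    anything else, such as N, is ignored.
    """
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no unambiguous bases")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / counted


# Hypothetical sequence, not taken from the FLJ collection.
print(gc_content("ATGCGCNNTA"))
```

Applying such a function to each cDNA and plotting a histogram of the values would give the kind of distribution whose peaks the abstract reports.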


Subject(s)
DNA, Complementary , Sequence Analysis, DNA , Chromosomes, Human, 21-22 and Y , Chromosomes, Human, Pair 20 , Computational Biology , Humans , Open Reading Frames , RNA, Messenger
4.
FEBS Lett ; 517(1-3): 121-8, 2002 Apr 24.
Article in English | MEDLINE | ID: mdl-12062421

ABSTRACT

Gene expression of synoviocytes stimulated with tumor necrosis factor-alpha (TNFalpha) was studied by macroarray analysis to elucidate the cellular response and identify new biological functions of known and unknown genes. A total of 10,035 cDNA clones were used to make cDNA macroarrays of representative genes. Synoviocytes expressed large amounts of fibronectin and collagen mRNA. Statistical analysis of the macroarray data revealed 26 genes, including six new genes, which underwent significant alteration of gene expression in response to TNFalpha stimulation. These findings suggest that the synoviocyte response to TNFalpha stimulation forms the basis of development of various aspects of the pathophysiology of rheumatoid arthritis.


Subject(s)
Collagen/biosynthesis , Fibronectins/biosynthesis , Gene Expression/drug effects , Synovial Membrane/drug effects , Tumor Necrosis Factor-alpha/pharmacology , Amino Acid Sequence , Arthritis, Rheumatoid/metabolism , Cells, Cultured , Collagen/genetics , Fibronectins/genetics , Humans , Molecular Sequence Data , Oligonucleotide Array Sequence Analysis , RNA, Messenger/biosynthesis , RNA, Messenger/drug effects , Sequence Homology, Amino Acid , Synovial Membrane/pathology , Synovial Membrane/physiopathology
5.
In Silico Biol ; 2(1): 5-18, 2002.
Article in English | MEDLINE | ID: mdl-11808872

ABSTRACT

We have developed an efficient sequence-analysis system and a database system for clones obtained from full-length enriched cDNA libraries made by using the oligo-capping method. We developed a semi-automatic analysis system for 5'- and 3'-end sequences. It pre-processes raw sequences (vector cut and accurate-sequence region extraction), clusters the sequences, searches for similarities through public databases, annotates completeness of clones and analyzes the ORFs in the sequences. Newly developed or improved programs are used in each step. A new program, ESTiMateFull, is used to evaluate and to predict the sequence-fullness based on comparisons with mRNA and EST sequences, respectively. The ATGpr program is used to predict sequence-fullness based on statistical information. The combination of full-length enriched cDNA clones and ATGpr fullness prediction resulted in 70% accuracy in the specificity and the sensitivity of the fullness predictions. For the ORFs predicted by ATGpr, the signal peptides are predicted and a motif search is performed by our new system. We also developed a program that assembles our sequences with dbEST sequences and developed a system to retrieve clones by the characteristics of the ORFs. Combinations of the various analysis results can be used as keywords for retrieval, and results such as ORF features and database search results can be shown on the same screen by multiple displays. Full-length clones having interesting functions can thus be retrieved efficiently by using this system.


Subject(s)
DNA, Complementary , Databases, Nucleic Acid , Sequence Analysis, DNA/methods , Software , Amino Acid Sequence , Base Sequence , Cloning, Molecular , Expressed Sequence Tags , Gene Library , Image Processing, Computer-Assisted/methods , Molecular Sequence Data