Results 1 - 6 of 6
1.
PLoS One ; 2(11): e1246, 2007 Nov 28.
Article in English | MEDLINE | ID: mdl-18043753

ABSTRACT

An important level at which the expression of programmed cell death (PCD) genes is regulated is alternative splicing. Our previous work identified an intronic splicing regulatory element in the caspase-2 (casp-2) gene. This 100-nucleotide intronic element, In100, consists of an upstream region containing a decoy 3' splice site and a downstream region containing binding sites for the splicing repressor PTB. Based on the signal of the In100 element in casp-2, we have detected In100-like sequences as a family of sequence elements associated with alternative splicing in the human genome by using computational and experimental approaches. A survey of the human genome reveals the presence of more than four thousand In100-like elements in 2757 genes. These In100-like elements tend to be located more frequently in intronic regions than in exonic regions. EST analyses indicate that the presence of In100-like elements correlates with the skipping of their immediate upstream exons, with 526 genes showing exon skipping in such a manner. In addition, In100-like elements are found in several human caspase genes near exons encoding the caspase active domain. RT-PCR experiments show that these caspase genes indeed undergo alternative splicing in a pattern predicted to affect their functional activity. Together, these results suggest that the In100-like elements represent a family of intronic signals for alternative splicing in the human genome.


Subject(s)
Alternative Splicing , Genome, Human , Introns , Caspase 2/genetics , Expressed Sequence Tags , Humans , Regulatory Sequences, Nucleic Acid
2.
Bioinformatics ; 22(1): 13-20, 2006 Jan 01.
Article in English | MEDLINE | ID: mdl-16267086

ABSTRACT

MOTIVATION: mRNA sequences and expressed sequence tags represent some of the most abundant experimental data for identifying genes and alternatively spliced products in metazoans. These transcript sequences are frequently studied by aligning them to a genomic sequence template. For existing programs, error-prone, polymorphic and cross-species data, as well as non-canonical splice sites, still present significant barriers to producing accurate, complete alignments. RESULTS: We took a novel approach to spliced alignment that meaningfully combined information from sequence similarity with that obtained from PSSM splice site models. Scoring systems were chosen to maximize their power of discrimination, and dynamic programming (DP) was employed to guarantee optimal solutions would be found. The resultant program, EXALIN, performed better than other popular tools tested under a wide range of conditions that included detection of micro-exons and human-mouse cross-species comparisons. For improved speed with only a marginal decrease in splice site prediction accuracy, EXALIN could perform limited DP guided by a result from BLASTN. AVAILABILITY: The source code, binaries, scripts, scoring matrices and splice site models for human, mouse, rice and Caenorhabditis elegans utilized in this study are posted at http://blast.wustl.edu/exalin. The software (scripts, source code and binaries) is copyrighted but free for all to use.


Subject(s)
Computational Biology/methods , RNA Splicing , Algorithms , Alternative Splicing , Animals , Caenorhabditis elegans , Exons , Expressed Sequence Tags , Humans , Mice , Models, Statistical , Oryza/genetics , Polymorphism, Genetic , Programming Languages , RNA, Messenger/metabolism , Software , Species Specificity
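
Note on entry 2: the abstract refers to PSSM (position-specific scoring matrix) splice site models without describing them. The sketch below shows, in general terms, how a log-odds PSSM can be built from aligned sites and used to score candidate positions. It is not EXALIN's scoring system: the toy donor sites, the uniform background frequency, the pseudocount, and all function names are assumptions chosen for illustration.

```python
import math

BASES = "ACGT"

# Hypothetical aligned donor-site examples (illustrative only, not real training data).
DONOR_SITES = ["GTAAGT", "GTGAGT", "GTAAGA", "GTAAGG"]


def build_pssm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PSSM (one dict per column) from equal-length aligned sites."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    pssm = []
    for pos in range(length):
        # Start every base at the pseudocount so unseen bases get a finite score.
        counts = {b: pseudocount for b in BASES}
        for s in sites:
            counts[s[pos]] += 1
        total = sum(counts.values())
        # Log-odds of the observed frequency against an assumed uniform background.
        pssm.append({b: math.log2((counts[b] / total) / background) for b in BASES})
    return pssm


def score_site(pssm, window):
    """Sum the per-column log-odds scores for a window as wide as the PSSM."""
    if len(window) != len(pssm):
        raise ValueError("window length must match PSSM width")
    return sum(col[base] for col, base in zip(pssm, window))


def best_site(pssm, sequence):
    """Return (score, start) of the highest-scoring window in the sequence."""
    width = len(pssm)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the PSSM")
    return max(
        (score_site(pssm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )
```

For example, `best_site(build_pssm(DONOR_SITES), "CCCGTAAGTCCC")` picks the window starting at the embedded `GTAAGT`. In a spliced aligner, scores of this kind would be combined with sequence-similarity scores inside the dynamic programming recurrence; that combination is not shown here.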
3.
Genome Res ; 14(10B): 2041-7, 2004 Oct.
Article in English | MEDLINE | ID: mdl-15489324

ABSTRACT

Transcription factors (TFs) are essential regulators of gene expression, and mutated TF genes have been shown to cause numerous human genetic diseases. Yet to date, no single, comprehensive database of human TFs exists. In this work, we describe the collection of an essentially complete set of TF genes from one depiction of the human ORFeome, and the design of a microarray to interrogate their expression. Taking 1468 known TFs from TRANSFAC, InterPro, and FlyBase, we used this seed set to search the ScriptSure human transcriptome database for additional genes. ScriptSure's genome-anchored transcript clusters allowed us to work with a nonredundant high-quality representation of the human transcriptome. We used a high-stringency similarity search by using BLASTN, and a protein motif search of the human ORFeome by using hidden Markov models of DNA-binding domains known to occur exclusively or primarily in TFs. Four hundred ninety-four additional TF genes were identified in the overlap between the two searches, bringing our estimate of the total number of human TFs to 1962. Zinc finger genes are by far the most abundant family (762 members), followed by homeobox (199 members) and basic helix-loop-helix genes (117 members). We designed a microarray of 50-mer oligonucleotide probes targeted to a unique region of the coding sequence of each gene. We have successfully used this microarray to interrogate TF gene expression in species as diverse as chickens and mice, as well as in humans.


Subject(s)
Gene Expression Profiling , Genome, Human , Oligonucleotide Array Sequence Analysis , Open Reading Frames/genetics , Transcription Factors/chemistry , Transcription Factors/genetics , Humans , Markov Chains , Transcription Factors/metabolism
4.
Nucleic Acids Res ; 31(13): 3795-8, 2003 Jul 01.
Article in English | MEDLINE | ID: mdl-12824421

ABSTRACT

Since 1995, the WU-BLAST programs (http://blast.wustl.edu) have provided a fast, flexible and reliable method for similarity searching of biological sequence databases. The software is in use at many locales and web sites. The European Bioinformatics Institute's WU-Blast2 (http://www.ebi.ac.uk/blast2/) server has been providing free access to these search services since 1997 and today supports many features that both enhance the usability and expand on the scope of the software.


Subject(s)
Sequence Alignment , Software , Computational Biology , Databases, Nucleic Acid , Databases, Protein , Europe , Internet , User-Computer Interface
5.
Genome Res ; 12(12): 1837-45, 2002 Dec.
Article in English | MEDLINE | ID: mdl-12466287

ABSTRACT

The expressed sequence tag (EST) collection in dbEST provides an extensive resource for detecting alternative splicing on a genomic scale. Using genomically aligned ESTs, a computational tool (TAP) was used to identify alternative splice patterns for 6400 known human genes from the RefSeq database. With sufficient EST coverage, one or more alternatively spliced forms could be detected for nearly all genes examined. To identify high (>95%) confidence observations of alternative splicing, splice variants were clustered on the basis of having mutually exclusive structures, and sample statistics were then applied. Through this selection, alternative splices expected at a frequency of >5% within their respective clusters were seen for only 17%-28% of genes. Although intron retention events (potentially unspliced messages) had been seen for 36% of the genes overall, the same statistical selection yielded reliable cases of intron retention for <5% of genes. For high-confidence alternative splices in the human ESTs, we also noted significantly higher rates both of cross-species conservation in mouse ESTs and of validation in the GenBank mRNA collection. We suggest quantitative analytical approaches such as these can aid in selecting useful targets for further experimental characterization and in so doing may help elucidate the mechanisms and biological implications of alternative splicing.


Subject(s)
Alternative Splicing/genetics , Expressed Sequence Tags , Alternative Splicing/physiology , Animals , Computational Biology/methods , Computational Biology/statistics & numerical data , Conserved Sequence/genetics , Databases, Genetic , Gene Frequency/genetics , Genetic Variation/genetics , Genome , Genome, Human , Humans , Mice , RNA, Messenger/genetics , Sequence Alignment/methods
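
Note on entry 5: the abstract describes selecting alternative splices that are expected at a frequency of >5% within their cluster with high (>95%) confidence, but does not state which sample statistics were used. One plausible way to frame such a selection is a one-sided exact binomial test on EST counts, sketched below. The test choice, the thresholds as parameters, and the function names are assumptions, not the method of the paper.

```python
from math import comb


def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def is_high_confidence(k, n, min_freq=0.05, alpha=0.05):
    """Decide whether k supporting ESTs out of n in a cluster indicate a variant
    frequency above min_freq at confidence 1 - alpha.

    Assumed criterion: reject the hypothesis "true frequency <= min_freq" when
    seeing k or more supporting ESTs would have probability below alpha.
    """
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    return binomial_tail(k, n, min_freq) < alpha
```

Under these assumptions, 5 supporting ESTs out of 20 pass the filter, while a single supporting EST out of 20 does not, which illustrates why many variants seen only once or twice would drop out of a high-confidence set.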
6.
Nucleic Acids Res ; 30(22): 5004-14, 2002 Nov 15.
Article in English | MEDLINE | ID: mdl-12434005

ABSTRACT

A fundamental problem in the human genome project is uncovering the correct assembly of the human genome. Many studies, including transcriptional analysis, SNP detection and characterization, gene finding and EST clustering, use genome assemblies as templates so it is important to determine the consistency among the various whole genome assemblies. A comparison of the order and orientation of the GenBank entries used to construct the NCBI and UCSC Goldenpath assemblies was made. In addition, a sequence level comparison was performed using MULTI, an efficient database search tool developed to make whole genome comparisons possible. The resulting comparisons show significant discrepancies in the sequence as well as in the order and orientation of GenBank entries used in constructing the NCBI and UCSC assemblies.


Subject(s)
Databases, Nucleic Acid , Genome, Human , Genomics , Chromosomes, Human , Humans , Sequence Analysis, DNA , Sequence Homology, Nucleic Acid