Results 1 - 4 of 4
1.
BMC Evol Biol ; 7 Suppl 1: S15, 2007 Feb 08.
Article in English | MEDLINE | ID: mdl-17288573

ABSTRACT

BACKGROUND: Transcription factors regulate gene expression by interacting with their specific DNA binding sites. Some transcription factors, particularly those involved in transcription initiation, always bind close to transcription start sites (TSS). Others have no such preference and are functional on sites even tens of thousands of base pairs (bp) away from the TSS. The Cyclic-AMP response element (CRE) binding protein (CREB) binds preferentially to a palindromic sequence (TGACGTCA), known as the canonical CRE, and also to other CRE variants. CREB can activate transcription at CREs thousands of bp away from the TSS, but in mammals CREs are found far more frequently within 1 to 150 bp upstream of the TSS than in any other region. This property is termed positional bias. The strength of CREB binding to DNA is dependent on the sequence of the CRE motif. The central CpG dinucleotide in the canonical CRE (TGACGTCA) is critical for strong binding of CREB dimers. Methylation of the cytosine in the CpG can inhibit binding of CREB. Deamination of the methylated cytosines causes a C to T transition, resulting in a functional, but lower affinity CRE variant, TGATGTCA.

RESULTS: We performed genome-wide surveys of CREs in a number of species (from worm to human) and showed that only vertebrates exhibited a CRE positional bias. We performed pair-wise comparisons of human CREs with orthologous sequences in mouse, rat and dog genomes and found that canonical and TGATGTCA variant CREs are highly conserved in mammals. However, when orthologous sequences differ, canonical CREs in human are most frequently TGATGTCA in the other species and vice-versa. We have identified 207 human CREs showing such differences.

CONCLUSION: Our data suggest that the positional bias of CREs likely evolved after the separation of urochordata and vertebrata. Although many canonical CREs are conserved among mammals, there are a number of orthologous genes that have canonical CREs in one species but the TGATGTCA variant in another. These differences are likely due to deamination of the methylated cytosines in the CpG and may contribute to differential transcriptional regulation among orthologous genes.
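
The motif definitions in this abstract lend themselves to a small illustration. The following is a minimal sketch, not the authors' pipeline: the two motif strings and the 150 bp window come from the abstract, while the function names, the input convention (the sequence ends at the base immediately upstream of the TSS), and the example sequence are assumptions made for illustration. Because the canonical CRE is palindromic, it reads the same on both strands; the variant is not, so its reverse complement (TGACATCA) is searched as well.

```python
# Illustrative sketch only. Motif strings are taken from the abstract;
# everything else (names, input convention, window handling) is assumed.
CANONICAL_CRE = "TGACGTCA"   # palindromic: equal to its own reverse complement
VARIANT_CRE = "TGATGTCA"     # C-to-T transition at the central CpG

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_cres(upstream, window=150):
    """Find CRE motifs in the last `window` bp of an upstream sequence.

    `upstream` is assumed to be written 5'->3' and to end at the base
    immediately before the TSS. Returns a list of (distance, kind) tuples,
    where `distance` is how many bp upstream of the TSS the motif starts.
    """
    if window <= 0:
        return []
    region = upstream.upper()[-window:]
    motifs = {
        CANONICAL_CRE: "canonical",
        VARIANT_CRE: "variant",
        reverse_complement(VARIANT_CRE): "variant",  # variant on the other strand
    }
    hits = []
    for i in range(len(region) - len(CANONICAL_CRE) + 1):
        kind = motifs.get(region[i:i + len(CANONICAL_CRE)])
        if kind is not None:
            hits.append((len(region) - i, kind))
    return hits


# Hypothetical upstream sequence with one canonical CRE and one variant
# present on the opposite strand.
example = "A" * 20 + "TGACGTCA" + "C" * 10 + "TGACATCA" + "G" * 5
print(find_cres(example))
```

Tallying such hits by distance across many promoters would give a simple view of the positional bias described above; a genome-wide survey would additionally need annotated TSS coordinates, which this sketch does not cover.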


Subject(s)
Cyclic AMP Response Element-Binding Protein/metabolism , Evolution, Molecular , Genetic Variation , Response Elements , Animals , Base Sequence , Chromosome Mapping , Consensus Sequence , CpG Islands , DNA Methylation , Genome , Humans , Mammals , Sequence Analysis, DNA
2.
J Bioinform Comput Biol ; 2(4): 639-55, 2004 Dec.
Article in English | MEDLINE | ID: mdl-15617158

ABSTRACT

Various data mining techniques combined with sequence motif information in the promoter region of genes were applied to discover functional genes that are involved in the defense mechanism of systemic acquired resistance (SAR) in Arabidopsis thaliana. A series of K-Means clustering with difference-in-shape as distance measure was initially applied. A stability measure was used to validate this clustering process. A decision tree algorithm with the discover-and-mask technique was used to identify a group of most informative genes. Appearance and abundance of various transcription factor binding sites in the promoter region of the genes were studied. Through the combination of these techniques, we were able to identify 24 candidate genes involved in the SAR defense mechanism. The candidate genes fell into 2 highly resolved categories, each category showing significantly unique profiles of regulatory elements in their promoter regions. This study demonstrates the strength of such integration methods and suggests a broader application of this approach.


Subject(s)
Algorithms , Arabidopsis Proteins/metabolism , Arabidopsis/metabolism , Gene Expression Profiling/methods , Gene Expression Regulation, Plant/physiology , Oligonucleotide Array Sequence Analysis/methods , Salicylic Acid/toxicity , Arabidopsis/drug effects , Arabidopsis/genetics , Arabidopsis Proteins/genetics , Database Management Systems , Databases, Protein , Drug Resistance/physiology , Gene Expression Regulation, Plant/drug effects , Information Storage and Retrieval/methods , Sequence Analysis, DNA/methods
3.
Artif Intell Med ; 31(2): 137-54, 2004 Jun.
Article in English | MEDLINE | ID: mdl-15219291

ABSTRACT

Genome-wide transcription profiling is a powerful technique for studying the enormous complexity of cellular states. Moreover, when applied to disease tissue it may reveal quantitative and qualitative alterations in gene expression that give information on the context or underlying basis for the disease and may provide a new diagnostic approach. However, the data obtained from high-density microarrays is highly complex and poses considerable challenges in data mining. The data requires care in both pre-processing and the application of data mining techniques. This paper addresses the problem of dealing with microarray data that come from two known classes (Alzheimer and normal). We have applied three separate techniques to discover genes associated with Alzheimer disease (AD). The 67 genes identified in this study included a total of 17 genes that are already known to be associated with Alzheimer's or other neurological diseases. This is higher than any of the previously published Alzheimer's studies. Twenty known genes, not previously associated with the disease, have been identified as well as 30 uncharacterized expressed sequence tags (ESTs). Given the success in identifying genes already associated with AD, we can have some confidence in the involvement of the latter genes and ESTs. From these studies we can attempt to define therapeutic strategies that would prevent the loss of specific components of neuronal function in susceptible patients or be in a position to stimulate the replacement of lost cellular function in damaged neurons. Although our study is based on a relatively small number of patients (four AD and five normal), we think our approach sets the stage for a major step in using gene expression data for disease modeling (i.e. classification and diagnosis). It can also contribute to the future of gene function identification, pathology, toxicogenomics, and pharmacogenomics.


Subject(s)
Alzheimer Disease/genetics , Alzheimer Disease/physiopathology , Gene Expression Profiling , Genetic Predisposition to Disease , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Databases, Genetic , Expressed Sequence Tags , Humans , Information Storage and Retrieval , Neurons/pathology , Neurons/physiology
4.
Bioinformatics ; 20(10): 1535-45, 2004 Jul 10.
Article in English | MEDLINE | ID: mdl-14962920

ABSTRACT

MOTIVATION: A measurement of cluster quality is needed to choose potential clusters of genes that contain biologically relevant patterns of gene expression. This is strongly desirable when a large number of gene expression profiles have to be analyzed and proper clusters of genes need to be identified for further analysis, such as the search for meaningful patterns, identification of gene functions or gene response analysis.

RESULTS: We propose a new cluster quality method, called stability, by which unsupervised learning of gene expression data can be performed efficiently. The method takes into account a cluster's stability on partition. We evaluate this method and demonstrate its performance using four independent real gene expression datasets and three simulated datasets. We demonstrate that our method outperforms other techniques listed in the literature. The method has applications in evaluating clustering validity as well as identifying stable clusters.

AVAILABILITY: Please contact the first author.
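
The abstract does not define how stability is computed, so the following is not the authors' method. It is a generic sketch of one common way to score cluster stability: compare a reference partition with partitions obtained from repeated clustering runs, and for each reference cluster average its best Jaccard overlap with any cluster in each alternative run. The function names and the toy label lists are assumptions for illustration.

```python
from collections import defaultdict


def _members(labels):
    """Group item indices by cluster label."""
    groups = defaultdict(set)
    for item, label in enumerate(labels):
        groups[label].add(item)
    return groups


def cluster_stability(reference, alternatives):
    """Score each reference cluster by its mean best-match Jaccard overlap.

    `reference` is a list of cluster labels, one per item. `alternatives`
    is a list of label lists for the same items (for example, from reruns
    with different initializations). Returns {label: score in [0, 1]}.
    """
    ref_groups = _members(reference)
    totals = {label: 0.0 for label in ref_groups}
    for alt in alternatives:
        alt_groups = _members(alt).values()
        for label, members in ref_groups.items():
            # Best Jaccard similarity between this cluster and any cluster
            # in the alternative partition; label names need not match.
            totals[label] += max(
                len(members & other) / len(members | other)
                for other in alt_groups
            )
    return {label: total / len(alternatives) for label, total in totals.items()}


# Toy example: six items, two reference clusters, two alternative runs.
reference = [0, 0, 0, 1, 1, 1]
alternatives = [
    [1, 1, 1, 0, 0, 0],  # same partition with swapped labels
    [0, 0, 1, 1, 1, 1],  # item 2 moved to the other cluster
]
print(cluster_stability(reference, alternatives))
```

A cluster that reappears intact across runs scores close to 1, while one that splits or merges scores lower, which is the sense in which such a score can help select clusters for further analysis.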


Subject(s)
Algorithms , Cluster Analysis , Gene Expression Profiling/methods , Sequence Alignment/methods , Sequence Analysis, DNA/methods , Genetic Variation , Genome , Genomic Instability/genetics , Hepacivirus/genetics , Leukemia/genetics , Models, Genetic , Models, Statistical , Pattern Recognition, Automated/methods , Yeasts/genetics