Results 1 - 20 of 22
1.
Epigenomes ; 7(1)2023 Mar 20.
Article in English | MEDLINE | ID: mdl-36975604

ABSTRACT

Epigenomic changes in the venous cells exerted by oscillatory shear stress towards the endothelium may result in consolidation of gene expression alterations upon vein wall remodeling during varicose transformation. We aimed to reveal such epigenome-wide methylation changes. Primary culture cells were obtained from non-varicose vein segments left after surgery of 3 patients by growing the cells in selective media after magnetic immunosorting. Endothelial cells were either exposed to oscillatory shear stress or left at the static condition. Then, other cell types were treated with preconditioned media from the adjacent layer's cells. DNA isolated from the harvested cells was subjected to epigenome-wide study using Illumina microarrays followed by data analysis with GenomeStudio (Illumina), Excel (Microsoft), and Genome Enhancer (geneXplain) software packages. Differential (hypo-/hyper-) methylation was revealed for each cell layer's DNA. The most targetable master regulators controlling the activity of certain transcription factors regulating the genes near the differentially methylated sites appeared to be the following: (1) HGS, PDGFB, and AR for endothelial cells; (2) HGS, CDH2, SPRY2, SMAD2, ZFYVE9, and P2RY1 for smooth muscle cells; and (3) WWOX, F8, IGF2R, NFKB1, RELA, SOCS1, and FXN for fibroblasts. Some of the identified master regulators may serve as promising druggable targets for treating varicose veins in the future.

2.
Oncotarget ; 10(51): 5267-5297, 2019 Sep 03.
Article in English | MEDLINE | ID: mdl-31523389

ABSTRACT

Semisynthetic triterpenoids, bearing cyano enone functionality in ring A, are now considered novel promising anti-tumor agents. However, despite the large-scale studies, their effects on cervical carcinoma cells and, moreover, mechanisms underlying cell death activation by such compounds in this cell type have not been fully elucidated. In this work, we attempted to reconstitute the key pathways and master regulators involved in the response of human cervical carcinoma KB-3-1 cells to the novel glycyrrhetinic acid derivative soloxolone methyl (SM) by a transcriptomic approach. Functional annotation of differentially expressed genes, analysis of their cis-regulatory sequences and protein-protein interaction network clearly indicated that stress of endoplasmic reticulum (ER) is the central event triggered by SM in the cells. A range of key ER stress sensors and transcription factor AP-1 were identified as upstream transcriptional regulators, controlling the response of the cells to SM. Additionally, by using Gene Expression Omnibus data, we showed the ability of SM to modulate the expression of key genes involved in regulation of the high proliferative rate of cervical carcinoma cells. Further Connectivity Map analysis revealed similarity of SM's effects with known ER stress inducers thapsigargin and geldanamycin, targeting SERCA and Grp94, respectively. According to the molecular docking study, SM could snugly fit into the active sites of these proteins in the positions very close to that of both inhibitors. Taken together, our findings provide a basis for the better understanding of the intracellular processes in tumor cells switched on in response to cyano enone-bearing triterpenoids.

3.
Epigenomics ; 10(8): 1103-1119, 2018 08.
Article in English | MEDLINE | ID: mdl-30070582

ABSTRACT

AIM: To integrate transcriptomic and DNA-methylomic measurements on varicose versus normal veins using a systems biological analysis to shed light on the interplay between genetic and epigenetic factors. MATERIALS & METHODS: Differential expression and methylation were measured using microarrays, supported by real-time quantitative PCR and immunohistochemistry confirmation for relevant gene products. A systems biological 'upstream analysis' was further applied. RESULTS: We identified several potential key players contributing to extracellular matrix remodeling in varicose veins. Specifically, our analysis suggests MFAP5 acting as a master regulator, upstream of integrins, of the cellular network affecting the varicose vein condition. A possible mechanism and pathogenic model were outlined. CONCLUSION: The proposed coherent model incorporates the relevant signaling networks and will hopefully aid further studies on varicose vein pathogenesis.


Subject(s)
Contractile Proteins/genetics , Extracellular Matrix , Glycoproteins/genetics , Varicose Veins/genetics , Adult , DNA Methylation , Female , Gene Expression Profiling , Humans , Intercellular Signaling Peptides and Proteins , Male , Middle Aged , Saphenous Vein
4.
J Cancer Res Clin Oncol ; 144(7): 1289-1300, 2018 Jul.
Article in English | MEDLINE | ID: mdl-29737431

ABSTRACT

PURPOSE: MDM2 inhibitors are promising anticancer agents that induce cell cycle arrest and tumor cell death via p53 reactivation. We examined the influence of Mycoplasma hyorhinis infection on sensitivity of human lung carcinoma cells NCI-H292 to MDM2 inhibitor Nutlin-3. In order to unveil possible mechanisms underlying the revealed effect, we investigated gene expression changes and signal transduction networks activated in NCI-H292 cells in response to mycoplasma infection. METHODS: Sensitivity of NCI-H292 cells to Nutlin-3 was estimated by resazurin-based cell viability assay. Genome-wide transcriptional profiles of NCI-H292 and NCI-H292Myc.h cell lines were determined using Illumina Human HT-12 v3 Expression BeadChip. Search for key transcription factors and key node molecules was performed using the geneXplain platform. Ability for anchorage-independent growth was tested by soft agar colony formation assay. RESULTS: NCI-H292Myc.h cells were shown to be 1.5- and 5.2-fold more resistant to killing by Nutlin-3 at concentrations of 15 and 30 µM than uninfected NCI-H292 cells (P < 0.05 and P < 0.001, respectively). Transcriptome analysis revealed differential expression of multiple genes involved in cancer progression and metastasis as well as epithelial-mesenchymal transition (EMT). Moreover, we have shown experimentally that NCI-H292Myc.h cells were more capable of growing and dividing without binding to a substrate. The most likely mechanism explaining the observed changes was found to be TLR4- and IL-1b-mediated activation of NF-κB pathway. CONCLUSIONS: Our results provide evidence that mycoplasma infection is an important factor modulating the effect of MDM2 inhibitors on cancer cells and is able to induce EMT-related changes.


Subject(s)
Imidazoles/pharmacology , Lung Neoplasms/drug therapy , Lung Neoplasms/microbiology , Mycoplasma Infections/physiopathology , Mycoplasma hyorhinis/physiology , Piperazines/pharmacology , Adult , Aged , Aged, 80 and over , Carcinoma, Mucoepidermoid/drug therapy , Carcinoma, Mucoepidermoid/genetics , Carcinoma, Mucoepidermoid/metabolism , Carcinoma, Mucoepidermoid/microbiology , Carcinoma, Non-Small-Cell Lung/drug therapy , Carcinoma, Non-Small-Cell Lung/genetics , Carcinoma, Non-Small-Cell Lung/metabolism , Carcinoma, Non-Small-Cell Lung/microbiology , Cell Line, Tumor , Drug Resistance, Neoplasm , Female , Gene Expression/drug effects , Humans , Lung Neoplasms/genetics , Lung Neoplasms/metabolism , Male , Middle Aged , Mycoplasma Infections/metabolism , Mycoplasma Infections/microbiology , Signal Transduction , Transcriptome , Young Adult
5.
Methods Mol Biol ; 1613: 161-191, 2017.
Article in English | MEDLINE | ID: mdl-28849562

ABSTRACT

In this chapter, we present an approach that allows a causal analysis of multiple "-omics" data with the help of an "upstream analysis" strategy. The goal of this approach is to identify master regulators in gene regulatory networks as potential drug targets for a pathological process. The data analysis strategy includes a state-of-the-art promoter analysis for potential transcription factor (TF)-binding sites using the TRANSFAC® database combined with an analysis of the upstream signal transduction pathways that control the activity of these TFs. When applied to genes that are associated with a switch to a pathological process, the approach identifies potential key molecules (master regulators) that may exert major control over and maintenance of transient stability of the pathological state. We demonstrate this approach on examples of analysis of multi-omics data sets that contain transcriptomics and epigenomics data in cancer. The results of this analysis helped us to better understand the molecular mechanisms of cancer development and cancer drug resistance. Such an approach promises to be very effective for rapid and accurate identification of cancer drug targets with true potential. The upstream analysis approach is implemented as an automatic workflow in the geneXplain platform ( www.genexplain.com ) using the open-source BioUML framework ( www.biouml.org ).


Subject(s)
Computational Biology/methods , DNA/metabolism , Neoplasms/genetics , Transcription Factors/metabolism , Antineoplastic Agents/pharmacology , Antineoplastic Agents/therapeutic use , Binding Sites , DNA/chemistry , Databases, Genetic , Gene Expression Regulation, Neoplastic , Gene Regulatory Networks/drug effects , Humans , Molecular Targeted Therapy , Neoplasms/drug therapy , Promoter Regions, Genetic , Web Browser
6.
EuPA Open Proteom ; 13: 1-13, 2016 Dec.
Article in English | MEDLINE | ID: mdl-29900117

ABSTRACT

We present an "upstream analysis" strategy for causal analysis of multiple "-omics" data. It analyzes promoters using the TRANSFAC database, combines it with an analysis of the upstream signal transduction pathways and identifies master regulators as potential drug targets for a pathological process. We applied this approach to a complex multi-omics data set that contains transcriptomics, proteomics and epigenomics data. We identified the following potential drug targets against induced resistance of cancer cells towards chemotherapy by methotrexate (MTX): TGFalpha, IGFBP7, alpha9-integrin, and the following chemical compounds: zardaverine and divalproex as well as human metabolites such as nicotinamide N-oxide.

7.
Microarrays (Basel) ; 4(2): 270-86, 2015 May 21.
Article in English | MEDLINE | ID: mdl-27600225

ABSTRACT

A strategy is presented that allows a causal analysis of co-expressed genes, which may be subject to common regulatory influences. A state-of-the-art promoter analysis for potential transcription factor (TF) binding sites in combination with a knowledge-based analysis of the upstream pathways that control the activity of these TFs is shown to lead to hypothetical master regulators. This strategy was implemented as a workflow in a comprehensive bioinformatic software platform. We applied this workflow to gene sets that were identified by a novel triclustering algorithm in naphthalene-induced gene expression signatures of murine liver and lung tissue. As a result, tissue-specific master regulators were identified that are known to be linked with tumorigenic and apoptotic processes. To our knowledge, this is the first time that genes of expression triclusters were used to identify upstream regulators.

8.
BMC Bioinformatics ; 14: 241, 2013 Aug 08.
Article in English | MEDLINE | ID: mdl-23924163

ABSTRACT

BACKGROUND: Accurate recognition of regulatory elements in promoters is an essential prerequisite for understanding the mechanisms of gene regulation at the level of transcription. Composite regulatory elements represent a particular type of such transcriptional regulatory elements consisting of pairs of individual DNA motifs. In contrast to the present approach, most available recognition techniques are based purely on statistical evaluation of the occurrence of single motifs. Such methods are limited in application, since the accuracy of recognition is greatly dependent on the size and quality of the sequence dataset. Methods that exploit available knowledge and have broad applicability are evidently needed. RESULTS: We developed a novel method to identify composite regulatory elements in promoters using a library of known examples. In-depth investigation of regularities encoded in known composite elements allowed us to introduce a new characteristic measure and to improve the specificity compared with other methods. Tests on an established benchmark and real genomic data show that our method outperforms other available methods based either on known examples or statistical evaluations. In addition to better recognition, a practical advantage of this method is first the ability to detect a high number of different types of composite elements, and second direct biological interpretation of the identified results. The program is available at http://gnaweb.helmholtz-hzi.de/cgi-bin/MCatch/MatrixCatch.pl and includes an option to extend the provided library by user-supplied data. CONCLUSIONS: The novel algorithm for the identification of composite regulatory elements presented in this paper was proved to be superior to existing methods. Its application to tissue-specific promoters identified several highly specific composite elements with relevance to their biological function. This approach together with other methods will further advance the understanding of transcriptional regulation of genes.


Subject(s)
Computational Biology , Promoter Regions, Genetic , Regulatory Elements, Transcriptional , Regulatory Sequences, Nucleic Acid , Algorithms , Computational Biology/instrumentation , Computational Biology/methods , Gene Expression Regulation , Genomics/instrumentation , Genomics/methods , Nucleotide Motifs
9.
BMC Syst Biol ; 4: 124, 2010 Sep 06.
Article in English | MEDLINE | ID: mdl-20815942

ABSTRACT

BACKGROUND: The study of relationships between human diseases provides new possibilities for biomedical research. Recent achievements on human genetic diseases have stimulated interest to derive methods to identify disease associations in order to gain further insight into the network of human diseases and to predict disease genes. RESULTS: Using about 10000 manually collected causal disease/gene associations, we developed a statistical approach to infer meaningful associations between human morbidities. The derived method clustered cardiometabolic and endocrine disorders, immune system-related diseases, solid tissue neoplasms and neurodegenerative pathologies into prominent disease groups. Analysis of biological functions confirmed characteristic features of corresponding disease clusters. Inference of disease associations was further employed as a starting point for prediction of disease genes. Efforts were made to underpin the validity of results by relevant literature evidence. Interestingly, many inferred disease relationships correspond to known clinical associations and comorbidities, and several predicted disease genes were subjects of therapeutic target research. CONCLUSIONS: Causal molecular mechanisms present a unifying principle to derive methods for disease classification, analysis of clinical disorder associations, and prediction of disease genes. According to the definition of causal disease genes applied in this study, these results are not restricted to genetic disease/gene relationships. This may be particularly useful for the study of long-term or chronic illnesses, where pathological derangement due to environmental or as part of sequel conditions is of importance and may not be fully explained by genetic background.


Subject(s)
Computational Biology/methods , Disease/genetics , Humans , Molecular Sequence Annotation , Reproducibility of Results
10.
Genomics ; 96(3): 129-33, 2010 Sep.
Article in English | MEDLINE | ID: mdl-20600807

ABSTRACT

Identification of different functional elements and their properties is a fundamental need in biomedical research, and phylogenetic comparisons of a growing number of sequenced genomes form a solid basis for this task. Most available phylogenetic approaches are focused on searching for individual sequence alterations responsible for the observed phenotype, or statistically evaluate observed mutations to infer general trends. However, when applied to close genomes, such methods suffer from poor statistics of rare mutations and give (at best) only coarse results concerning the potential functional importance of the nucleotide differences. Quantifying the changes in physical properties of DNA, by contrast, makes it possible to see the strength of introduced mutations and hence to classify them for further investigations. In this work we present the comparative sequence analysis of two evolutionarily close species: human and chimpanzee. In contrast to previous studies we evaluate changes in melting enthalpy of DNA rather than count nucleotide mismatches. We find that nucleotide mismatches in promoters were apparently introduced in a correlated manner during the course of evolution, so that, for example, the DNA property "melting enthalpy" was retained. Such property conservation of promoters is significantly different from nucleotide conservation, shows significant positional and functional biases, and seems to represent a novel feature of gene regulation.


Subject(s)
DNA/chemistry , Evolution, Molecular , Pan troglodytes/genetics , Promoter Regions, Genetic/genetics , Transition Temperature , Animals , Base Sequence , Computational Biology , Genomics/methods , Humans , Models, Genetic , Sequence Alignment
11.
Exp Dermatol ; 19(3): 297-301, 2010 Mar.
Article in English | MEDLINE | ID: mdl-19961536

ABSTRACT

Keratinocyte differentiation plays a pivotal role in the epidermal barrier. Single keratinocyte differentiation genes have already been studied, but many important constituents of this process may have been missed so far. Gene expression profiling by microarray was carried out in cultured normal human epidermal keratinocytes undergoing confluence-induced differentiation to find novel differentiation genes. Candidate gene lists were established and genes of potential dermatological interest were validated by quantitative reverse transcription polymerase chain reaction and immunohistochemical analysis. Some of these points lead to the identification of counter-regulation of heme oxygenase and biliverdin reductase as well as glutaredoxin and glutathione reductase indicative of potential novel redox signaling in differentiating human keratinocytes. Others indicate a strong concerted down-regulation of interleukin-1 signaling at previously unidentified levels during keratinocyte differentiation. We believe that the identified genes contribute to a more comprehensive understanding of the complicated epidermal differentiation process and lead to better understanding of dermatological diseases.


Subject(s)
Cell Differentiation/genetics , Gene Expression Profiling , Keratinocytes/cytology , Keratinocytes/metabolism , Gene Regulatory Networks , Genome, Human , Humans , In Vitro Techniques , Oligonucleotide Array Sequence Analysis
12.
Exp Dermatol ; 17(12): 1004-16, 2008 Dec.
Article in English | MEDLINE | ID: mdl-18631249

ABSTRACT

Sphingolipids are important components of the water permeability barrier of the skin. Moreover, ceramides were also shown to influence keratinocyte differentiation and regulate cellular signalling. A confluence-induced differentiation model of normal human keratinocytes was established to allow evaluation of pro- and anti-differentiation effects of exogenous compounds. The effects of phytosphingosine (PS), sphingosine (SO), sphinganine (SA) and their hexanoyl (-C6), stearoyl (-C18) and salicyl (-SLC) derivatives, C12-alkylamine-salicylate (C12-SLC), salicylate (SLC) along with vitamin D3 (VD3) and retinol as control substances were tested in this system. Cytotoxicity assays were carried out to optimize the incubation conditions of compounds and whole genome expression changes were monitored by DNA-microarray on days 0, 1 and 4. Geometric means of gene expression levels of a subset of known keratinocyte differentiation-related genes were calculated from the microarray data to compare effects of the sphingolipid derivatives. Compound treatment-induced transcriptional changes were analysed by the ExPlain software (BIOBASE GmbH). Five of the assayed substances (SA, SO-C6, PS-C6, SO-SLC, PS-SLC) were found to be potent promoters of keratinocyte differentiation compared with VD3, and C12-SLC revealed potential anti-differentiation properties. ExPlain analysis found a different regulatory profile in the computed transcriptional networks of the sphingoid bases versus their -C6 and especially -SLC derivatives suggesting that the change in their keratinocyte differentiation modifying potential is due to a unique effect of the covalent attachment of the salicylic acid. Taken together, these results demonstrate the gene regulatory potential of sphingolipid species that could be valuable for dermatological or cosmetic applications.


Subject(s)
Cell Differentiation/drug effects , Keratinocytes/drug effects , Sphingolipids/pharmacology , Adult , Antigens, Differentiation/genetics , Base Sequence , Binding Sites , Cell Differentiation/genetics , Cell Survival/drug effects , Cell Survival/genetics , Cells, Cultured , Cholecalciferol/pharmacology , Female , Filaggrin Proteins , Gene Expression Profiling , Gene Expression Regulation/drug effects , Glycoproteins/genetics , Humans , Intercellular Signaling Peptides and Proteins , Intermediate Filament Proteins/genetics , Keratin-10/genetics , Keratinocytes/cytology , Keratinocytes/metabolism , Middle Aged , Models, Genetic , Molecular Sequence Data , Oligonucleotide Array Sequence Analysis , Promoter Regions, Genetic/genetics , Salicylates/pharmacology , Transglutaminases/genetics , Vitamin A/pharmacology
13.
Genome Biol ; 9(2): R36, 2008.
Article in English | MEDLINE | ID: mdl-18291023

ABSTRACT

We report an application of machine learning algorithms that enables prediction of the functional context of transcription factor binding sites in the human genome. We demonstrate that our method allowed de novo identification of hepatic nuclear factor (HNF)4alpha binding sites and significantly improved the overall recognition of faithful HNF4alpha targets. When applied to published findings, an unprecedentedly high number of false positives were identified. The technique can be applied to any transcription factor.


Subject(s)
Artificial Intelligence , Genome, Human , Hepatocyte Nuclear Factor 4/metabolism , Sequence Analysis, DNA/methods , Algorithms , Base Sequence , Binding Sites , Chromatin Immunoprecipitation , Electrophoretic Mobility Shift Assay , Humans , Promoter Regions, Genetic , Repetitive Sequences, Nucleic Acid
14.
J Bioinform Comput Biol ; 5(2B): 561-77, 2007 Apr.
Article in English | MEDLINE | ID: mdl-17636862

ABSTRACT

Variable order Markov models and variable order Bayesian trees have been proposed for the recognition of cis-regulatory elements, and it has been demonstrated that they outperform traditional models such as position weight matrices, Markov models, and Bayesian trees for the recognition of binding sites in prokaryotes. Here, we study to which degree variable order models can improve the recognition of eukaryotic cis-regulatory elements. We find that variable order models can improve the recognition of binding sites of all the studied transcription factors. To ease a systematic evaluation of different model combinations based on problem-specific data sets and allow genomic scans of cis-regulatory elements based on fixed and variable order Markov models and Bayesian trees, we provide the VOMBATserver to the public community.


Subject(s)
Algorithms , Chromosome Mapping/methods , Models, Genetic , Regulatory Elements, Transcriptional/genetics , Sequence Analysis, DNA/methods , Software , Transcription Factors/genetics , Bayes Theorem , Computer Simulation , Markov Chains , Models, Statistical , Pattern Recognition, Automated/methods
15.
J Biosci ; 32(1): 169-80, 2007 Jan.
Article in English | MEDLINE | ID: mdl-17426389

ABSTRACT

Bioinformatics has delivered great contributions to genome and genomics research, without which the world-wide success of this and other global ('omics') approaches would not have been possible. More recently, it has developed further towards the analysis of different kinds of networks thus laying the foundation for comprehensive description, analysis and manipulation of whole living systems in modern "systems biology". The next step which is necessary for developing a systems biology that deals with systemic phenomena is to expand the existing and develop new methodologies that are appropriate to characterize intercellular processes and interactions without omitting the causal underlying molecular mechanisms. Modelling the processes on the different levels of complexity involved requires a comprehensive integration of information on gene regulatory events, signal transduction pathways, protein interaction and metabolic networks as well as cellular functions in the respective tissues / organs.


Subject(s)
Cell Communication , Computational Biology , Metabolic Networks and Pathways , Animals , Cell Physiological Phenomena , Databases, Genetic , Gene Regulatory Networks , Genomics , Hormones/metabolism , Humans , Signal Transduction , Systems Biology
16.
Nucleic Acids Res ; 34(Web Server issue): W591-5, 2006 Jul 01.
Article in English | MEDLINE | ID: mdl-16845077

ABSTRACT

FeatureScan is a software package aiming to reveal novel types of DNA sequence similarity by comparing physico-chemical properties. Thirty-eight different parameters of DNA double strands such as charge, melting enthalpy, conformational parameters and the like are provided. As input FeatureScan requires two sequences, a pattern sequence and a target sequence; search conditions are set by selecting a specific DNA parameter and a threshold value. Search results are displayed in FASTA format and directly linked to external genome databases/browsers (ENSEMBL, NCBI, UCSC). An Internet version of FeatureScan is accessible at http://genome.gbf.de/featurescan/. As part of the HOBIT initiative (http://hobit.sourceforge.net/) FeatureScan is also accessible as a web service at its above home page. Currently, several preloaded genomes are provided at this Internet website (Homo sapiens, Mus musculus, Rattus norvegicus and four strains of Escherichia coli) as target sequences. Standalone executables of FeatureScan are available on request.


Subject(s)
DNA/chemistry , Sequence Analysis, DNA/methods , Software , Animals , Escherichia coli/genetics , Genomics , Humans , Internet , Mice , Rats , Sequence Homology, Nucleic Acid , User-Computer Interface
17.
In Silico Biol ; 5(5-6): 547-55, 2005.
Article in English | MEDLINE | ID: mdl-16268796

ABSTRACT

We present an implementation of the signal theory based approach for detection of novel types of DNA similarity which are based on physical properties of DNA. Systematic study of the sensitivity of the new similarity measure revealed qualitative differences to letter-based similarity. A variety of physical parameters of DNA double strands, which in a straightforward way reflect different kinds of information hidden behind the primary structure of DNA, showed a wide range of recognition power of the signal similarity measure. We applied the novel DNA similarity measure for the analysis of promoters of E. coli genes. We found that promoter similarities revealed by our approach correlate with their transcription regulatory responsiveness to different antibiotic and osmotic treatments. Accelerated by special hardware for fast Fourier transformations, the method is easily applicable for the analysis of entire eukaryotic genomes in minutes.


Subject(s)
DNA, Bacterial/genetics , Escherichia coli/genetics , Promoter Regions, Genetic , DNA, Bacterial/chemistry , Genes, Bacterial , Models, Genetic , Pattern Recognition, Automated , Sequence Alignment , Thermodynamics
18.
In Silico Biol ; 4(4): 429-44, 2004.
Article in English | MEDLINE | ID: mdl-15506993

ABSTRACT

We report the generation and initial characterization of a large-scale collection of sequences of putative promoter regions (PPRs) of human and mouse genes. Based on our unique collection of 400,225 and 580,209 human and mouse full-length cDNAs, we determined exact transcriptional start sites (TSSs). Using positional information of the TSSs, we could retrieve adjacent sequences as PPRs for 8,793 and 6,875 human and mouse genes, respectively. The positions of the PPRs were 4 kb upstream to previously reported 5'-ends of cDNAs on average, demonstrating that full-length cDNA information is indispensable for this purpose. Among those PPRs supported by experimentally validated TSSs, 3,324 could be paired as mutually homologous genes between human and mouse and were used for the comprehensive comparative studies. The sequence identities in the proximal regions of the TSSs were 45% on average, and 22,794 putative transcription factor binding sites that are conserved between human and mouse were identified. The data resource created in the present work and the results of the sequences' initial characterization should lay the firm foundation for deciphering the transcriptional modulations of human genes. All the data were deposited and made available through a database for comparative studies, DBTSS.


Subject(s)
Computational Biology , Promoter Regions, Genetic/genetics , Sequence Analysis, DNA , Transcription Initiation Site , Animals , DNA, Complementary/genetics , Databases, Nucleic Acid , Gene Library , Genes/genetics , Genome, Human , Humans , Mice
19.
Genome Inform ; 15(2): 276-86, 2004.
Article in English | MEDLINE | ID: mdl-15706513

ABSTRACT

Based on the manual annotation of transcription factors stored in the TRANSFAC database, we developed a library of hidden Markov models (HMM) to represent their DNA-binding domains and used it for a comprehensive classification. The models constructed were applied on the UniProt/Swiss-Prot database, leading to a systematic classification of further DNA-binding protein entries. The HMM library obtained can be used to classify any newly discovered transcription factor according to its DNA-binding domain and, thus, to generate hypotheses about its DNA-binding specificity.


Subject(s)
DNA-Binding Proteins/chemistry , DNA-Binding Proteins/classification , Genome , Transcription Factors , Binding Sites , Computational Biology , Databases, Factual , Databases, Protein , Helix-Turn-Helix Motifs , Response Elements , Sequence Alignment , Sequence Analysis, Protein , T-Box Domain Proteins , Transcription Factors/chemistry , Transcription Factors/classification
20.
In Silico Biol ; 3(1-2): 145-71, 2003.
Article in English | MEDLINE | ID: mdl-12954097

ABSTRACT

Known transcription regulatory signals which generally act as transcription factor binding sites (TFs) differ significantly in their base composition. Therefore, their occurrence in a genome largely depends on the local base composition. In an attempt to initiate an all human genome analysis for the occurrence of potential TFs, we systematically analyzed the GC-content of distinct functional regions (e.g., upstream and downstream gene regions, exons, long and short introns, repetitive elements) and correlated the frequencies of potential binding sites of a representative set of TFs in these regions. For these analyses, we used the pattern collection of the TRANSFAC database on transcriptional regulation, the information about functionally relevant combinations of them from the database TRANSCompel, and our new resource, TRANSGenome™, which provides an overall annotation of the human genome with emphasis on its regulatory characteristics. We show that the occurrence of sequence patterns with regulatory potential may be supported by, but cannot be fully explained by either the GC content of a whole chromosome or its putative promoter regions, nor by the information content of the patterns. Several patterns, HNF-3, NFAT, and GC box, show a clear overrepresentation in all promoter groups as well as in all chromosomes. Other patterns, like E2F and CRE-BP1, are underrepresented in all promoter groups as well as in all chromosomes in comparison with random sequences. Simultaneously, both patterns are over-represented in promoters in comparison with repetitive elements. We define several structural characteristics of the proximal promoters that differentiate them from other functional genomic regions. Two well-known promoter elements, GC- and TATA-boxes, are statistically enriched in promoters in comparison with random sequences, repetitive elements and exons. Altogether, our findings provide insights into the macroheterogeneity amongst the individual chromosomes, into the microheterogeneity among different functional regions of individual chromosomes, contribute to further understanding of structural organization of gene regulatory regions, and give first hints on the development of regulatory features during evolution.


Subject(s)
Chromosomes, Human/genetics , Genome, Human , Models, Genetic , Regulatory Sequences, Nucleic Acid , Transcription, Genetic/genetics , Animals , Base Composition , Chromosome Mapping , DNA/chemistry , DNA/genetics , Humans , Mathematics , Models, Statistical , Promoter Regions, Genetic/genetics , Sensitivity and Specificity , Species Specificity