Results 1 - 20 of 23
1.
Int J Mol Sci ; 23(20)2022 Oct 14.
Article in English | MEDLINE | ID: mdl-36293128

ABSTRACT

Studies on hereditary fixation of the tame-behavior phenotype during animal domestication remain relevant because they have both basic-research and applied significance. For the first time, we used high-throughput RNA sequencing in model animals, gray rats (Rattus norvegicus) bred for either an enhanced or a reduced defensive response to humans, to investigate differential expression of genes in tissue samples from the tegmental region of the midbrain in 2-month-old rats showing either tame or aggressive behavior. A total of 42 differentially expressed genes (DEGs; adjusted p-value < 0.01 and fold-change > 2) were identified, with 20 upregulated and 22 downregulated genes in the tissue samples from tame rats compared with aggressive rats. Among them, three genes encoding transcription factors (TFs) were detected: Ascl3 was upregulated, whereas Fos and Fosb were downregulated in tissue samples from the brains of tame rats. Other DEGs were annotated as associated with extracellular matrix components, transporter proteins, the neurotransmitter system, signaling molecules, and immune system proteins. We believe that these DEGs encode proteins that constitute a multifactorial system determining the behavior for which the rats have been artificially selected. We demonstrated that several structural subtypes of E-box motifs, known as binding sites for many developmental TFs of the bHLH class, including the ASCL subfamily of TFs, are enriched in the set of promoters of the DEGs downregulated in the tissue samples of tame rats. Because ASCL3 may act as a repressor on target genes of other developmental TFs of the bHLH class, we hypothesize that the expression of TF gene Ascl3 in tame rats indicates longer neurogenesis (as compared to aggressive rats), which is a sign of neoteny and domestication. Thus, our domestication model shows a new function of TF ASCL3: it may play the most important role in behavioral changes in animals.


Subject(s)
Behavior, Animal , Domestication , Humans , Animals , Rats , Infant , Behavior, Animal/physiology , Transcription Factors/genetics , Aggression/physiology , Sequence Analysis, RNA , Gene Expression Profiling
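
The DEG criterion quoted in the abstract above (adjusted p-value < 0.01 and fold-change > 2) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `GeneResult` record, the function name, the symmetric treatment of up- and downregulation on a log2 scale, and the gene names and values in the usage note are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class GeneResult:
    """One row of a hypothetical differential-expression table."""
    name: str
    log2_fold_change: float  # tame vs. aggressive; positive = higher in tame
    padj: float              # multiple-testing-adjusted p-value


def select_degs(results, padj_cutoff=0.01, fold_cutoff=2.0):
    """Split genes passing both cutoffs into up- and downregulated lists.

    Assumption: "fold-change > 2" is applied symmetrically, i.e.
    |log2 fold-change| > log2(2) = 1 in either direction.
    """
    log2_cutoff = math.log2(fold_cutoff)
    up, down = [], []
    for r in results:
        if r.padj < padj_cutoff and abs(r.log2_fold_change) > log2_cutoff:
            (up if r.log2_fold_change > 0 else down).append(r.name)
    return up, down
```

With placeholder input such as `GeneResult("geneA", 1.5, 0.001)` and `GeneResult("geneB", -2.0, 0.005)`, `select_degs` places `geneA` in the upregulated list and `geneB` in the downregulated list; a gene with a large fold-change but `padj = 0.2` is dropped.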
2.
Front Plant Sci ; 13: 942710, 2022.
Article in English | MEDLINE | ID: mdl-36061801

ABSTRACT

Having DNA-binding profiles for a sufficient number of genome-encoded transcription factors (TFs) opens up perspectives for systematic evaluation of the upstream regulators of gene lists. The Plant Cistrome database, a large collection of TF binding profiles detected using the DAP-seq method, made this possible for Arabidopsis. Here we re-processed raw DAP-seq data with MACS2, the most popular peak caller, which leads the others according to quality metrics. In the benchmarking study, we confirmed that the improved collection of TF binding profiles supported a more precise gene list enrichment procedure and resulted in a more relevant ranking of potential upstream regulators. Moreover, we consistently recovered the TF binding profiles that were missing in the previous collection of DAP-seq peak sets. We developed the CisCross web service (https://plamorph.sysbio.ru/ciscross/) that gives more flexibility in the analysis of potential upstream TF regulators for Arabidopsis thaliana genes.

3.
Front Plant Sci ; 13: 938545, 2022.
Article in English | MEDLINE | ID: mdl-35968123

ABSTRACT

The position weight matrix (PWM) is the traditional motif model representing transcription factor (TF) binding sites. It assumes that the positions contribute independently to TF binding affinity, although this hypothesis does not fit the data perfectly. This explains why PWM hits are missing in a substantial fraction of ChIP-seq peaks. To study various modes of the direct binding of plant TFs, we compiled a benchmark collection of 111 ChIP-seq datasets for Arabidopsis thaliana and applied the traditional PWM and two alternative motif models, BaMM and SiteGA, which allow for dependencies between positions. The variation in the stringency of the recognition thresholds for the models suggested that the hits of PWM, BaMM, and SiteGA models are associated with the sites of high/medium, any, and low affinity, respectively. At the medium recognition threshold, about 60% of ChIP-seq peaks contain PWM hits consisting of conserved core consensuses, while BaMM and SiteGA provide hits for an additional 15% of peaks in which a weaker core consensus is compensated through intra-motif dependencies. The presence/absence of these dependencies in the motifs of alternative/traditional models was confirmed by the dependency logo DepLogo visualizing the position-wise partitioning of the alignments of predicted sites. We exemplify the detailed analysis of ChIP-seq profiles for plant TFs CCA1, MYC2, and SEP3. Gene ontology (GO) enrichment analysis revealed that among the three motif models, SiteGA had the highest portion of genes with significantly enriched GO terms among all predicted genes. We showed that both alternative motif models extend the set of sites predicted by the traditional PWM to a greater degree for TFs MYC2/SEP3, which have condition/tissue-specific functions, than for TF CCA1, which has housekeeping functions. Overall, the combined application of standard and alternative motif models is beneficial to detect various modes of the direct TF-DNA interactions in the maximal portion of ChIP-seq loci.
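
The independence assumption behind the PWM model described in the abstract above can be made concrete with a short Python sketch: each position contributes its own log-odds term, and a site's score is simply the sum of those terms. This is a generic textbook formulation, not the implementation used in the study; the function names, the pseudocount, the uniform background of 0.25, and the toy E-box-like training sites in the usage note are assumptions.

```python
import math


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM from aligned, equal-length sites (A/C/G/T only)."""
    length = len(sites[0])
    pwm = []
    for pos in range(length):
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        # One independent log-odds column per motif position.
        pwm.append({b: math.log2((counts[b] / total) / background) for b in "ACGT"})
    return pwm


def score(pwm, window):
    """Score a window as the sum of per-position terms (positions independent)."""
    return sum(column[base] for column, base in zip(pwm, window))


def scan(pwm, sequence, threshold):
    """Return (start, score) for forward-strand windows scoring >= threshold."""
    width = len(pwm)
    hits = []
    for i in range(len(sequence) - width + 1):
        s = score(pwm, sequence[i:i + width])
        if s >= threshold:
            hits.append((i, s))
    return hits
```

For example, a PWM built from four copies of the toy site `CACGTG` and scanned over `TTCACGTGAA` with a threshold of 7.0 reports a single hit starting at position 2. Because the score is a plain sum over positions, this model cannot reward a particular combination of bases at two positions, which is the kind of dependency that models such as BaMM and SiteGA are designed to capture.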

4.
Int J Mol Sci ; 23(16)2022 Aug 11.
Article in English | MEDLINE | ID: mdl-36012247

ABSTRACT

(1) Background: The widespread application of ChIP-seq technology requires annotation of cis-regulatory modules through the search for co-occurring motifs. (2) Methods: We present the web server Motifs Co-Occurrence Tool (Web-MCOT) that for a single ChIP-seq dataset detects the composite elements (CEs) or overrepresented homo- and heterotypic pairs of motifs with spacers and overlaps, with any mutual orientations, uncovering various similarities to recognition models within pairs of motifs. The first (Anchor) motif in CEs corresponds to the target transcription factor of the ChIP-seq experiment, while the second one (Partner) can be defined either by a user or a public library of Partner motifs being processed. (3) Results: Web-MCOT computes the significances of CEs without reference to motif conservation and those with more conserved Partner and Anchor motifs. Graphic results show histograms of CE abundance depending on orientations of motifs, overlap and spacer lengths; logos of the most common CE structural types with an overlap of motifs; and heatmaps depicting the abundance of CEs with one motif possessing higher conservation than another. (4) Conclusions: Novel capacities of Web-MCOT allow retrieving maximal information on the co-occurrence of motifs from a single ChIP-seq dataset and facilitate planning of subsequent ChIP-seq experiments.


Subject(s)
Chromatin Immunoprecipitation Sequencing , Transcription Factors , Binding Sites , Chromatin Immunoprecipitation/methods , Transcription Factors/genetics
5.
Curr Opin Plant Biol ; 63: 102058, 2021 10.
Article in English | MEDLINE | ID: mdl-34098218

ABSTRACT

Innovative omics technologies, advanced bioinformatics, and machine learning methods are rapidly becoming integral tools for plant functional genomics, with tremendous recent advances made in this field. In transcriptional regulation, an initial lag in the accumulation of plant omics data relative to that of animals stimulated the development of computational methods capable of extracting maximum information from the available data sets. Recent comprehensive studies of transcription factor-binding profiles in Arabidopsis and maize and the accumulation of uniformly processed omics data in public databases have brought plant biologists into the big leagues, with many cutting-edge methods available. Here, we summarize the state-of-the-art bioinformatics approaches used to predict or infer the cis-regulatory code behind transcriptional gene regulation, focusing on their plant research applications.


Subject(s)
Arabidopsis , Gene Expression Regulation , Animals , Arabidopsis/genetics , Computational Biology
6.
Int J Mol Sci ; 21(23)2020 Dec 05.
Article in English | MEDLINE | ID: mdl-33291385

ABSTRACT

We analyzed the whole-genome experimental maps of nucleosomes in Drosophila melanogaster and classified genes by the expression level in S2 cells (RPKM value, reads per kilobase per million mapped reads) as well as the number of tissues in which a gene was expressed (breadth of expression, BoE). We classified chromatin in the 5'-regions of genes into four states according to the hidden Markov model (4HMM). We considered only the Aquamarine chromatin state as Active and defined the remaining three states as Non-Active. Surprisingly, about 20/40% of genes with 5'-regions mapped to Active/Non-Active chromatin possessed the minimal/at least modest RPKM and BoE. We found that regardless of RPKM/BoE, the genes of Active chromatin possessed a regular nucleosome arrangement in 5'-regions, while genes of Non-Active chromatin did not show such specificity. Only for genes of Active chromatin does the RPKM/BoE correlate positively with the number of nucleosome sites upstream of/around the TSS and negatively with that downstream of the TSS. We propose that for genes of Active chromatin, regardless of RPKM value and BoE, the nucleosome arrangement in 5'-regions potentiates transcription, while for genes of Non-Active chromatin, the transcription machinery does not require substantial support from nucleosome arrangement to influence gene expression.


Subject(s)
Chromatin/genetics , Chromatin/metabolism , Drosophila melanogaster/genetics , Drosophila melanogaster/metabolism , Interphase , Nucleosomes/metabolism , Transcription Initiation Site , Transcription, Genetic , Animals , Chromatin Assembly and Disassembly , Chromosome Mapping , Gene Expression Regulation , Promoter Regions, Genetic , Transcription Factors/metabolism
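
The two per-gene measures named in the abstract above, RPKM and breadth of expression (BoE), can be sketched in Python as follows. The RPKM formula is the standard definition; the function names and the `min_rpkm` cutoff used to call a gene "expressed" in a tissue are assumptions, since the abstract does not state the threshold the authors used.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    # read_count / (length in kb) / (library size in millions)
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def breadth_of_expression(rpkm_by_tissue, min_rpkm=1.0):
    """Count tissues in which the gene is expressed.

    Assumption: "expressed" means RPKM >= min_rpkm; the cutoff is illustrative.
    """
    return sum(1 for value in rpkm_by_tissue if value >= min_rpkm)
```

As a check on the arithmetic, 500 reads on a 2,000 bp gene in a library of 10 million mapped reads gives 500 / 2 kb / 10 million = 25 RPKM.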
7.
Genes (Basel) ; 11(4)2020 04 11.
Article in English | MEDLINE | ID: mdl-32290448

ABSTRACT

The Drosophila melanogaster polytene chromosomes are the best model for studying genome organization during interphase. Despite long-term studies of the genetic organization of polytene chromosome bands and interbands, little is known regarding the location of long genes on chromosomes. To analyze it, we used bioinformatic approaches and characterized the genome-wide distribution of introns in gene bodies and in different chromatin states, and using fluorescent in situ hybridization we juxtaposed them with the chromosome structures. Short introns up to 2 kb in length are located in the bodies of housekeeping genes (grey bands or lazurite chromatin). In the group of the 70 longest genes in the Drosophila genome, 95% of total gene length accrues to introns. The mapping of 15 long genes showed that they could occupy extended sections of polytene chromosomes containing band and interband series, with promoters located in the interband fragments (aquamarine chromatin). Introns (malachite and ruby chromatin) in polytene chromosomes form independent bands, which can contain either both introns and exons or intron material only. Thus, a novel type of gene arrangement in polytene chromosomes was discovered; peculiarities of such genetic organization are discussed.


Subject(s)
Chromatin , Drosophila Proteins/genetics , Drosophila melanogaster/genetics , Genome , Introns , Polytene Chromosomes , Animals
8.
Chromosoma ; 129(1): 25-44, 2020 03.
Article in English | MEDLINE | ID: mdl-31820086

ABSTRACT

In Drosophila melanogaster, the chromatin of interphase polytene chromosomes appears as alternating decondensed interbands and dense black or thin gray bands. Recently, we uncovered four principal chromatin states (4HMM model) in the fruit fly, and these were matched to the structures observed in polytene chromosomes. Ruby/malachite chromatin states form black bands containing developmental genes, whereas aquamarine chromatin corresponds to interbands enriched with 5' regions of ubiquitously expressed genes. Lazurite chromatin supposedly forms faint gray bands and encompasses the bodies of housekeeping genes. In this report, we test this idea using the X chromosome as the model and MSL1 as a protein marker of the lazurite chromatin. Our bioinformatic analysis indicates that in the X chromosome, it is only the lazurite chromatin that is simultaneously enriched for the proteins and histone marks associated with exons, transcription elongation, and dosage compensation. As a result of FISH and EM mapping of a dosage compensation complex subunit, MSL1, we for the first time provide direct evidence that lazurite chromatin forms faint gray bands. Our analysis proves that most housekeeping genes typically span from the interband (5' region of the gene) to the gray band (gene body). More rarely, active lazurite chromatin and inactive malachite/ruby chromatin may be found within a common band, where both the housekeeping and the developmental genes reside together.


Subject(s)
Chromosome Banding , Drosophila melanogaster/genetics , Genes, Essential , Open Reading Frames , Polytene Chromosomes/genetics , Animals , Arabidopsis Proteins/metabolism , Chromatin/genetics , Computational Biology/methods , Drosophila Proteins/metabolism , Female , Gene Rearrangement , Histones/metabolism , In Situ Hybridization, Fluorescence , Ion Channels/metabolism , Male , Mutation , Protein Serine-Threonine Kinases/metabolism , Sex Chromosomes
9.
Curr Genomics ; 19(3): 179-191, 2018 Apr.
Article in English | MEDLINE | ID: mdl-29606905

ABSTRACT

This mini-review is devoted to the problem of the genetic meaning of the main polytene chromosome structures: bands and interbands. Generally, densely packed chromatin forms black bands, moderately condensed regions form grey loose bands, whereas decondensed regions of the genome appear as interbands. Recent progress in the annotation of the Drosophila genome and epigenome has made it possible to compare the banding pattern and the structural organization of genes, as well as their activity. This was greatly aided by our ability to establish the borders of bands and interbands on the physical map, which allowed us to perform comprehensive side-by-side comparisons of cytological, genetic and epigenetic maps and to uncover the association between the morphological structures and the functional domains of the genome. These studies largely conclude that interbands contain the 5'-ends of housekeeping genes that are active across all cell types. Interbands are enriched with proteins involved in transcription and nucleosome remodeling, as well as with active histone modifications. Notably, most of the replication origins map to interband regions. As for grey loose bands adjacent to interbands, they typically host the bodies of housekeeping genes. Thus, the bipartite structure composed of an interband and an adjacent grey band functions as a standalone genetic unit. Finally, black bands harbor tissue-specific genes with narrow temporal and tissue expression profiles. Thus, the uniform and permanent activity of interbands combined with the inactivity of genes in bands forms the basis of the universal banding pattern observed in various Drosophila tissues.

10.
Nat Commun ; 9(1): 875, 2018 02 28.
Article in English | MEDLINE | ID: mdl-29491423

ABSTRACT

Spatial organization of signalling events of the phytohormone auxin is fundamental for maintaining a dynamic transition from plant stem cells to differentiated descendants. The cambium, the stem cell niche mediating wood formation, fundamentally depends on auxin signalling but its exact role and spatial organization is obscure. Here we show that, while auxin signalling levels increase in differentiating cambium descendants, a moderate level of signalling in cambial stem cells is essential for cambium activity. We identify the auxin-dependent transcription factor ARF5/MONOPTEROS to cell-autonomously restrict the number of stem cells by directly attenuating the activity of the stem cell-promoting WOX4 gene. In contrast, ARF3 and ARF4 function as cambium activators in a redundant fashion from outside of WOX4-expressing cells. Our results reveal an influence of auxin signalling on distinct cambium features by specific signalling components and allow the conceptual integration of plant stem cell systems with distinct anatomies.


Subject(s)
Arabidopsis Proteins/metabolism , Arabidopsis/growth & development , Cambium/cytology , DNA-Binding Proteins/metabolism , Homeodomain Proteins/metabolism , Indoleacetic Acids/metabolism , Nuclear Proteins/metabolism , Transcription Factors/metabolism , Arabidopsis/genetics , Arabidopsis/metabolism , Arabidopsis Proteins/biosynthesis , Arabidopsis Proteins/genetics , Cell Proliferation/physiology , Gene Expression Profiling , Gene Expression Regulation, Plant/genetics , Homeodomain Proteins/biosynthesis , Homeodomain Proteins/genetics , Plant Growth Regulators/metabolism , Plants, Genetically Modified/metabolism , Signal Transduction , Stem Cells/cytology , Wood/cytology , Wood/growth & development
11.
Curr Genomics ; 18(2): 214-226, 2017 Apr.
Article in English | MEDLINE | ID: mdl-28367077

ABSTRACT

BACKGROUND: Recently, we analyzed genome-wide protein binding data for the Drosophila cell lines S2, Kc, BG3 and Cl.8 (modENCODE Consortium) and identified a set of 12 proteins enriched in the regions corresponding to interbands of salivary gland polytene chromosomes. Using these data, we developed a bioinformatic pipeline that partitioned the Drosophila genome into four chromatin types that we hereby refer to as aquamarine, lazurite, malachite and ruby. RESULTS: Here, we describe the properties of these chromatin types across different cell lines. We show that aquamarine chromatin tends to harbor transcription start sites (TSSs) and 5' untranslated regions (5'UTRs) of the genes, is enriched in diverse "open" chromatin proteins, histone modifications, nucleosome remodeling complexes and transcription factors. It encompasses most of the tRNA genes and shows enrichment for non-coding RNAs and miRNA genes. Lazurite chromatin typically encompasses gene bodies. It is rich in proteins involved in transcription elongation. Frequency of both point mutations and natural deletion breakpoints is elevated within lazurite chromatin. Malachite chromatin shows higher frequency of insertions of natural transposons. Finally, ruby chromatin is enriched for proteins and histone modifications typical for the "closed" chromatin. Ruby chromatin has a relatively low frequency of point mutations and is essentially devoid of miRNA and tRNA genes. Aquamarine and ruby chromatin types are highly stable across cell lines and have contrasting properties. Lazurite and malachite chromatin types also display characteristic protein composition, as well as enrichment for specific genomic features. We found that two types of chromatin, aquamarine and ruby, retain their complementary protein patterns in four Drosophila cell lines.

12.
J Bioinform Comput Biol ; 14(2): 1641009, 2016 04.
Article in English | MEDLINE | ID: mdl-27122321

ABSTRACT

Auxin is the major regulator of plant growth and development. It regulates gene expression via a family of transcription factors (ARFs) that bind to auxin responsive elements (AuxREs) in the gene promoters. The canonical AuxREs found in regulatory regions of many auxin responsive genes contain the TGTCTC core motif, whereas the ARF binding site is a degenerate TGTCNN with TGTCGG strongly preferred. Thus, two questions arise: which TGTCNN variants are functional AuxRE cores, and do different TGTCNN variants have distinct functional roles? In this study, we performed a meta-analysis of microarray data to reveal TGTCNN variants essential for auxin response and to characterize their functional features. Our results indicate that four TGTCNN motifs (TGTCTC, TGTCCC, TGTCGG, and TGTCTG) are associated with auxin up-regulation and two (TGTCGG, TGTCAT) with auxin down-regulation, but to a lesser extent. The genes having some of these motifs in their regulatory regions showed time-specific auxin response. Functional annotation of auxin up- and down-regulated genes also revealed GO terms specific for the auxin-regulated genes with certain TGTCNN variants in their promoters. Our results suggest that various TGTCNN motifs may play distinct roles in the auxin regulation of gene expression.


Subject(s)
Arabidopsis Proteins/genetics , Arabidopsis/genetics , Indoleacetic Acids/metabolism , Response Elements , 5' Untranslated Regions , Arabidopsis/metabolism , Arabidopsis Proteins/metabolism , Binding Sites , Gene Expression Regulation, Plant , Nucleotide Motifs , Oligonucleotide Array Sequence Analysis , Plant Growth Regulators/metabolism , Transcriptome
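
A search for the degenerate TGTCNN core described in the abstract above can be sketched in Python as a two-strand scan that records which hexamer variant occurs at each position. This is only an illustration of the motif definition, not the meta-analysis procedure of the study; the function names and the coordinate convention (0-based start of the hexamer on the forward strand) are assumptions.

```python
import re


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_tgtcnn(sequence):
    """Return sorted (start, strand, hexamer) tuples for every TGTCNN match.

    start is the 0-based forward-strand coordinate of the hexamer's leftmost
    base; hexamer is reported as read on the matching strand.
    """
    sequence = sequence.upper()
    hits = []
    for strand, seq in (("+", sequence), ("-", reverse_complement(sequence))):
        # The lookahead lets overlapping matches be reported as well.
        for match in re.finditer(r"(?=(TGTC[ACGT]{2}))", seq):
            start = match.start()
            if strand == "-":
                start = len(sequence) - start - 6  # map back to forward strand
            hits.append((start, strand, match.group(1)))
    return sorted(hits)
```

Tallying the third element of each hit (for instance with `collections.Counter`) gives per-variant counts for a promoter, which is the kind of input a variant-by-variant comparison such as the one in the abstract would start from.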
13.
Front Plant Sci ; 7: 2044, 2016.
Article in English | MEDLINE | ID: mdl-28119721

ABSTRACT

The plant hormone ethylene regulates numerous developmental processes and stress responses. Ethylene signaling proceeds via a linear pathway, which activates transcription factor (TF) EIN3, a primary transcriptional regulator of ethylene response. EIN3 influences gene expression upon binding to a specific sequence in gene promoters. This interaction, however, might be considerably affected by additional co-factors. In this work, we perform a whole-genome bioinformatics study to identify the impact of epigenetic factors on EIN3 functioning. The analysis of publicly available ChIP-Seq data on EIN3 binding in Arabidopsis thaliana showed a bimodal distribution of EIN3 binding regions (EBRs) in gene promoters. Besides a sharp peak in close proximity to the transcription start site, which is a common binding region for a wide variety of TFs, we found an additional extended peak in the distal promoter region. We characterized all EBRs with respect to their epigenetic status using a previously published genome-wide map of nine chromatin states in A. thaliana. We found that the implicit distal peak was associated with a specific chromatin state (referred to as chromatin state 4 in the primary source), which was only poorly represented in the pronounced proximal peak. Intriguingly, EBRs corresponding to this chromatin state 4 were significantly associated with ethylene response, unlike the others, which represent the overwhelming majority of EBRs and relate to the explicit proximal peak. Moreover, we found that specific EIN3 binding sequences predicted with a previously described model were enriched in the EBRs mapped to chromatin state 4, but not in the remaining ones. These results allow us to conclude that the interplay of genetic and epigenetic factors might cause the distinct modes of EIN3 regulation.

14.
Int J Genomics ; 2015: 260159, 2015.
Article in English | MEDLINE | ID: mdl-26417590

ABSTRACT

The expression level of each gene is controlled by its regulatory regions, which determine the precise regulation in a tissue-specific manner, according to the developmental stage of the body and the necessity of a response to external stimuli. Nucleotide substitutions in regulatory gene regions may modify the affinity of transcription factors to their specific DNA binding sites, affecting the transcription rates of genes. In our previous research, we found that genes controlling the sensory perception of smell and genes involved in antigen processing and presentation were overrepresented significantly among genes with high SNP contents in their promoter regions. The goal of our study was to reveal functional features of human genes containing extremely small numbers of SNPs in promoter regions. Two functional groups were found to be overrepresented among genes whose promoters did not contain SNPs: (1) genes involved in gene-specific transcription and (2) genes controlling chromatin organization. We revealed that the 5'-regulatory regions of genes encoding transcription factors and chromatin-modifying proteins were characterized by reduced genetic variability. One important exception from this rule refers to genes encoding transcription factors with zinc-coordinating DNA-binding domains (DBDs), which underwent extensive expansion in vertebrates, particularly, in primate evolution. Hence, we obtained new evidence for evolutionary forces shaping variability in 5'-regulatory regions of genes.

15.
Front Psychol ; 5: 247, 2014.
Article in English | MEDLINE | ID: mdl-24715883

ABSTRACT

The molecular mechanism of olfactory cognition is very complicated. Olfactory cognition is initiated by olfactory receptor proteins (odorant receptors), which are activated by olfactory stimuli (ligands). Olfactory receptors are the initial player in the signal transduction cascade producing a nerve impulse, which is transmitted to the brain. The sensitivity to a particular ligand depends on the expression level of multiple proteins involved in the process of olfactory cognition: olfactory receptor proteins, proteins that participate in signal transduction cascade, etc. The expression level of each gene is controlled by its regulatory regions, and especially, by the promoter [a region of DNA about 100-1000 base pairs long located upstream of the transcription start site (TSS)]. We analyzed single nucleotide polymorphisms using human whole-genome data from the 1000 Genomes Project and revealed an extremely high level of single nucleotide polymorphisms in promoter regions of olfactory receptor genes and HLA genes. We hypothesized that the high level of polymorphisms in olfactory receptor promoters was responsible for the diversity in regulatory mechanisms controlling the expression levels of olfactory receptor proteins. Such diversity of regulatory mechanisms may cause the great variability of olfactory cognition of numerous environmental olfactory stimuli perceived by human beings (air pollutants, human body odors, odors in culinary etc.). In turn, this variability may provide a wide range of emotional and behavioral reactions related to the vast variety of olfactory stimuli.

16.
BMC Genomics ; 15: 80, 2014 Jan 29.
Article in English | MEDLINE | ID: mdl-24472686

ABSTRACT

BACKGROUND: ChIP-Seq is widely used to detect genomic segments bound by transcription factors (TF), either directly at DNA binding sites (BSs) or indirectly via other proteins. Currently, there are many software tools implementing different approaches to identify TFBSs within ChIP-Seq peaks. However, their use for the interpretation of ChIP-Seq data is usually complicated by the absence of direct experimental verification, making it difficult both to set a threshold to avoid recognition of too many false-positive BSs, and to compare the actual performance of different models. RESULTS: Using ChIP-Seq data for FoxA2 binding loci in mouse adult liver and human HepG2 cells we compared FoxA binding-site predictions for four computational models of two fundamental classes: pattern matching based on existing training set of experimentally confirmed TFBSs (oPWM and SiteGA) and de novo motif discovery (ChIPMunk and diChIPMunk). To properly select prediction thresholds for the models, we experimentally evaluated affinity of 64 predicted FoxA BSs using EMSA that allows safely distinguishing sequences able to bind TF. As a result we identified thousands of reliable FoxA BSs within ChIP-Seq loci from mouse liver and human HepG2 cells. It was found that the performance of conventional position weight matrix (PWM) models was inferior with the highest false positive rate. On the contrary, the best recognition efficiency was achieved by the combination of SiteGA & diChIPMunk/ChIPMunk models, properly identifying FoxA BSs in up to 90% of loci for both mouse and human ChIP-Seq datasets. CONCLUSIONS: The experimental study of TF binding to oligonucleotides corresponding to predicted sites increases the reliability of computational methods for TFBS-recognition in ChIP-Seq data analysis. Regarding ChIP-Seq data interpretation, basic PWMs have inferior TFBS recognition quality compared to the more sophisticated SiteGA and de novo motif discovery methods. A combination of models from different principles allowed identification of proper TFBSs.


Subject(s)
Chromatin Immunoprecipitation , Computational Biology , Transcription Factors/metabolism , Animals , Binding Sites , Mice
17.
J Biomol Struct Dyn ; 32(1): 115-26, 2014.
Article in English | MEDLINE | ID: mdl-23384242

ABSTRACT

Similar to regularly spaced nucleosomes in chromatin, long tandem DNA arrays are composed of regularly alternating monomers that have almost identical primary DNA structures. Such a similarity in the structural organization makes these arrays especially interesting for studying the role of intrinsic DNA preferences in nucleosome positioning. We have studied the nucleosome formation potential of DNA tandem repeat families with different monomer lengths (ML). In total, 165 plant tandem repeat families from the PlantSat database (http://w3lamc.umbr.cas.cz/PlantSat/) were divided into two classes based on the number of nucleosome repeats in one DNA monomer. For predicting nucleosome formation potential, we developed the Phase method, which combines the advantages of multiple bioinformatics models. The Phase method was able to distinguish interfamily differences and intrafamily monomer variation and identify the influence of nucleotide context on nucleosome formation potential. Three main types of nucleosome arrangement in DNA tandem repeat arrays--regular, partially regular (partial), and flexible--were distinguished among a great variety of Phase profiles. The regular type, in which all nucleosomes of the monomer array are positioned in a context-dependent manner, is the most representative type of the class 1 families, with ML equal to or a multiple of the nucleosome repeat length (NRL). In the partially regular type, nucleotide context influences the positioning of only a subset of nucleosomes. The influence of the nucleotide context on nucleosome positioning has the least effect in the flexible type, which contains the greatest number of families (65). The majority of these families belong to class 2 and have nonmultiple ML to NRL ratios.


Subject(s)
Nucleosomes/genetics , Nucleotides/chemistry , Plants/genetics , Chromatin Assembly and Disassembly , Computational Biology , DNA, Plant/chemistry , DNA, Plant/genetics , Databases, Genetic , Plants/ultrastructure , Tandem Repeat Sequences
18.
BMC Genomics ; 15 Suppl 12: S4, 2014.
Article in English | MEDLINE | ID: mdl-25563792

ABSTRACT

Auxin responsive elements (AuxREs) were found in upstream regions of target genes for ARFs (auxin response factors). While ChIP-seq data for most ARFs are still unavailable, prediction of potential AuxREs is restricted by consensus models that detect too many false positive sites. Using sequence analysis of experimentally proven AuxREs, we revealed both an extended nucleotide context pattern for the AuxRE itself and three distinct types of its coupling motifs (Y-patch, AuxRE-like, and ABRE-like), which together with the AuxRE may form composite elements. Computational analysis of the genome-wide distribution of the predicted AuxREs and their impact on auxin responsive gene expression allowed us to conclude that: (1) AuxREs are enriched around the transcription start site with the maximum density in the 5'UTR; (2) AuxREs mediate auxin responsive up-regulation, not down-regulation; (3) directly oriented single AuxREs and reverse multiple AuxREs are mostly associated with auxin responsiveness. In the composite AuxRE elements associated with auxin response, ABRE-like and Y-patch motifs are 5'-flanking or overlapping the AuxRE, whereas the AuxRE-like motif is 3'-flanking. The specificity in location and orientation of the coupling elements suggests them as potential binding sites for ARF partners.


Subject(s)
Arabidopsis/genetics , Indoleacetic Acids/metabolism , Response Elements , Arabidopsis/metabolism , Gene Expression Regulation, Plant , Genome, Plant , Genomics , Nucleotide Motifs
19.
Methods Mol Biol ; 760: 251-67, 2011.
Article in English | MEDLINE | ID: mdl-21780002

ABSTRACT

The recognition of transcription factor binding sites (TFBSs) is the first step on the way to deciphering the DNA regulatory code. A large variety of computational approaches and corresponding in silico tools for TFBS recognition are available, each having their own advantages and shortcomings. This chapter provides a brief tutorial to assist end users in the application of these tools for functional characterization of genes.


Subject(s)
Computational Biology/methods , Computer Simulation , Models, Biological , Transcription Factors/metabolism , Animals , Binding Sites/genetics , Humans , Information Storage and Retrieval , Internet , Software
20.
BMC Bioinformatics ; 8: 481, 2007 Dec 19.
Article in English | MEDLINE | ID: mdl-18093302

ABSTRACT

BACKGROUND: Reliable transcription factor binding site (TFBS) prediction methods are essential for computer annotation of large amounts of genome sequence data. However, current methods to predict TFBSs are hampered by the high false-positive rates that occur when only sequence conservation at the core binding-sites is considered. RESULTS: To improve this situation, we have quantified the performance of several Position Weight Matrix (PWM) algorithms, using exhaustive approaches to find their optimal length and position. We applied these approaches to bio-medically important TFBSs involved in the regulation of cell growth and proliferation as well as in inflammatory, immune, and antiviral responses (NF-kappaB, ISGF3, IRF1, STAT1), obesity and lipid metabolism (PPAR, SREBP, HNF4), and regulation of steroidogenic (SF-1) and cell cycle (E2F) gene expression. We have also gained extra specificity using a method, entitled SiteGA, which takes into account structural interactions within TFBS core and flanking regions, using a genetic algorithm (GA) with a discriminant function of locally positioned dinucleotide (LPD) frequencies. To ensure higher confidence in our approach, we applied resampling (jackknife and bootstrap) tests for the comparison; optimized PWM and SiteGA showed similar recognition performance. Then we applied SiteGA and optimized PWMs (both separately and together) to sequences in the Eukaryotic Promoter Database (EPD). The resulting SiteGA recognition models can now be used to search sequences for BSs using the web tool, SiteGA. Analysis of dependencies between close and distant LPDs revealed by SiteGA models has shown that the most significant correlations are between close LPDs, and are generally located in the core (footprint) region. A greater number of less significant correlations are mainly between distant LPDs, which spanned both core and flanking regions. When SiteGA and optimized PWM models were applied together, this substantially reduced false positives at least at higher stringencies. CONCLUSION: Based on this analysis, SiteGA adds substantial specificity even to optimized PWMs and may be considered for large-scale genome analysis. It adds to the range of techniques available for TFBS prediction, and EPD analysis has led to a list of genes which appear to be regulated by the above TFs.


Subject(s)
Algorithms , DNA/genetics , Protein Interaction Mapping/methods , Sequence Alignment/methods , Sequence Analysis, DNA/methods , Transcription Factors/genetics , Base Sequence , Binding Sites , Computer Simulation , Discriminant Analysis , Models, Genetic , Molecular Sequence Data , Protein Binding