
Structural features of DNA that determine RNA polymerase II core promoter.

Il'icheva, Irina A; Khodikov, Mingian V; Poptsova, Maria S; Nechipurenko, Dmitry Yu; Nechipurenko, Yury D; Grokhovsky, Sergei L.

BMC Genomics ; 17(1): 973, 2016 Nov 25.

Article in English | MEDLINE | ID: mdl-27884105

ABSTRACT

BACKGROUND: The general structure and action of the RNA polymerase machinery are remarkably similar across all eukaryotes and archaea, despite the diversity of core promoter sequences in different species. The goal of our work is to find common characteristics of a DNA region that define it as a promoter for RNA polymerase II (Pol II). RESULTS: We analysed the profiles of a large number of physical and structural characteristics, averaged over representative sets of Pol II minimal core promoters from evolutionarily divergent species of animals, plants and unicellular fungi. In addition to the characteristics defined at the base-pair steps, we use, for the first time, profiles of the ultrasonic cleavage and DNase I cleavage indexes, which are informative about the internal properties of each complementary strand. CONCLUSIONS: DNA of the core promoters of metazoans and Schizosaccharomyces pombe has a similar structural organization. Its mechanical and 3D structural characteristics have singular properties at the positions of the TATA-box: the minor groove is broadened and conformational motion is decreased in that region. In metazoans, special characteristics of conformational behavior are revealed in the region that connects the end of the TATA-box and the transcription start site (TSS). The intensities of conformational motions in the complementary strands change periodically in opposite phases. These features are most noticeable in mammals and are lacking in the core promoters of S. pombe. The profiles of Saccharomyces cerevisiae core promoters differ significantly: their singular region is shifted down, which points to the uniqueness of their structural organization. The results obtained may be useful in genetic engineering for artificial modulation of promoter strength.

Subject(s)

Promoter Regions, Genetic , RNA Polymerase II/chemistry , RNA Polymerase II/genetics , Animals , Base Sequence , DNA Cleavage , Genetic Variation , Humans , Nucleotide Motifs , Schizosaccharomyces/genetics , TATA Box , Transcription Initiation Site

Non-random DNA fragmentation in next-generation sequencing.

Poptsova, Maria S; Il'icheva, Irina A; Nechipurenko, Dmitry Yu; Panchenko, Larisa A; Khodikov, Mingian V; Oparina, Nina Y; Polozov, Robert V; Nechipurenko, Yury D; Grokhovsky, Sergei L.

Sci Rep ; 4: 4532, 2014 Mar 31.

Article in English | MEDLINE | ID: mdl-24681819

ABSTRACT

Next Generation Sequencing (NGS) technology is based on cutting DNA into small fragments and sequencing them in a massively parallel fashion. The multiple overlapping segments, termed "reads", are assembled into a contiguous sequence. To reduce sequencing errors, every genome region should be sequenced several dozen times. This sequencing approach is based on the assumption that genomic DNA breaks are random and sequence-independent. However, we previously showed that for sonicated restriction DNA fragments the rates of double-stranded breaks depend on the nucleotide sequence. In this work we analyzed genomic reads from NGS data and discovered that fragmentation methods based on the action of hydrodynamic forces on DNA produce a similar bias. Taking this non-random DNA fragmentation into account may help reveal which factors influence the non-uniform coverage of various genomic regions, and to what extent.
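The idea of sequence-dependent breakage can be illustrated with a small Python sketch. It is not the authors' pipeline; it simply compares how often each dinucleotide spans an inferred break position (taken here as the 0-based start coordinate of a mapped read) with how often that dinucleotide occurs in the reference. The function name `break_site_enrichment` and the input layout are assumptions made for this illustration.

```python
from collections import Counter

def break_site_enrichment(genome, read_starts):
    """Ratio of observed to background frequency for the dinucleotide
    spanning each break (positions p-1 and p, with p a 0-based read start).

    Hypothetical helper: a ratio near 1 is what random fragmentation
    would give; values far from 1 suggest sequence-dependent breakage.
    """
    genome = genome.upper()
    # Background: every overlapping dinucleotide in the reference.
    background = Counter(genome[i:i + 2] for i in range(len(genome) - 1))
    # Observed: the dinucleotide straddling each break position.
    observed = Counter(
        genome[p - 1:p + 1] for p in read_starts if 0 < p < len(genome)
    )
    n_obs = sum(observed.values())
    if n_obs == 0:
        return {}
    n_bg = sum(background.values())
    return {
        dinuc: (count / n_obs) / (background[dinuc] / n_bg)
        for dinuc, count in observed.items()
    }

if __name__ == "__main__":
    toy_genome = "ACGTACGTCG"
    toy_starts = [2, 6, 9]  # every toy break falls inside a CG step
    print(break_site_enrichment(toy_genome, toy_starts))
```

On real data the read starts would come from an alignment file, and strand orientation would need to be handled; both are left out of this sketch.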

Using comparative genome analysis to identify problems in annotated microbial genomes.

Poptsova, Maria S; Gogarten, J Peter.

Microbiology (Reading) ; 156(Pt 7): 1909-1917, 2010 Jul.

Article in English | MEDLINE | ID: mdl-20430813

ABSTRACT

Genome annotation is a tedious task that is mostly done by automated methods; however, the accuracy of these approaches has been questioned since the beginning of the sequencing era. Genome annotation is a multilevel process, and errors can emerge at different stages: during sequencing, as a result of gene-calling procedures, and in the process of assigning gene functions. Missed or wrongly annotated genes differentially impact different types of analyses. Here we discuss and demonstrate how the methods of comparative genome analysis can refine annotations by locating missing orthologues. We also discuss possible reasons for errors and show that the second-generation annotation systems, which combine multiple gene-calling programs with similarity-based methods, perform much better than the first annotation tools. Since old errors may propagate to the newly sequenced genomes, we emphasize that the problem of continuously updating popular public databases is an urgent and unresolved one. Due to the progress in genome-sequencing technologies, automated annotation techniques will remain the main approach in the future. Researchers need to be aware of the existing errors in the annotation of even well-studied genomes, such as Escherichia coli, and consider additional quality control for their results.

Subject(s)

Bacteria/genetics , Computational Biology/standards , Fungi/genetics , Genome, Bacterial , Genome, Fungal , Bacteria/chemistry , Computational Biology/methods , Fungi/chemistry , Software

Hidden chromosome symmetry: in silico transformation reveals symmetry in 2D DNA walk trajectories of 671 chromosomes.

Poptsova, Maria S; Larionov, Sergei A; Ryadchenko, Eugeny V; Rybalko, Sergei D; Zakharov, Ilya A; Loskutov, Alexander.

PLoS One ; 4(7): e6396, 2009 Jul 28.

Article in English | MEDLINE | ID: mdl-19636424

ABSTRACT

Maps of the 2D DNA walk of 671 examined chromosomes show a change in compositional complexity from a symmetrical half-turn in bacteria to pseudo-random trajectories in archaea, fungi and humans. In silico transformation of gene order and strand position returns most of the analyzed chromosomes to a symmetrical bacterial-like state with one transition point. The transformed chromosomal sequences also reveal remarkable segmental compositional symmetry between regions from different strands located equidistantly from the transition point. Despite extensive chromosome rearrangement, the ratio of gene numbers on opposite strands for chromosomes of different taxa varies within narrow limits around unity, with Pearson coefficient r = 0.98. A similar relation is observed for total gene length (r = 0.86) and for cumulative GC (r = 0.95) and AT (r = 0.97) skews. This is also true for human coding sequences (CDS), which comprise only several percent of the entire chromosome length. We found that the frequency distributions of the lengths of gene clusters located contiguously on the same strand have close values for both strands. Eukaryotic gene distribution is believed to be non-random. The contribution of different subsystems to the noted symmetries and distributions, and evolutionary aspects of symmetry, are discussed.
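A minimal Python sketch of a 2D DNA walk and a cumulative skew is given here for orientation. The abstract does not state which direction each nucleotide maps to, so the `STEPS` table below is an assumed convention (A and T move along x, G and C along y), and the function names are hypothetical.

```python
# Assumed step convention; the abstract does not specify the mapping.
STEPS = {"A": (-1, 0), "T": (1, 0), "G": (0, 1), "C": (0, -1)}

def dna_walk_2d(seq):
    """Return the list of (x, y) points visited by the walk, starting at the origin."""
    x = y = 0
    path = [(0, 0)]
    for base in seq.upper():
        dx, dy = STEPS.get(base, (0, 0))  # ambiguous bases (e.g. N) do not move
        x += dx
        y += dy
        path.append((x, y))
    return path

def cumulative_skew(seq, plus, minus):
    """Running count of `plus` bases minus `minus` bases along the sequence.

    cumulative_skew(seq, "G", "C") gives a cumulative GC skew,
    cumulative_skew(seq, "A", "T") a cumulative AT skew.
    """
    total = 0
    profile = []
    for base in seq.upper():
        if base == plus:
            total += 1
        elif base == minus:
            total -= 1
        profile.append(total)
    return profile

if __name__ == "__main__":
    print(dna_walk_2d("GATC"))
    print(cumulative_skew("GGCA", "G", "C"))
```

With this convention the y coordinate of the walk equals the cumulative GC skew, which is why a single compositional transition point shows up as a turn in the trajectory.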

Subject(s)

Chromosomes, Human , DNA/genetics , Animals , Genetic Vectors , Humans

BranchClust: a phylogenetic algorithm for selecting gene families.

Poptsova, Maria S; Gogarten, J Peter.

BMC Bioinformatics ; 8: 120, 2007 Apr 10.

Article in English | MEDLINE | ID: mdl-17425803

ABSTRACT

BACKGROUND: Automated methods for assembling families of orthologous genes include those based on sequence similarity scores and those based on phylogenetic approaches. The former are easy to automate, but they usually do not distinguish between paralogs and orthologs or they restrict the number of taxa. Phylogenetic methods are often based on reconciliation of a gene tree with a known rooted species tree; a limitation of this approach, especially in the case of prokaryotes, is that the species tree is often unknown, and that the branching order between related organisms frequently remains unresolved in analyses of single gene families. RESULTS: Here we describe an algorithm for the automated selection of orthologous genes that recognizes orthologous genes from different species in a phylogenetic tree for any number of taxa. The algorithm is capable of distinguishing complete (containing all taxa) and incomplete (not containing all taxa) families and recognizes in- and outparalogs. The BranchClust algorithm is implemented in Perl with the use of the BioPerl module for parsing trees and is freely available at http://bioinformatics.org/branchclust. CONCLUSION: BranchClust outperforms the Reciprocal Best Blast hit method in selecting more sets of putatively orthologous genes. In the test cases examined, the correctness of the selected families and of the identified in- and outparalogs was confirmed by inspection of the pertinent phylogenetic trees.
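For context, the Reciprocal Best Blast hit baseline mentioned in the conclusion can be sketched in a few lines of Python. This is not BranchClust itself (which is a Perl program working on phylogenetic trees), and the input layout, a dictionary of pairwise similarity scores keyed by `(query, subject)`, is an assumption made for the illustration.

```python
def best_hits(scores):
    """Map each query to its highest-scoring subject.

    `scores` is a dict {(query, subject): score}; ties keep the first hit seen.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}

def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Pairs (a, b) where b is the best hit of a and a is the best hit of b."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)

if __name__ == "__main__":
    # Toy scores for two hypothetical genomes A and B.
    a_vs_b = {("a1", "b1"): 200, ("a1", "b2"): 150, ("a2", "b2"): 90, ("a3", "b1"): 80}
    b_vs_a = {("b1", "a1"): 198, ("b2", "a1"): 150, ("b2", "a2"): 95}
    print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

In the toy data, `a2` and `b2` are not paired because `b2` scores higher against `a1`; cases like this, where paralogs compete for the best hit, are the kind a pairwise method can miss.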

Subject(s)

Algorithms , Evolution, Molecular , Multigene Family/genetics , Phylogeny , Proteome/genetics , Sequence Analysis, DNA/methods , Software , Base Sequence , Cluster Analysis , Conserved Sequence , Molecular Sequence Data , Sequence Alignment/methods , Sequence Homology, Amino Acid

The power of phylogenetic approaches to detect horizontally transferred genes.

Poptsova, Maria S; Gogarten, J Peter.

BMC Evol Biol ; 7: 45, 2007 Mar 21.

Article in English | MEDLINE | ID: mdl-17376230

ABSTRACT

BACKGROUND: Horizontal gene transfer plays an important role in evolution because it sometimes allows recipient lineages to adapt to new ecological niches. High gene transfer frequencies were inferred for prokaryotic and early eukaryotic evolution. Does horizontal gene transfer also impact phylogenetic reconstruction of the evolutionary history of genomes and organisms? The answer to this question depends at least in part on the actual gene transfer frequencies and on the ability to weed out transferred genes from further analyses. Are the detected transfers mainly false positives, or are they the tip of an iceberg of many transfer events most of which go undetected by current methods? RESULTS: Phylogenetic detection methods appear to be the method of choice to infer gene transfers, especially for ancient transfers and those followed by orthologous replacement. Here we explore how well some of these methods perform using in silico transfers between the terminal branches of a gammaproteobacterial, genome-based phylogeny. For the experiments performed here, on average the AU test at a 5% significance level detects 90.3% of the transfers and 91% of the exchanges as significant. Using the Robinson-Foulds distance, only 57.7% of the exchanges and 60% of the donations were identified as significant. Analyses using bipartition spectra appeared most successful in our test case. The power of detection was on average 97% using a 70% cut-off and 94.2% with a 90% cut-off for identifying conflicting bipartitions, while the rate of false positives was below 4.2% and 2.1% for the two cut-offs, respectively. For all methods the detection rates improved when more intervening branches separated donor and recipient. CONCLUSION: Rates of detected transfers should not be mistaken for the actual transfer rates; most analyses of gene transfers remain anecdotal. The method and significance level to identify potential gene transfer events represent a trade-off between the frequency of erroneous identification (false positives) and the power to detect actual transfer events.
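The Robinson-Foulds distance named in the results counts the bipartitions (splits of the taxon set induced by internal branches) that are present in one tree but not in the other. The Python sketch below assumes each tree has already been reduced to its list of non-trivial splits, each given as the set of taxa on one side; that representation and the function names are assumptions for the illustration, and the significance testing used in the study is not reproduced.

```python
def canonical_split(split, taxa):
    """Return the side of the bipartition that excludes the reference taxon.

    The reference is the alphabetically smallest taxon, so {A, B} and its
    complement map to the same canonical form.
    """
    side = frozenset(split)
    reference = min(taxa)
    return frozenset(taxa) - side if reference in side else side

def robinson_foulds(splits_a, splits_b, taxa):
    """Number of bipartitions found in exactly one of the two trees."""
    a = {canonical_split(s, taxa) for s in splits_a}
    b = {canonical_split(s, taxa) for s in splits_b}
    return len(a ^ b)

if __name__ == "__main__":
    taxa = {"A", "B", "C", "D", "E"}
    # Tree 1: ((A,B),C,(D,E))   Tree 2: ((A,C),B,(D,E))
    tree1 = [{"A", "B"}, {"D", "E"}]
    tree2 = [{"A", "C"}, {"D", "E"}]
    print(robinson_foulds(tree1, tree2, taxa))
```

In the toy example the two trees share the split separating D and E from the rest and disagree on one split each, giving a distance of 2. A gene whose tree conflicts with the genome-based phylogeny in this way is a candidate transfer, subject to the false-positive trade-off described in the conclusion.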

Subject(s)

Classification/methods , Gammaproteobacteria/genetics , Gene Transfer, Horizontal/genetics , Genome, Bacterial , Multigene Family/genetics , Phylogeny , Cluster Analysis , Computational Biology , Likelihood Functions , Models, Genetic
