Results 1 - 20 of 22
1.
Int J Mol Sci ; 24(13)2023 Jun 27.
Article in English | MEDLINE | ID: mdl-37445918

ABSTRACT

The dynamic processes operating on genomic DNA, such as gene expression and cellular division, lead inexorably to topological challenges in the form of entanglements, catenanes, knots, "bubbles", R-loops, and other outcomes of supercoiling and helical disruption. The resolution of toxic topological stress is the function attributed to DNA topoisomerases. A prominent example is the negative supercoiling (nsc) trailing processive enzymes such as DNA and RNA polymerases. The multiple equilibrium states that nscDNA can adopt by redistribution of helical twist and writhe include the left-handed double-helical conformation known as Z-DNA. Thirty years ago, one of our labs isolated a protein from Drosophila cells and embryos with a 100-fold greater affinity for Z-DNA than for B-DNA, and identified it as topoisomerase II (gene Top2, orthologous to the human UniProt proteins TOP2A and TOP2B). GTP increased the affinity and selectivity for Z-DNA even further and also led to inhibition of the isomerase enzymatic activity. An allosteric mechanism was proposed, in which topoII acts as a Z-DNA-binding protein (ZBP) to stabilize given states of topological (sub)domains and associated multiprotein complexes. We have now explored this possibility by comprehensive bioinformatic analyses of the available protein sequences of topoII representing organisms covering the whole tree of life. Multiple alignment of these sequences revealed an extremely high level of evolutionary conservation, including a winged-helix protein segment, here denoted as Zτ, constituting the putative structural homolog of Zα, the canonical Z-DNA/Z-RNA binding domain previously identified in the interferon-inducible RNA Adenosine-to-Inosine-editing deaminase, ADAR1p150. In contrast to Zα, which is separate from the protein segment responsible for catalysis, Zτ encompasses the active site tyrosine of topoII; a GTP-binding site and a GxxG sequence motif are in close proximity. 
Quantitative Zτ-Zα similarity comparisons and molecular docking with interaction scoring further supported the "B-Z-topoII hypothesis" and have led to an expanded mechanism for topoII function incorporating the recognition of Z-DNA segments ("Z-flipons") as an inherent and essential element. We further propose that the two Zτ domains of the topoII homodimer exhibit a single-turnover "conformase" activity on given G(ate) B-DNA segments ("Z-flipins"), inducing their transition to the left-handed Z-conformation. Inasmuch as the topoII-Z-DNA complexes are isomerase inactive, we infer that they fulfill important structural roles in key processes such as mitosis. Topoisomerases are preeminent targets of anti-cancer drug discovery, and we anticipate that detailed elucidation of their structural-functional interactions with Z-DNA and GTP will facilitate the design of novel, more potent and selective anti-cancer chemotherapeutic agents.


Subject(s)
DNA, B-Form , DNA, Z-Form , Humans , Molecular Docking Simulation , DNA/chemistry , DNA Topoisomerases, Type II/genetics , DNA Topoisomerases, Type II/metabolism , Guanosine Triphosphate , Adenosine Deaminase/metabolism
2.
BMC Res Notes ; 16(1): 109, 2023 Jun 20.
Article in English | MEDLINE | ID: mdl-37340477

ABSTRACT

OBJECTIVE: Chalcone synthase (CHS) catalyzes the initial step of flavonoid biosynthesis. The CHS-encoding gene is well studied in numerous plant species. Rapidly growing sequence databases contain hundreds of CHS entries that are the result of automatic annotation. In this study, we evaluated the apparent multiplication of CHS domains in CHS gene models of four plant species. MAIN FINDINGS: CHS genes with an apparent triplication of the CHS domain-encoding part were discovered through database searches. Such genes were found in Macadamia integrifolia, Musa balbisiana, Musa troglodytarum, and Nymphaea colorata. A manual inspection of the CHS gene models in these four species with massive RNA-seq data suggests that these gene models are the result of artificial fusions in the annotation process. While there are hundreds of seemingly correct CHS records in the databases, it is not clear why these annotation artifacts appeared.


Subject(s)
Acyltransferases , Artifacts , Acyltransferases/genetics , Plants
3.
Methods Mol Biol ; 2642: 331-361, 2023.
Article in English | MEDLINE | ID: mdl-36944887

ABSTRACT

Epigenetics deals with changes in gene expression that are not caused by modifications in the primary sequence of nucleic acids. These changes beyond primary structures of nucleic acids not only include DNA/RNA methylation, but also other reversible conversions, together with histone modifications or RNA interference. In addition, under particular conditions (such as specific ion concentrations or protein-induced stabilization), the right-handed double-stranded DNA helix (B-DNA) can form noncanonical structures commonly described as "non-B DNA" structures. These structures comprise, for example, cruciforms, i-motifs, triplexes, and G-quadruplexes. Their formation often leads to significant differences in replication and transcription rates. Noncanonical RNA structures have also been documented to play important roles in translation regulation and the biology of noncoding RNAs. In human and animal studies, the frequency and dynamics of noncanonical DNA and RNA structures are intensively investigated, especially in the field of cancer research and neurodegenerative diseases. In contrast, noncanonical DNA and RNA structures in plants have been on the fringes of interest for a long time and only a few studies deal with their formation, regulation, and physiological importance for plant stress responses. Herein, we present a review focused on the main fields of epigenetics in plants and their possible roles in stress responses and signaling, with special attention dedicated to noncanonical DNA and RNA structures.


Subject(s)
G-Quadruplexes , Nucleic Acids , Animals , Humans , DNA/genetics , DNA/chemistry , Epigenesis, Genetic , RNA/genetics , RNA/chemistry , Plants/genetics
4.
Plants (Basel) ; 12(5)2023 Feb 28.
Article in English | MEDLINE | ID: mdl-36903937

ABSTRACT

The opium poppy's ability to produce various alkaloids is both useful and problematic. Breeding of new varieties with varying alkaloid content is therefore an important task. In this paper, the breeding technology of new low morphine poppy genotypes, based on a combination of a TILLING approach and single-molecule real-time NGS sequencing, is presented. Verification of the mutants in the TILLING population was obtained using RT-PCR and HPLC methods. Only three of the single-copy genes of the morphine pathway among the eleven genes were used for the identification of mutant genotypes. Point mutations were obtained only in one gene (CNMT) while an insertion was obtained in the other (SalAT). Only a few expected transition SNPs from G:C to A:T were obtained. In the low morphine mutant genotype, the production of morphine was decreased to 0.1% from 1.4% in the original variety. A comprehensive description of the breeding process, a basic characterization of the main alkaloid content, and a gene expression profile for the main alkaloid-producing genes is provided. Difficulties with the TILLING approach are also described and discussed.

5.
Int J Mol Sci ; 23(23)2022 Nov 25.
Article in English | MEDLINE | ID: mdl-36499082

ABSTRACT

Plant miRNAs are powerful regulators of gene expression at the post-transcriptional level, as has been repeatedly demonstrated in several model plant species. miRNAs are considered to be key regulators of many developmental, homeostatic, and immune processes in plants. However, our understanding of plant miRNAs is still limited, despite the increasing number of published studies. This systematic review aims to summarize our current knowledge about miRNAs in spring barley (Hordeum vulgare), which is an important agronomical crop worldwide and also serves as a common monocot model for studying abiotic stress responses. This can help us to understand the connection between plant miRNAs and stresses in general, abiotic and otherwise. Finally, some future perspectives and open questions are summarized.


Subject(s)
Hordeum , MicroRNAs , Hordeum/genetics , Hordeum/metabolism , MicroRNAs/genetics , MicroRNAs/metabolism , Stress, Physiological/genetics , Plants/metabolism , Gene Expression Regulation, Plant
6.
Int J Mol Sci ; 23(12)2022 Jun 10.
Article in English | MEDLINE | ID: mdl-35742975

ABSTRACT

Photosynthetically active radiation (PAR) is an important environmental cue inducing the production of many secondary metabolites involved in plant oxidative stress avoidance and tolerance. To examine the complex role of PAR irradiance and specific spectral components on the accumulation of phenolic compounds (PheCs), we acclimated spring barley (Hordeum vulgare) to different spectral qualities (white, blue, green, red) at three irradiances (100, 200, 400 µmol m⁻² s⁻¹). We confirmed that blue light irradiance is essential for the accumulation of PheCs in secondary barley leaves (in UV-lacking conditions), which underpins the importance of photoreceptor signals (especially cryptochrome). Increasing blue light irradiance most effectively induced the accumulation of B-dihydroxylated flavonoids, probably due to the significantly enhanced expression of the F3'H gene. These changes in PheC metabolism led to a steeper increase in antioxidant activity than epidermal UV-A shielding in leaf extracts containing PheCs. In addition, we examined the possible role of miRNAs in the complex regulation of gene expression related to PheC biosynthesis.


Subject(s)
Hordeum , Ultraviolet Rays , Flavonoids/metabolism , Hordeum/genetics , Hordeum/metabolism , Light , Phenols/metabolism , Plant Leaves/genetics , Plant Leaves/metabolism
7.
Brief Bioinform ; 23(3)2022 05 13.
Article in English | MEDLINE | ID: mdl-35229157

ABSTRACT

SARS-CoV-2 is a novel positive-sense single-stranded RNA virus from the Coronaviridae family (genus Betacoronavirus), which has been established as causing the COVID-19 pandemic. The genome of SARS-CoV-2 is one of the largest among known RNA viruses, comprising at least 26 known protein-coding loci. Studies thus far have outlined the coding capacity of the positive-sense strand of the SARS-CoV-2 genome, which can be used directly for protein translation. However, it has been recently shown that transcribed negative-sense viral RNA intermediates that arise during viral genome replication from positive-sense viruses can also code for proteins. No studies have yet explored the potential for negative-sense SARS-CoV-2 RNA intermediates to contain protein-coding loci. Thus, using sequence and structure-based bioinformatics methodologies, we have investigated the presence and validity of putative negative-sense ORFs (nsORFs) in the SARS-CoV-2 genome. Nine nsORFs were discovered to contain strong eukaryotic translation initiation signals and high codon adaptability scores, and several of the nsORFs were predicted to interact with RNA-binding proteins. Evolutionary conservation analyses indicated that some of the nsORFs are deeply conserved among related coronaviruses. Three-dimensional protein modeling revealed the presence of higher order folding among all putative SARS-CoV-2 nsORFs, and subsequent structural mimicry analyses suggest similarity of the nsORFs to DNA/RNA-binding proteins and proteins involved in immune signaling pathways. Altogether, these results suggest the potential existence of still undescribed SARS-CoV-2 proteins, which may play an important role in the viral lifecycle and COVID-19 pathogenesis.
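The idea of looking for ORFs on the negative-sense intermediate can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the `min_codons` cutoff, and the plain AUG-to-stop ORF definition are assumptions made here for illustration, and the abstract's additional filters (initiation-signal strength, codon adaptability) are not modeled.

```python
# Illustrative sketch only; helper names and thresholds are hypothetical.
COMPLEMENT = str.maketrans("ACGU", "UGCA")
STOP_CODONS = {"UAA", "UAG", "UGA"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    return rna.upper().translate(COMPLEMENT)[::-1]


def find_orfs(rna, min_codons=3):
    """Return (start, end, sequence) for AUG...stop ORFs in the three forward frames.

    min_codons counts the codons before the stop codon, start codon included.
    """
    rna = rna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(rna) - 2, 3):
            codon = rna[i:i + 3]
            if start is None and codon == "AUG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, rna[start:i + 3]))
                start = None
    return orfs


def negative_sense_orfs(positive_rna, min_codons=3):
    """Scan the negative-sense strand, i.e. the reverse complement of the genome."""
    return find_orfs(reverse_complement(positive_rna), min_codons)
```

Coordinates returned by `negative_sense_orfs` refer to the negative-sense strand, so they would need to be mapped back if positions on the genomic strand are required.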


Subject(s)
COVID-19 , SARS-CoV-2 , COVID-19/genetics , Genome, Viral , Humans , Pandemics , RNA, Viral/chemistry , RNA, Viral/genetics , RNA-Binding Proteins/genetics , SARS-CoV-2/genetics
8.
Int J Mol Sci ; 23(2)2022 Jan 11.
Article in English | MEDLINE | ID: mdl-35054954

ABSTRACT

Z-DNA and Z-RNA are functionally important left-handed structures of nucleic acids, which play a significant role in several molecular and biological processes including DNA replication, gene expression regulation and viral nucleic acid sensing. Most proteins that have been proven to interact with Z-DNA/Z-RNA contain the so-called Zα domain, which is structurally well conserved. To date, only eight proteins with a Zα domain have been described within a few organisms (including human, mouse, Danio rerio, Trypanosoma brucei and some viruses). Therefore, this paper aimed to search for new Z-DNA/Z-RNA binding proteins in the complete PDB structures database and among the AlphaFold2 protein models. A structure-based similarity search found 14 proteins with a highly similar Zα domain structure among experimentally determined structures and 185 proteins with a putative Zα domain among the AlphaFold2 models. Structure-based alignment and molecular docking confirmed high functional conservation of amino acids involved in Z-DNA/Z-RNA binding, suggesting that Z-DNA/Z-RNA recognition may play an important role in a variety of cellular processes.


Subject(s)
DNA, Z-Form/chemistry , DNA-Binding Proteins/chemistry , Models, Molecular , Protein Interaction Domains and Motifs , RNA-Binding Proteins/chemistry , RNA/chemistry , Amino Acid Sequence , Binding Sites , DNA, Z-Form/metabolism , DNA-Binding Proteins/metabolism , Molecular Docking Simulation , Molecular Dynamics Simulation , Nucleic Acid Conformation , Protein Binding , Protein Conformation , RNA/metabolism , RNA-Binding Proteins/metabolism , Structure-Activity Relationship
9.
Plants (Basel) ; 10(9)2021 Sep 10.
Article in English | MEDLINE | ID: mdl-34579414

ABSTRACT

Water deficiency is one of the most significant abiotic stresses that negatively affects growth and reduces crop yields worldwide. Most research is focused on model plants and/or the most agriculturally important crops. In this research, drought stress was applied during the first week of germination to two varieties of Papaver somniferum (the opium poppy), a non-model plant species, that differ in their responses to drought stress. After sowing, the poppy seedlings were immediately subjected to drought stress for 7 days. We conducted a large-scale transcriptomic and proteomic analysis of the drought stress response. First, we found that the transcriptomic and proteomic profiles differ significantly. However, the most significant findings are the identification of key genes and proteins with significantly different expression relating to drought stress, e.g., the heat-shock protein family, dehydration responsive element-binding transcription factors, ubiquitin E3 ligase, and others. In addition, metabolic pathway analysis showed that these genes and proteins were part of several biosynthetic pathways most significantly related to photosynthetic processes and oxidative stress responses. A future study will focus on a detailed analysis of key genes and the development of selection markers for the determination of drought-resistant varieties and the breeding of new resistant lineages.

10.
Int J Mol Sci ; 22(16)2021 Aug 07.
Article in English | MEDLINE | ID: mdl-34445220

ABSTRACT

Recently, the quest for the mythical fountain of youth has produced extensive research programs that aim to extend the healthy lifespan of humans. Despite advances in our understanding of the aging process, the surprisingly extended lifespan and cancer resistance of some animal species remain unexplained. The p53 protein plays a crucial role in tumor suppression, tissue homeostasis, and aging. Long-lived, cancer-free African elephants have 20 copies of the TP53 gene, including 19 retrogenes (38 alleles), which are partially active, whereas humans possess only one copy of TP53 and have an estimated cancer mortality rate of 11-25%. The mechanism through which p53 contributes to the resolution of Peto's paradox in Animalia remains vague. Thus, in this work, we took advantage of the available datasets and inspected the p53 amino acid sequence of phylogenetically related organisms that show variations in their lifespans. We discovered new correlations between specific amino acid deviations in p53 and the lifespans across different animal species. We found that species with extended lifespans have certain characteristic amino acid substitutions in the p53 DNA-binding domain that alter its function, as inferred from the Phenotypic Annotation of p53 Mutations using the PROVEAN tool or the SWISS-MODEL workflow. In addition, the loop 2 region of the human p53 DNA-binding domain was identified as the longest region that was associated with longevity. The 3D model revealed variations in the loop 2 structure in long-lived species when compared with human p53. Our findings show a direct association between specific amino acid residues in p53 protein, changes in p53 functionality, and the extended animal lifespan, and further highlight the importance of p53 protein in aging.


Subject(s)
Databases, Genetic , Gene Dosage , Longevity , Models, Molecular , Animals , Protein Domains , Protein Structure, Secondary , Species Specificity , Tumor Suppressor Protein p53/chemistry , Tumor Suppressor Protein p53/genetics , Tumor Suppressor Protein p53/metabolism
11.
Int J Mol Sci ; 22(14)2021 Jul 09.
Article in English | MEDLINE | ID: mdl-34299001

ABSTRACT

G-quadruplexes have long been perceived as rare and physiologically unimportant nucleic acid structures. However, several studies have revealed their importance in molecular processes, suggesting their possible role in replication and gene expression regulation. Pathways involving G-quadruplexes are intensively studied, especially in the context of human diseases, while their involvement in gene expression regulation in plants remains largely unexplored. Here, we conducted a bioinformatic study and performed a complex circular dichroism measurement to identify a stable G-quadruplex in the gene RPB1, coding for the RNA polymerase II large subunit. We found that this G-quadruplex-forming locus is highly evolutionarily conserved amongst plants sensu lato (Archaeplastida) that share a common ancestor more than one billion years old. Finally, we discussed a new hypothesis regarding G-quadruplexes interacting with UV light in plants to potentially form an additional layer of the regulatory network.


Subject(s)
G-Quadruplexes , Plant Proteins/chemistry , Plants/chemistry , RNA Polymerase II/chemistry , Amino Acid Sequence , Arabidopsis/chemistry , Arabidopsis/genetics , Arabidopsis/radiation effects , Circular Dichroism , Computational Biology , Evolution, Molecular , G-Quadruplexes/radiation effects , Gene Expression Regulation, Plant/genetics , Glaucophyta/chemistry , Glaucophyta/genetics , Glaucophyta/radiation effects , Phylogeny , Plant Proteins/genetics , Plant Proteins/radiation effects , Plants/genetics , Plants/radiation effects , RNA Polymerase II/genetics , Rhodophyta/chemistry , Rhodophyta/genetics , Rhodophyta/radiation effects , Sequence Alignment , Ultraviolet Rays
12.
Brief Bioinform ; 22(5)2021 09 02.
Article in English | MEDLINE | ID: mdl-33837760

ABSTRACT

In a recently published paper, we have found that SARS-CoV-2 hot-spot mutations are significantly associated with inverted repeat loci and CG dinucleotides. However, fast-spreading strains with new mutations (so-called mink farm mutations, England mutations and Japan mutations) have been recently described. We used the new datasets to check the positioning of mutation sites in genomes of the new SARS-CoV-2 strains. Using an open-access Palindrome analyzer tool, we found mutations in these new strains to be significantly enriched in inverted repeat loci.
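To make the notion of "mutations enriched in inverted repeat loci" concrete, here is a minimal Python sketch that finds perfect inverted repeats and reports which mutation sites fall inside them. It is not the Palindrome analyzer implementation: the arm length, the spacer limit, the absence of mismatch tolerance, and all function names are assumptions chosen for illustration, and no significance test is included.

```python
# Illustrative sketch only; parameters and names are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq, arm=6, max_spacer=4):
    """Return (start, left_arm, spacer_length) for perfect inverted repeats."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * arm + 1):
        left = seq[i:i + arm]
        target = reverse_complement(left)
        for spacer in range(max_spacer + 1):
            j = i + arm + spacer  # start of the candidate right arm
            if j + arm > len(seq):
                break
            if seq[j:j + arm] == target:
                hits.append((i, left, spacer))
    return hits


def sites_in_repeats(sites, hits, arm):
    """Return the 0-based sites that lie inside any repeat (both arms plus spacer)."""
    inside = []
    for site in sites:
        for start, _left, spacer in hits:
            if start <= site < start + 2 * arm + spacer:
                inside.append(site)
                break
    return inside
```

An enrichment claim would additionally require comparing the fraction of mutation sites inside repeats with the fraction of the genome covered by repeats, which this sketch leaves out.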


Subject(s)
Mutation , SARS-CoV-2/genetics , COVID-19/virology , Genome, Viral , Humans
13.
Int J Mol Sci ; 22(2)2021 Jan 18.
Article in English | MEDLINE | ID: mdl-33477647

ABSTRACT

Nucleic acid-binding proteins are traditionally divided into two categories: those that bind DNA and those that bind RNA. In the light of new knowledge, this categorization should be abandoned because a large proportion of proteins can bind both DNA and RNA. Other, even more important features of nucleic acid-binding proteins are their so-called sequence or structure specificities. Proteins able to bind nucleic acids in a sequence-specific manner usually contain one or more of the well-defined structural motifs (zinc finger, leucine zipper, helix-turn-helix, or helix-loop-helix). In contrast, many proteins do not recognize nucleic acid sequence but rather local DNA or RNA structures (G-quadruplexes, i-motifs, triplexes, cruciforms, left-handed DNA/RNA forms, and others). Finally, there are also proteins recognizing both sequence and local structural properties of nucleic acids (e.g., the well-known tumor suppressor p53). In this mini-review, we aim to summarize current knowledge about the amino acid composition of various types of nucleic acid-binding proteins with a special focus on significant enrichment and/or depletion in each category.


Subject(s)
DNA-Binding Proteins/genetics , DNA/ultrastructure , Nucleic Acid Conformation , RNA/ultrastructure , Amino Acid Sequence/genetics , Carrier Proteins/genetics , Carrier Proteins/ultrastructure , DNA/genetics , DNA, Z-Form , G-Quadruplexes , Humans , Leucine Zippers/genetics , Nucleoproteins/genetics , Nucleoproteins/ultrastructure , RNA/chemistry , Zinc Fingers/genetics
14.
BioTech (Basel) ; 10(4)2021 Sep 22.
Article in English | MEDLINE | ID: mdl-35822794

ABSTRACT

G-quadruplexes are four-stranded nucleic acid structures occurring in the genomes of all living organisms and viruses. It is increasingly evident that these structures play important molecular roles, generally by modulating gene expression and overall genome integrity. For a long period, G-quadruplexes have been studied specifically in the context of human promoters, telomeres, and associated diseases (cancers, neurological disorders). Several G-quadruplex-binding proteins are known, providing promising targets for influencing G-quadruplex-related processes in organisms. Nonetheless, in plants, only a small number of G-quadruplex binding proteins have been described to date. Thus, we aimed to bioinformatically inspect the available protein sequences to find the best protein candidates with the potential to bind G-quadruplexes. Two similar glycine and arginine-rich G-quadruplex-binding motifs were described in humans. The first is the so-called "RGG motif" (RRGDGRRRGGGGRGQGGRGRGGGFKG), and the second, which has been recently described, is known as the "NIQI motif" (RGRGRGRGGGSGGSGGRGRG). Using this general knowledge, we searched for plant proteins containing the above-mentioned motifs, using two independent approaches (BLASTp and FIMO scanning), and revealed many proteins containing the G4-binding motif(s). Our research also revealed the core proteins involved in G4 folding and resolving in green plants, algae, and the key plant model organism, Arabidopsis thaliana. The discovered protein candidates were annotated using STRINGdb and sorted by their molecular and physiological roles in simple schemes. Our results point to the significant role of G4-binding proteins in the regulation of gene expression in plants.

15.
Brief Bioinform ; 22(2): 1338-1345, 2021 03 22.
Article in English | MEDLINE | ID: mdl-33341900

ABSTRACT

SARS-CoV-2 is an intensively investigated virus from the order Nidovirales (Coronaviridae family) that causes COVID-19 disease in humans. Through enormous scientific effort, thousands of viral strains have been sequenced to date, thereby creating a strong background for deep bioinformatics studies of the SARS-CoV-2 genome. In this study, we inspected high-frequency mutations of SARS-CoV-2 and carried out systematic analyses of their overlay with inverted repeat (IR) loci and CpG islands. The main conclusion of our study is that SARS-CoV-2 hot-spot mutations are significantly enriched within both IRs and CpG island loci. This points to their role in genomic instability and may predict further mutational drive of the SARS-CoV-2 genome. Moreover, CpG islands are strongly enriched upstream from viral ORFs and thus could play important roles in transcription and the viral life cycle. We hypothesize that hypermethylation of these loci will decrease the transcription of viral ORFs and could therefore limit the progression of the disease.
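The CpG island concept used above can be sketched with the classic sliding-window criteria of GC content above 50% and an observed/expected CpG ratio above 0.6 in a window of at least 200 nt. The abstract does not say which criteria or tool the authors applied, so the defaults below are assumptions for illustration, as are the function names. Sequences are assumed to be given in the DNA alphabet, as viral genomes usually are in sequence databases.

```python
# Illustrative sketch only; thresholds are the commonly cited defaults,
# not parameters confirmed by the study.

def cpg_stats(seq):
    """Return (GC content, observed/expected CpG ratio) for a sequence."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cg = seq.count("CG")
    gc_content = (c + g) / n if n else 0.0
    # Expected CpG count under independence is (C count * G count) / length.
    obs_exp = (cg * n) / (c * g) if c and g else 0.0
    return gc_content, obs_exp


def cpg_windows(seq, window=200, step=1, min_gc=0.5, min_oe=0.6):
    """Return start positions of windows that satisfy both CpG island criteria."""
    hits = []
    for start in range(0, len(seq) - window + 1, step):
        gc, oe = cpg_stats(seq[start:start + window])
        if gc > min_gc and oe > min_oe:
            hits.append(start)
    return hits
```

Overlapping windows that pass would normally be merged into island intervals before testing whether mutation sites fall inside them; that step is omitted here.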


Subject(s)
COVID-19/virology , CpG Islands , Mutation , SARS-CoV-2/genetics , DNA Methylation , Genome, Viral , Humans , Protein Binding
16.
Front Microbiol ; 11: 1583, 2020.
Article in English | MEDLINE | ID: mdl-32719673

ABSTRACT

Non-canonical nucleic acid structures play important roles in the regulation of molecular processes. Considering the importance of the ongoing coronavirus crisis, we decided to evaluate genomes of all coronaviruses sequenced to date (stated more broadly, the order Nidovirales) to determine if they contain non-canonical nucleic acid structures. We discovered abundant evidence of putative G-quadruplex sites and even more of inverted repeat (IR) loci, which in fact are ubiquitous along the whole genomic sequence and indicate a possible mechanism for genomic RNA packaging. The most notable enrichment of IRs was found inside the 5'UTR for IRs of size 12+ nucleotides, and the most notable enrichment of putative quadruplex sites (PQSs) was located before the 3'UTR, inside the 5'UTR, and before mRNA. This indicates crucial regulatory roles for both IRs and PQSs. Moreover, we found multiple G-quadruplex binding motifs in human proteins having potential for binding of SARS-CoV-2 RNA. Non-canonical nucleic acid structures in Nidovirales and in the novel SARS-CoV-2 are therefore promising druggable structures that can be targeted and utilized in the future.

17.
Int J Mol Sci ; 21(1)2019 Dec 18.
Article in English | MEDLINE | ID: mdl-31861340

ABSTRACT

The p53 family of transcription factors plays key roles in development, genome stability, senescence and tumor development, and p53 is the most important tumor suppressor protein in humans. Although intensively investigated for many years, its initial evolutionary history is not yet fully elucidated. Using bioinformatic and structure prediction methods on current databases containing newly-sequenced genomes and transcriptomes, we present a detailed characterization of p53 family homologs in remote members of the Holozoa group, in the unicellular clades Filasterea, Ichthyosporea and Corallochytrea. Moreover, we show that these newly characterized homologous sequences contain domains that can form structures with high similarity to the human p53 family DNA-binding domain, and some also show similarities to the oligomerization and SAM domains. The presence of these remote homologs demonstrates an ancient origin of the p53 protein family.


Subject(s)
Eukaryota/genetics , Evolution, Molecular , Multigene Family , Sequence Homology, Amino Acid , Tumor Suppressor Protein p53/chemistry , Tumor Suppressor Protein p53/genetics , Amino Acid Sequence , Databases, Genetic , Eukaryota/classification , Exons , Introns , Models, Molecular , Phylogeny , Protein Conformation , Protein Interaction Domains and Motifs , Tumor Suppressor Protein p53/metabolism
18.
Molecules ; 24(9)2019 May 02.
Article in English | MEDLINE | ID: mdl-31052562

ABSTRACT

The role of local DNA structures in the regulation of basic cellular processes is an emerging field of research. Amongst local non-B DNA structures, the significance of G-quadruplexes was demonstrated in the last decade, and their presence and functional relevance have been demonstrated in many genomes, including that of humans. In this study, we analyzed the presence and locations of G-quadruplex-forming sequences by G4Hunter in all complete bacterial genomes available in the NCBI database. G-quadruplex-forming sequences were identified in all species; however, the frequency differed significantly across evolutionary groups. The highest frequency of G-quadruplex-forming sequences was detected in the subgroup Deinococcus-Thermus, and the lowest frequency in Thermotogae. G-quadruplex-forming sequences are non-randomly distributed and are favored in various evolutionary groups. G-quadruplex-forming sequences are enriched in ncRNA segments followed by mRNAs. Analyses of surrounding sequences showed G-quadruplex-forming sequences around tRNA and regulatory sequences. These data point to the unique and non-random localization of G-quadruplex-forming sequences in bacterial genomes.
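The scoring idea behind G4Hunter can be sketched in a few lines of Python: each base in a run of G gets a positive score equal to the run length (capped at 4), each base in a run of C gets the corresponding negative score, other bases score 0, and the mean score is taken over a sliding window. This is a simplified re-implementation for illustration, not the G4Hunter tool itself; the window of 25 and threshold of 1.2 are commonly used settings assumed here, since the abstract does not state the parameters.

```python
import re


def g4hunter_scores(seq):
    """Per-base scores: G runs positive, C runs negative, magnitude capped at 4."""
    seq = seq.upper()
    scores = [0] * len(seq)
    for match in re.finditer(r"G+|C+", seq):
        value = min(len(match.group()), 4)
        if match.group()[0] == "C":
            value = -value
        for i in range(match.start(), match.end()):
            scores[i] = value
    return scores


def g4hunter_windows(seq, window=25, threshold=1.2):
    """Return (start, mean_score) for windows whose absolute mean score passes the threshold."""
    scores = g4hunter_scores(seq)
    hits = []
    for start in range(len(scores) - window + 1):
        mean = sum(scores[start:start + window]) / window
        if abs(mean) >= threshold:
            hits.append((start, round(mean, 3)))
    return hits
```

A negative mean indicates that the G-rich run lies on the complementary strand. Merging overlapping windows into discrete sequences, as a full analysis would do, is left out of this sketch.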


Subject(s)
Bacteria/genetics , DNA, Bacterial/chemistry , G-Quadruplexes , Genome, Bacterial , Humans , Nucleic Acid Conformation , Phylogeny
19.
Molecules ; 23(9)2018 Sep 13.
Article in English | MEDLINE | ID: mdl-30216987

ABSTRACT

The importance of local DNA structures in the regulation of basic cellular processes is an emerging field of research. Amongst local non-B DNA structures, G-quadruplexes are perhaps the best characterized to date, and their presence has been demonstrated in many genomes, including that of humans. G-quadruplexes are selectively bound by many regulatory proteins. In this paper, we have analyzed the amino acid composition of all seventy-seven described G-quadruplex binding proteins of Homo sapiens. Our comparison with amino acid frequencies in all human proteins and specific protein subsets (e.g., all nucleic acid binding) revealed unique features of quadruplex binding proteins, with prominent enrichment for glycine (G) and arginine (R). Cluster analysis with bootstrap resampling shows similarities and differences in amino acid composition of particular quadruplex binding proteins. Interestingly, we found that all characterized G-quadruplex binding proteins share a 20 amino acid long motif/domain (RGRGR GRGGG SGGSG GRGRG) which is similar to the previously described RG-rich domain (RRGDG RRRGG GGRGQ GGRGR GGGFKG) of the FMR1 G-quadruplex binding protein. Based on this protein fingerprint, we have predicted a new set of potential G-quadruplex binding proteins sharing this interesting domain rich in glycine and arginine residues.
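The composition comparison described above amounts to computing per-residue frequencies in a protein set and dividing by the frequencies in a background set. A minimal Python sketch follows, using the 20-residue motif quoted in the abstract as input; the function names are hypothetical, and any background frequencies passed in would have to come from a real proteome rather than from this example.

```python
from collections import Counter

# 20-residue motif quoted in the abstract, written without the spacing.
MOTIF = "RGRGRGRGGGSGGSGGRGRG"


def aa_frequencies(protein):
    """Return the relative frequency of each amino acid in a sequence."""
    counts = Counter(protein.upper())
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


def fold_enrichment(subset_freqs, background_freqs):
    """Return subset/background frequency ratios for residues present in the background."""
    return {
        aa: subset_freqs.get(aa, 0.0) / bg
        for aa, bg in background_freqs.items()
        if bg > 0
    }
```

For the motif alone, `aa_frequencies(MOTIF)` gives 60% glycine, 30% arginine and 10% serine, which shows how strongly such a segment skews a protein's composition toward G and R.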


Subject(s)
DNA-Binding Proteins/chemistry , DNA-Binding Proteins/metabolism , DNA/chemistry , Amino Acid Motifs , DNA/metabolism , G-Quadruplexes , Humans , Nucleic Acid Conformation , Protein Interaction Maps
20.
Biochimie ; 150: 70-75, 2018 Jul.
Article in English | MEDLINE | ID: mdl-29733879

ABSTRACT

Quadruplexes are noncanonical DNA structures that arise in guanine rich loci and have important biological functions. Classically, quadruplexes contain four stacked intramolecular G-tetrads. Surprisingly, although some algorithms allow searching for longer than 4G tracts for quadruplex formation, these have not yet been systematically studied. Therefore, we analyzed the human genome for sequences that are predicted to adopt stacked intramolecular G-tetrads with greater than four stacks. The data provide evidence for numerous G-quadruplexes that contain five or six stacked intramolecular G-tetrads. These sequences are predominantly found in known gene regulatory regions. Electrophoretic mobility assays and circular dichroism spectroscopy indicate that these sequences form quadruplex structures in vitro under physiological conditions. The localization and in vitro stability of these G-quadruplexes indicate their potentially important roles in gene regulation and their potential for therapeutic applications.
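A search for candidate sequences with more than four stacked tetrads can be approximated with a regular expression that requires four G-tracts of at least five guanines separated by short loops. The pattern below is a sketch under assumptions: the abstract does not specify the search algorithm, and the loop range of 1 to 7 nt and the function names are choices made here for illustration.

```python
import re


def quadruplex_pattern(min_tract=5, max_loop=7):
    """Compile a pattern for four G-tracts of at least min_tract Gs with loops of 1..max_loop nt."""
    tract = "G{%d,}" % min_tract
    loop = "[ACGT]{1,%d}?" % max_loop
    return re.compile("(?:%s%s){3}%s" % (tract, loop, tract))


def find_long_quadruplexes(seq, min_tract=5, max_loop=7):
    """Return (start, end, sequence) for non-overlapping matches on the given strand."""
    pattern = quadruplex_pattern(min_tract, max_loop)
    return [(m.start(), m.end(), m.group()) for m in pattern.finditer(seq.upper())]
```

Only the strand passed in is scanned, so C-rich occurrences would require running the same search on the reverse complement. Setting `min_tract=6` restricts the search to candidates with six stacked tetrads.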


Subject(s)
Computational Biology/methods , G-Quadruplexes , Circular Dichroism , Nucleic Acid Conformation