Search | VHL Regional Portal

Structure, clustering and functional insights of repeats configurations in the upstream promoter region of the human coding genes.

Tobar-Tosse, Fabian; Vélez, Patricia E; Ocampo-Toro, Eliana; Moreno, Pedro A.

BMC Genomics ; 19(Suppl 8): 862, 2018 Dec 11.

Article in English | MEDLINE | ID: mdl-30537933

ABSTRACT

BACKGROUND: Repetitive DNA sequences (Repeats) are significant regions of the human genome that have a specific genomic distribution, structure, and several binding sites for genome architecture and function. Consequently, the possible configurations of Repeats in specific and dynamic regions such as gene promoters could define footprints for molecular mechanisms, pathways, and cell function beyond their density in the genome. Here we explored the distribution of Repeats in the upstream promoter region of human coding genes, aiming to identify specific configurations, clusters, and the functional meaning of those elements. Our method includes structural descriptions, hierarchical clustering, pathway association, and functional enrichment analysis. RESULTS: We report several configurations of Repeats in the upstream promoter region (UPR), which define 2729 patterns for 80% of human coding genes. There are 47 types of Repeats in these configurations, of which the most frequent were Alu, Low_complexity, MIR, Simple_repeat, LINE/L2, LINE/L1, hAT-Charlie, and ERV1. The distribution, length, and high frequency of Repeats in the UPR define several patterns and clusters, where the minimum frequency of configuration among Repeats was higher than 0.7. We found those clusters to be associated with cellular pathways and ontologies; thus, it was plausible to link groups of Repeats to specific functional insights. For example, pathways for Genetic Information Processing or Metabolism show particular groups of Repeats with specific configurations. CONCLUSION: Based on these findings, we propose that specific configurations of repetitive elements describe frequent patterns in the upstream promoter for sets of human coding genes, and that these patterns correlate with specific and essential cell pathways and functions.

Subject(s)

Algorithms , Genome, Human , Open Reading Frames , Promoter Regions, Genetic , Repetitive Sequences, Nucleic Acid , Cluster Analysis , Gene Ontology , Humans
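
The notion of a Repeat "configuration" in the abstract above can be illustrated with a minimal sketch. It assumes a configuration is simply the ordered tuple of repeat types found in a gene's UPR, which may differ from the authors' definition. The gene names, offsets, and the functions `configurations` and `pattern_counts` are hypothetical and are not taken from the article.

```python
from collections import Counter, defaultdict

# Hypothetical annotations: (gene, repeat type, start offset within the UPR).
annotations = [
    ("GENE_A", "Alu", 120), ("GENE_A", "MIR", 640),
    ("GENE_B", "Alu", 75), ("GENE_B", "MIR", 910),
    ("GENE_C", "LINE/L2", 300),
]

def configurations(records):
    """Map each gene to the tuple of its repeat types, ordered by position."""
    by_gene = defaultdict(list)
    for gene, repeat_type, start in records:
        by_gene[gene].append((start, repeat_type))
    return {
        gene: tuple(repeat_type for _, repeat_type in sorted(hits))
        for gene, hits in by_gene.items()
    }

def pattern_counts(configs):
    """Count how many genes share each configuration (assumed 'pattern')."""
    return Counter(configs.values())

configs = configurations(annotations)
patterns = pattern_counts(configs)
```

Here `GENE_A` and `GENE_B` share the configuration `("Alu", "MIR")`, so they count as one pattern with two genes. Per-gene configurations of this kind could then be passed to a clustering step, although the abstract does not say how that step was implemented.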

Exploration of noncoding sequences in metagenomes.

Tobar-Tosse, Fabián; Rodríguez, Adrián C; Vélez, Patricia E; Zambrano, María M; Moreno, Pedro A.

PLoS One ; 8(3): e59488, 2013.

Article in English | MEDLINE | ID: mdl-23536879

ABSTRACT

Environment-dependent genomic features have been defined for different metagenomes, whose genes and their associated processes are related to specific environments. Identification of ORFs and their functional categories is the most common method for associating functional and environmental features. However, this analysis based on finding ORFs misses noncoding sequences and, therefore, some metagenome regulatory or structural information could be discarded. In this work we analyzed 23 whole metagenomes, including coding and noncoding sequences, using the following sequence patterns: (G+C) content, Codon Usage (Cd), Trinucleotide Usage (Tn), and functional assignments for ORF prediction. Herein, we present evidence of a high proportion of noncoding sequences discarded in common similarity-based methods in metagenomics, and of the kind of relevant information present in them. We found a high density of trinucleotide repeat sequences (TRS) in noncoding sequences, with a regulatory and adaptive function for metagenome communities. We present associations between trinucleotide values and gene function, where metagenome clustering correlates with microorganism adaptations and kinds of metagenomes. We propose here that noncoding sequences have relevant information to describe metagenomes that could be considered in a whole metagenome analysis in order to improve their organization, classification protocols, and their relation with the environment.

Subject(s)

Environmental Microbiology , Metagenome , Metagenomics , Cluster Analysis , Computational Biology/methods , Genome, Bacterial , Humans , Molecular Sequence Annotation , Open Reading Frames , RNA, Untranslated
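
Two of the sequence patterns named in the abstract above, (G+C) content and Trinucleotide Usage, can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: trinucleotides are counted in overlapping windows, and windows containing ambiguous bases are skipped. The function names are hypothetical.

```python
from collections import Counter

def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    counts = Counter(seq.upper())
    total = sum(counts[base] for base in "ACGT")
    if total == 0:
        return 0.0
    return (counts["G"] + counts["C"]) / total

def trinucleotide_usage(seq: str) -> dict[str, float]:
    """Relative frequency of each overlapping trinucleotide in seq.

    Assumption: overlapping windows; windows with non-ACGT symbols are skipped.
    """
    seq = seq.upper()
    valid = set("ACGT")
    counts = Counter(
        seq[i:i + 3]
        for i in range(len(seq) - 2)
        if set(seq[i:i + 3]) <= valid
    )
    total = sum(counts.values())
    return {tri: n / total for tri, n in counts.items()} if total else {}
```

Applied to coding and noncoding fractions of a metagenome separately, such profiles give one feature vector per sample that can be compared or clustered.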

The human genome: a multifractal analysis.

Moreno, Pedro A; Vélez, Patricia E; Martínez, Ember; Garreta, Luis E; Díaz, Néstor; Amador, Siler; Tischer, Irene; Gutiérrez, José M; Naik, Ashwinikumar K; Tobar, Fabián; García, Felipe.

BMC Genomics ; 12: 506, 2011 Oct 14.

Article in English | MEDLINE | ID: mdl-21999602

ABSTRACT

BACKGROUND: Several studies have shown that genomes can be studied via a multifractal formalism. Recently, we used a multifractal approach to study the genetic information content of the Caenorhabditis elegans genome. Here we investigate the possibility that the human genome shows a similar behavior to that observed in the nematode. RESULTS: We report here multifractality in the human genome sequence. This behavior correlates strongly with the presence of Alu elements and, to a lesser extent, with CpG islands and (G+C) content. In contrast, little or no relationship was found for LINE, MIR, MER, and LTR elements or for DNA regions poor in genetic information. Gene function, clusters of orthologous genes, metabolic pathways, and exons tended to increase their frequencies with ranges of multifractality, and large gene families were located in genomic regions with varied multifractality. Additionally, a multifractal map and classification for human chromosomes are proposed. CONCLUSIONS: Based on these findings, we propose a descriptive non-linear model for the structure of the human genome, with some biological implications. This model reveals 1) a multifractal regionalization where many regions coexist that are far from equilibrium and 2) a non-linear organization with significant molecular and medical genetic implications for understanding the role of Alu elements in genome stability and structure of the human genome. Given the role of Alu sequences in gene regulation, genetic diseases, human genetic diversity, adaptation and phylogenetic analyses, these quantifications are especially useful.

Subject(s)

Fractals , Genome, Human , Alu Elements , Base Composition , Chromosome Mapping , Chromosomes, Human/genetics , CpG Islands , Databases, Genetic , Discriminant Analysis , Humans , Models, Genetic , Multigene Family , Sequence Analysis, DNA
