Results 1 - 6 of 6
1.
Science ; 380(6645): eadd6142, 2023 05 12.
Article in English | MEDLINE | ID: mdl-37167382

ABSTRACT

Aridoamerica and Mesoamerica are two distinct cultural areas in northern and central Mexico, respectively, that hosted numerous pre-Hispanic civilizations between 2500 BCE and 1521 CE. The division between these regions shifted southward because of severe droughts ~1100 years ago, which allegedly drove a population replacement in central Mexico by Aridoamerican peoples. In this study, we present shotgun genome-wide data from 12 individuals and 27 mitochondrial genomes from eight pre-Hispanic archaeological sites across Mexico, including two at the shifting border of Aridoamerica and Mesoamerica. We find population continuity that spans the climate change episode and a broad preservation of the genetic structure across present-day Mexico for the past 2300 years. Lastly, we identify a contribution to pre-Hispanic populations of northern and central Mexico from two ancient unsampled "ghost" populations.


Subject(s)
Genetic Structures , Hispanic or Latino , Humans , History, Ancient , Mexico , Population Dynamics
2.
J Biomed Semantics ; 10(1): 8, 2019 05 22.
Article in English | MEDLINE | ID: mdl-31118102

ABSTRACT

BACKGROUND: The ability to express the same meaning in different ways is a well-known property of natural language. This amazing property is the source of major difficulties in natural language processing. Given the constant increase in published literature, its curation and information extraction would strongly benefit from efficient automatic processes, for which corpora of sentences evaluated by experts are a valuable resource. RESULTS: Given our interest in applying such approaches to the benefit of curation of the biomedical literature, specifically that about gene regulation in microbial organisms, we decided to build a corpus with graded textual similarity, evaluated by curators and designed specifically for our purposes. Based on the predefined statistical power of future analyses, we defined features of the design, including sampling, selection criteria, balance, and size, among others. A non-fully crossed study design was applied. Each pair of sentences was evaluated by 3 annotators from a total of 7; the scale used in the semantic similarity assessment task within the Semantic Evaluation workshop (SEMEVAL) was adapted to our goals in four successive iterative sessions with clear improvements in the agreed guidelines and interrater reliability results. Alternatives for such a corpus evaluation have been widely discussed. CONCLUSIONS: To the best of our knowledge, this is the first similarity corpus (a dataset of pairs of sentences for which human experts rate the semantic similarity of each pair) in this domain of knowledge. We have initiated its incorporation into our research towards high-throughput curation strategies based on natural language processing.


Subject(s)
Gene Expression Regulation , Microbiology , Natural Language Processing , Transcription, Genetic/genetics
3.
Nucleic Acids Res ; 47(D1): D212-D220, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30395280

ABSTRACT

RegulonDB, first published 20 years ago, is a comprehensive electronic resource about regulation of transcription initiation of Escherichia coli K-12 with decades of knowledge from classic molecular biology experiments, and recently also from high-throughput genomic methodologies. We curated the literature to keep RegulonDB up to date, and initiated curation of ChIP and gSELEX experiments. We estimate that current knowledge describes between 10% and 30% of the expected total number of transcription factor-gene regulatory interactions in E. coli. RegulonDB provides datasets for interactions for which there is no evidence that they affect expression, as well as expression datasets. We developed a proof of concept pipeline to merge binding and expression evidence to identify regulatory interactions. These datasets can be visualized in the RegulonDB JBrowse. We developed the Microbial Conditions Ontology with a controlled vocabulary for the minimal properties to reproduce an experiment, which helps integrate data from high-throughput and classic literature. At a higher level of integration, we report Genetic Sensory-Response Units for 200 transcription factors, including their regulation at the metabolic level, and include summaries for 70 of them. Finally, we summarize our research with natural language processing strategies to enhance our biocuration work.


Subject(s)
Computational Biology/methods , Escherichia coli K12/genetics , Gene Expression Regulation, Bacterial , Genomics , Gene Ontology , Gene Regulatory Networks , Genomics/methods , High-Throughput Nucleotide Sequencing
4.
BMC Biol ; 16(1): 91, 2018 08 16.
Article in English | MEDLINE | ID: mdl-30115066

ABSTRACT

BACKGROUND: Our understanding of the regulation of gene expression has benefited from the availability of high-throughput technologies that interrogate the whole genome for the binding of specific transcription factors and gene expression profiles. In the case of widely used model organisms, such as Escherichia coli K-12, the new knowledge gained from these approaches needs to be integrated with the legacy of accumulated knowledge from genetic and molecular biology experiments conducted in the pre-genomic era in order to attain the deepest level of understanding possible based on the available data. RESULTS: In this paper, we describe an expansion of RegulonDB, the database containing the rich legacy of decades of classic molecular biology experiments supporting what we know about gene regulation and operon organization in E. coli K-12, to include the genome-wide dataset collections from 32 ChIP and 19 gSELEX publications, in addition to around 60 genome-wide expression profiles relevant to the functional significance of these datasets and used in their curation. Three essential features for the integration of this information coming from different methodological approaches are: first, a controlled vocabulary within an ontology for precisely defining growth conditions; second, the criteria to separate elements with enough evidence to consider them involved in gene regulation from isolated transcription factor binding sites without such support; and third, an expanded computational model supporting this knowledge. Altogether, this constitutes the basis for adequately gathering and enabling the comparisons and integration needed to manage and access such wealth of knowledge. CONCLUSIONS: This version 10.0 of RegulonDB is a first step toward what should become the unifying access point for current and future knowledge on gene regulation in E. coli K-12. Furthermore, this model platform and associated methodologies and criteria can be emulated for gathering knowledge on other microbial organisms.


Subject(s)
Databases as Topic , Escherichia coli K12/genetics , Gene Expression Regulation, Bacterial , Transcription, Genetic
5.
Nucleic Acids Res ; 45(D1): D543-D550, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899573

ABSTRACT

EcoCyc (EcoCyc.org) is a freely accessible, comprehensive database that collects and summarizes experimental data for Escherichia coli K-12, the best-studied bacterial model organism. New experimental discoveries about gene products, their function and regulation, new metabolic pathways, enzymes and cofactors are regularly added to EcoCyc. New SmartTable tools allow users to browse collections of related EcoCyc content. SmartTables can also serve as repositories for user- or curator-generated lists. EcoCyc now supports running and modifying E. coli metabolic models directly on the EcoCyc website.


Subject(s)
Computational Biology/methods , Databases, Genetic , Escherichia coli K12/genetics , Escherichia coli K12/metabolism , Energy Metabolism , Escherichia coli Proteins/genetics , Escherichia coli Proteins/metabolism , Gene Expression Regulation, Bacterial , Metabolic Networks and Pathways , Signal Transduction , Software , Transcription Factors/metabolism , Web Browser
6.
Mol Biosyst ; 8(11): 2932-6, 2012 Nov.
Article in English | MEDLINE | ID: mdl-22907621

ABSTRACT

In the present work we study the transcription-factor regulatory network that controls the synthesis of flagella in E. coli. Our objective is to address how the transcription-factor dynamics (in terms of their promoter activities and associated rates) correlate with their positions in the hierarchical organization of this regulatory network. Our results suggest that global-regulator promoters express at higher rates than those of local regulators, particularly when the bacterial populations are actively growing. Furthermore, promoter activity decreases together with the rate of cellular division. And finally, local-regulator promoters reach their maximal activity later than global-regulator promoters do. In summary, our results suggest a strong correlation between promoter activities and their hierarchical organization in this particular regulatory network.


Subject(s)
Escherichia coli/metabolism , Transcription Factors/metabolism , Models, Theoretical , Promoter Regions, Genetic/genetics , Transcription Factors/genetics