Search | BVS Regional Portal (test)

UniBind: maps of high-confidence direct TF-DNA interactions across nine species.

Puig, Rafael Riudavets; Boddie, Paul; Khan, Aziz; Castro-Mondragon, Jaime Abraham; Mathelier, Anthony.

BMC Genomics ; 22(1): 482, 2021 Jun 26.

Article in English | MEDLINE | ID: mdl-34174819

ABSTRACT

BACKGROUND: Transcription factors (TFs) bind specifically to TF binding sites (TFBSs) at cis-regulatory regions to control transcription. It is critical to locate these TF-DNA interactions to understand transcriptional regulation. Efforts to predict bona fide TFBSs benefit from the availability of experimental data mapping DNA binding regions of TFs (chromatin immunoprecipitation followed by sequencing, ChIP-seq). RESULTS: In this study, we processed ~10,000 public ChIP-seq datasets from nine species to provide high-quality TFBS predictions. After quality control, this processing yielded ~56 million predicted TFBSs with experimental and computational support for direct TF-DNA interactions for 644 TFs in >1000 cell lines and tissues. These TFBSs were used to predict >197,000 cis-regulatory modules representing clusters of binding events in the corresponding genomes. The high quality of the TFBSs was reinforced by their evolutionary conservation, enrichment at active cis-regulatory regions, and capacity to predict combinatorial binding of TFs. Further, we confirmed that the cell type and tissue specificity of enhancer activity was correlated with the number of TFs with binding sites predicted in these regions. All data are provided to the community through the UniBind database, which can be accessed through its web interface (https://unibind.uio.no/), a dedicated RESTful API, and as genomic tracks. Finally, we provide an enrichment tool, available as a web service and an R package, for users to find TFs with enriched TFBSs in a set of provided genomic regions. CONCLUSIONS: UniBind is the first resource of its kind, providing the largest collection of high-confidence direct TF-DNA interactions in nine species.

Subjects

Chromatin Immunoprecipitation Sequencing, DNA, Binding Sites, Chromatin Immunoprecipitation, Computational Biology, DNA/metabolism, Protein Binding
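
The enrichment idea mentioned in the abstract above (finding TFs whose TFBSs are over-represented in a set of user-provided regions) can be illustrated with a one-sided hypergeometric test. This is a minimal sketch under stated assumptions, not the UniBind enrichment tool itself: the function name `hypergeom_enrichment_pvalue` and its parameters are hypothetical, and the abstract does not say which statistical test the tool uses.

```python
from math import comb


def hypergeom_enrichment_pvalue(k, n, K, N):
    """One-sided P(X >= k) for X ~ Hypergeometric(N, K, n).

    Hypothetical parameters (not taken from the UniBind documentation):
      N: total number of regions in the background universe
      K: universe regions overlapping at least one TFBS of the TF
      n: number of regions in the query set (a subset of the universe)
      k: query regions overlapping at least one TFBS of the TF
    """
    if not (0 <= K <= N and 0 <= n <= N and 0 <= k <= min(K, n)):
        raise ValueError("inconsistent counts")
    total = comb(N, n)
    upper = min(K, n)
    # Sum the probability mass of observing k or more overlapping regions.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Toy numbers: 40 of 1000 universe regions carry the TF's sites,
# and 12 of 50 query regions do.
p_value = hypergeom_enrichment_pvalue(k=12, n=50, K=40, N=1000)
```

A small `p_value` suggests that the query set overlaps the TF's binding sites more often than expected by chance. In practice one such test is run per TF or per dataset, so multiple-testing correction would be needed.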

RhizoBindingSites, a Database of DNA-Binding Motifs in Nitrogen-Fixing Bacteria Inferred Using a Footprint Discovery Approach.

Taboada-Castro, Hermenegildo; Castro-Mondragón, Jaime Abraham; Aguilar-Vera, Alejandro; Hernández-Álvarez, Alfredo José; van Helden, Jacques; Encarnación-Guevara, Sergio.

Front Microbiol ; 11: 567471, 2020.

Article in English | MEDLINE | ID: mdl-33250866

ABSTRACT

Basic knowledge of transcriptional regulation is needed to understand the mechanisms governing biological processes, e.g., nitrogen fixation by Rhizobiales bacteria in symbiosis with leguminous plants. The RhizoBindingSites database is a computer-assisted framework providing motif-gene-associated conserved sequences potentially implicated in transcriptional regulation in nine symbiotic species. A dyad analysis algorithm was used to deduce motifs in the upstream regulatory region of orthologous genes, and only motifs also located in the gene seed promoter with a p-value of 1e-4 were accepted. A genomic scan analysis of the upstream sequences with these motifs was performed. These predicted binding sites were categorized according to low, medium and high homology between the matrix and the upstream regulatory sequence. On average, 62.7% of the genes had a motif, accounting for 80.44% of the genes per genome, with 19,613 matrices (a matrix is a representation of a motif). The RhizoBindingSites database provides motif and gene information, motif conservation in the order Rhizobiales, matrices, motif logos, regulatory networks constructed from theoretical or experimental data, a criterion for selecting motifs and a guide for users. The RhizoBindingSites database is freely available online at rhizobindingsites.ccg.unam.mx.
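
To make the matrix-based scan described in this abstract concrete, the following is a minimal sketch of scoring an upstream sequence with a position weight matrix. It assumes a uniform background, a pseudocount of 1, forward-strand scanning only, and a raw score threshold rather than the p-value cut-off the authors report. The names `counts_to_log_odds` and `scan` are hypothetical and are not part of RhizoBindingSites or RSAT.

```python
import math

BASES = "ACGT"


def counts_to_log_odds(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2-odds weights."""
    pwm = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        pwm.append({
            b: math.log2((column[b] + pseudocount) / total / background)
            for b in BASES
        })
    return pwm


def scan(sequence, pwm, threshold):
    """Return (offset, site, score) for each window scoring >= threshold."""
    width = len(pwm)
    seq = sequence.upper()
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing N or other ambiguity codes
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


# Toy motif built from eight identical sites "TGCA" (illustrative data only).
demo_counts = [
    {"A": 0, "C": 0, "G": 0, "T": 8},
    {"A": 0, "C": 0, "G": 8, "T": 0},
    {"A": 0, "C": 8, "G": 0, "T": 0},
    {"A": 8, "C": 0, "G": 0, "T": 0},
]
demo_pwm = counts_to_log_odds(demo_counts)
demo_hits = scan("AATGCATT", demo_pwm, threshold=6.0)
```

Binning the resulting scores into ranges would give a rough analogue of the low, medium and high homology categories mentioned above, although the actual thresholds used by the database are not stated in the abstract.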

RSAT matrix-clustering: dynamic exploration and redundancy reduction of transcription factor binding motif collections.

Castro-Mondragon, Jaime Abraham; Jaeger, Sébastien; Thieffry, Denis; Thomas-Chollier, Morgane; van Helden, Jacques.

Nucleic Acids Res ; 45(13): e119, 2017 Jul 27.

Article in English | MEDLINE | ID: mdl-28591841

ABSTRACT

Transcription factor (TF) databases contain multitudes of binding motifs (TFBMs) from various sources, from which non-redundant collections are derived by manual curation. The advent of high-throughput methods stimulated the production of novel collections with increasing numbers of motifs. Meta-databases, built by merging these collections, contain redundant versions, because available tools are not suited to automatically identify and explore biologically relevant clusters among thousands of motifs. Motif discovery from genome-scale data sets (e.g. ChIP-seq) also produces redundant motifs, hampering the interpretation of results. We present matrix-clustering, a versatile tool that clusters similar TFBMs into multiple trees, and automatically creates non-redundant TFBM collections. A feature unique to matrix-clustering is its dynamic visualisation of aligned TFBMs, and its capability to simultaneously treat multiple collections from various sources. We demonstrate that matrix-clustering considerably simplifies the interpretation of combined results from multiple motif discovery tools, and highlights biologically relevant variations of similar motifs. We also ran a large-scale application to cluster ∼11 000 motifs from 24 entire databases, showing that matrix-clustering correctly groups motifs belonging to the same TF families, and drastically reduces motif redundancy. matrix-clustering is integrated within the RSAT suite (http://rsat.eu/), accessible through a user-friendly web interface or command-line for its integration in pipelines.

Subjects

Protein Databases/statistics & numerical data, Transcription Factors/chemistry, Transcription Factors/metabolism, Algorithms, Amino Acid Motifs, Amino Acid Sequence, Animals, Binding Sites/genetics, Chromatin Immunoprecipitation, Cluster Analysis, High-Throughput Nucleotide Sequencing, Humans, Mice, Protein Binding, Protein Sequence Analysis, Transcription Factors/genetics
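
The general idea of grouping redundant motifs, as described in the abstract above, can be sketched with a small agglomerative clustering routine. This is an illustration under strong simplifying assumptions and not the RSAT matrix-clustering algorithm: all motifs are assumed to have the same width, no alignment offsets or reverse complements are considered, similarity is a plain Pearson correlation over flattened matrices, and the average-linkage rule and the `min_similarity` cut-off are hypothetical choices.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length numeric vectors."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a flat matrix carries no signal to correlate
    return cov / (vx * vy) ** 0.5


def flatten(motif):
    """Turn a list of [A, C, G, T] columns into one flat vector."""
    return [value for column in motif for value in column]


def cluster_motifs(motifs, min_similarity=0.8):
    """Average-linkage agglomerative clustering of equal-width motifs.

    motifs: dict mapping a motif name to a list of [A, C, G, T] columns.
    Returns a list of clusters, each a list of motif names.
    """
    vectors = {name: flatten(m) for name, m in motifs.items()}
    clusters = [[name] for name in motifs]

    def linkage(c1, c2):
        return mean(pearson(vectors[a], vectors[b]) for a in c1 for b in c2)

    while len(clusters) > 1:
        # Find the most similar pair of clusters.
        (i, j), best = max(
            (((i, j), linkage(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best < min_similarity:
            break  # remaining clusters are too dissimilar to merge
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


# Toy collection: two near-identical "ACG" motifs and one unrelated "TTT" motif.
demo_motifs = {
    "acg_a": [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]],
    "acg_b": [[0.9, 0.1, 0, 0], [0, 0.9, 0.1, 0], [0, 0, 0.9, 0.1]],
    "ttt": [[0, 0, 0, 1], [0, 0, 0, 1], [0, 0, 0, 1]],
}
demo_clusters = cluster_motifs(demo_motifs)
```

Keeping one representative per cluster would yield a non-redundant collection in the spirit of the tool, though a realistic implementation also has to align motifs of different widths and orientations before comparing them.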

RegulonDB version 9.0: high-level integration of gene regulation, coexpression, motif clustering and beyond.

Gama-Castro, Socorro; Salgado, Heladia; Santos-Zavaleta, Alberto; Ledezma-Tejeida, Daniela; Muñiz-Rascado, Luis; García-Sotelo, Jair Santiago; Alquicira-Hernández, Kevin; Martínez-Flores, Irma; Pannier, Lucia; Castro-Mondragón, Jaime Abraham; Medina-Rivera, Alejandra; Solano-Lira, Hilda; Bonavides-Martínez, César; Pérez-Rueda, Ernesto; Alquicira-Hernández, Shirley; Porrón-Sotelo, Liliana; López-Fuentes, Alejandra; Hernández-Koutoucheva, Anastasia; Del Moral-Chávez, Víctor; Rinaldi, Fabio; Collado-Vides, Julio.

Nucleic Acids Res ; 44(D1): D133-43, 2016 Jan 04.

Article in English | MEDLINE | ID: mdl-26527724

ABSTRACT

RegulonDB (http://regulondb.ccg.unam.mx) is one of the most useful and important resources on bacterial gene regulation, as it integrates the scattered scientific knowledge of the best-characterized organism, Escherichia coli K-12, in a database that organizes large amounts of data. Its electronic format enables researchers to compare their results with the legacy of previous knowledge and supports bioinformatics tools and model building. Here, we summarize our progress with RegulonDB since our last Nucleic Acids Research publication describing RegulonDB, in 2013. In addition to keeping curation up to date, we report a collection of 232 interactions with small RNAs affecting 192 genes, and the complete repertoire of 189 Elementary Genetic Sensory-Response units (GENSOR units), integrating the signal, regulatory interactions, and metabolic pathways they govern. These additions represent major progress to a higher level of understanding of regulated processes. We have updated the computationally predicted transcription factors, which total 304 (184 with experimental evidence and 120 from computational predictions); we updated our position-weight matrices and have included tools for clustering them in evolutionary families. We describe our semiautomatic strategy to accelerate curation, including datasets from high-throughput experiments, a novel coexpression distance to search for 'neighborhood' genes to known operons and regulons, and computational developments.

Subjects

Genetic Databases, Escherichia coli K12/genetics, Bacterial Gene Expression Regulation, Regulon, Cluster Analysis, Escherichia coli K12/metabolism, Gene Regulatory Networks, Operon, Position-Specific Scoring Matrices, Small Untranslated RNA/metabolism, Transcription Factors/classification
