Results 1 - 9 of 9
1.
PLoS Comput Biol ; 17(9): e1008991, 2021 09.
Article in English | MEDLINE | ID: mdl-34570758

ABSTRACT

Identification of biopolymer motifs represents a key step in the analysis of biological sequences. The MEME Suite is a widely used toolkit for comprehensive analysis of biopolymer motifs; however, these tools are poorly integrated within popular analysis frameworks like the R/Bioconductor project, creating barriers to their use. Here we present memes, an R package that provides a seamless R interface to a selection of popular MEME Suite tools. memes provides a novel "data aware" interface to these tools, enabling rapid and complex discriminative motif analysis workflows. In addition to interfacing with popular MEME Suite tools, memes leverages existing R/Bioconductor data structures to store the multidimensional data returned by MEME Suite tools for rapid data access and manipulation. Finally, memes provides data visualization capabilities to facilitate communication of results. memes is available as a Bioconductor package at https://bioconductor.org/packages/memes, and the source code can be found at github.com/snystrom/memes.


Subjects
Amino Acid Motifs , Computational Biology/methods , Nucleotide Motifs , Software , Animals , Chromatin Immunoprecipitation Sequencing/statistics & numerical data , Computational Biology/statistics & numerical data , Data Interpretation, Statistical , Humans
2.
PLoS Comput Biol ; 17(9): e1009368, 2021 09.
Article in English | MEDLINE | ID: mdl-34473698

ABSTRACT

The ChIP-seq signal of histone modifications at promoters is a good predictor of gene expression in different cellular contexts, but whether this is also true at enhancers is not clear. To address this issue, we develop quantitative models to characterize the relationship of gene expression with histone modifications at enhancers or promoters. We use embryonic stem cells (ESCs), which contain a full spectrum of active and repressed (poised) enhancers, to train predictive models. As many poised enhancers in ESCs switch towards an active state during differentiation, predictive models can also be trained on poised enhancers throughout differentiation and in development. Remarkably, we determine that histone modifications at enhancers, as well as promoters, are predictive of gene expression in ESCs and throughout differentiation and development. Importantly, we demonstrate that their contribution to the predictive models varies depending on their location in enhancers or promoters. Moreover, we use a local regression (LOESS) to normalize sequencing data from different sources, which allows us to apply predictive models trained in a specific cellular context to a different one. We conclude that the relationship between gene expression and histone modifications at enhancers is universal and different from promoters. Our study provides new insight into how histone modifications relate to gene expression based on their location in enhancers or promoters.
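The abstract above mentions LOESS normalization of sequencing data from different sources but gives no implementation details. The following is a minimal, generic sketch of one common way to do this (an MA-style correction on log-scale signals), not the authors' procedure. The function names `loess_at` and `loess_normalize`, the `frac` parameter, and the choice of tricube weights with a local linear fit are all assumptions made for illustration.

```python
import math

def tricube(u):
    """Tricube kernel: weight 1 at u = 0, falling to 0 at |u| >= 1."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0

def loess_at(xs, ys, x0, frac=0.5):
    """Evaluate a locally weighted linear fit of ys on xs at the point x0."""
    n = len(xs)
    k = max(2, math.ceil(frac * n))           # number of neighbours in the window
    dists = sorted(abs(x - x0) for x in xs)
    h = dists[k - 1] or 1e-12                 # bandwidth: distance to k-th neighbour
    w = [tricube((x - x0) / h) for x in xs]
    sw = sum(w)
    if sw == 0.0:                             # no neighbour inside the window
        return sum(ys) / n
    sx = sum(wi * x for wi, x in zip(w, xs))
    sy = sum(wi * y for wi, y in zip(w, ys))
    sxx = sum(wi * x * x for wi, x in zip(w, xs))
    sxy = sum(wi * x * y for wi, x, y in zip(w, xs, ys))
    denom = sw * sxx - sx * sx
    if abs(denom) < 1e-12:                    # degenerate window: use weighted mean
        return sy / sw
    slope = (sw * sxy - sx * sy) / denom
    intercept = (sy - slope * sx) / sw
    return intercept + slope * x0

def loess_normalize(reference, sample, frac=0.5):
    """Remove the intensity-dependent trend of `sample` relative to `reference`.

    Both inputs are log-scale signals measured over the same regions
    (an assumption of this sketch).
    """
    a = [(r + s) / 2.0 for r, s in zip(reference, sample)]   # mean signal (A)
    m = [s - r for r, s in zip(reference, sample)]           # difference (M)
    trend = [loess_at(a, m, ai, frac) for ai in a]
    return [s - t for s, t in zip(sample, trend)]
```

In this sketch a sample that differs from the reference only by a constant offset is mapped back onto the reference, since the fitted trend equals that offset everywhere.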


Subjects
Enhancer Elements, Genetic , Gene Expression , Histone Code/genetics , Models, Genetic , Promoter Regions, Genetic , Animals , Cell Differentiation/genetics , Cells, Cultured , Chromatin Immunoprecipitation Sequencing/statistics & numerical data , Computational Biology , Humans , Mice , Mouse Embryonic Stem Cells/metabolism , Regression Analysis
3.
PLoS Comput Biol ; 17(7): e1009203, 2021 07.
Article in English | MEDLINE | ID: mdl-34292930

ABSTRACT

Transcription factors (TFs) often function as a module including both master factors and mediators binding at cis-regulatory regions to modulate nearby gene transcription. ChIP-seq profiling of multiple TFs makes it feasible to infer functional TF modules. However, when inferring TF modules based on co-localization of ChIP-seq peaks, often many weak binding events are missed, especially for mediators, resulting in incomplete identification of modules. To address this problem, we develop a ChIP-seq data-driven Gibbs Sampler to infer Modules (ChIP-GSM) using a Bayesian framework that integrates ChIP-seq profiles of multiple TFs. ChIP-GSM samples read counts of module TFs iteratively to estimate the binding potential of a module to each region and, across all regions, estimates the module abundance. Using inferred module-region probabilistic bindings as feature units, ChIP-GSM then employs logistic regression to predict active regulatory elements. Validation of ChIP-GSM predicted regulatory regions on multiple independent datasets sharing the same context confirms the advantage of using TF modules for predicting regulatory activity. In a case study of K562 cells, we demonstrate that the ChIP-GSM inferred modules form as groups, activate gene expression at different time points, and mediate diverse functional cellular processes. Hence, ChIP-GSM infers biologically meaningful TF modules and improves the prediction accuracy of regulatory region activities.


Subjects
Chromatin Immunoprecipitation Sequencing/methods , Gene Regulatory Networks , Regulatory Sequences, Nucleic Acid/genetics , Transcription Factors/genetics , Transcription Factors/metabolism , Bayes Theorem , Binding Sites/genetics , Chromatin/genetics , Chromatin/metabolism , Chromatin Immunoprecipitation Sequencing/statistics & numerical data , Computational Biology , Enhancer Elements, Genetic , Epigenesis, Genetic , Gene Expression Regulation , Humans , K562 Cells , MCF-7 Cells , Models, Statistical , Promoter Regions, Genetic
4.
Commun Biol ; 4(1): 661, 2021 06 02.
Article in English | MEDLINE | ID: mdl-34079046

ABSTRACT

Detecting changes in the activity of a transcription factor (TF) in response to a perturbation provides insights into the underlying cellular process. Transcription Factor Enrichment Analysis (TFEA) is a robust and reliable computational method that detects positional motif enrichment associated with changes in transcription observed in response to a perturbation. TFEA detects positional motif enrichment within a list of ranked regions of interest (ROIs), typically sites of RNA polymerase initiation inferred from regulatory data such as nascent transcription. Therefore, we also introduce muMerge, a statistically principled method of generating a consensus list of ROIs from multiple replicates and conditions. TFEA is broadly applicable to data that inform on transcriptional regulation, including nascent transcription (e.g., PRO-Seq), CAGE, histone ChIP-Seq, and accessibility data (e.g., ATAC-Seq). TFEA not only identifies the key regulators responding to a perturbation, but also temporally unravels regulatory networks with time series data. Consequently, TFEA serves as a hypothesis-generating tool that provides an easy, rigorous, and cost-effective means to broadly assess TF activity, yielding new biological insights.
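To make the idea of positional motif enrichment within a ranked list of ROIs more concrete, here is a small illustrative sketch. It is not the published TFEA statistic: the function name `enrichment_score`, the exponential distance weighting, and the `scale` parameter are assumptions chosen only to show how motif position and ROI rank can be combined into one signed score.

```python
import math

def enrichment_score(distances, scale=150.0):
    """Signed enrichment of motif hits toward the top of a ranked ROI list.

    distances: one entry per ROI, in rank order (most changed first);
               each entry is the motif's distance in bp from the ROI centre,
               or None when the ROI has no motif hit.
    scale:     hypothetical decay length; motifs near the centre count more.
    """
    weights = [0.0 if d is None else math.exp(-abs(d) / scale) for d in distances]
    total = sum(weights)
    if total == 0.0:
        return 0.0                      # no motif hits anywhere in the list
    n = len(weights)
    running = 0.0
    area = 0.0
    for rank, w in enumerate(weights, start=1):
        running += w / total            # cumulative share of motif weight
        area += running - rank / n      # gap to the uniform (no-enrichment) diagonal
    return area / n
```

A positive score means motif weight is concentrated among the top-ranked ROIs, a negative score means it sits near the bottom, and evenly spread hits give a score of zero. In practice such a score would be compared against scores from shuffled rankings to judge significance.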


Subjects
Transcription Factors/metabolism , Breast/cytology , Breast/metabolism , Cell Line , Chromatin Immunoprecipitation Sequencing/statistics & numerical data , Computational Biology/methods , Computer Simulation , Dexamethasone/pharmacology , Epithelial Cells/metabolism , Female , Gene Expression Regulation , Genetic Techniques/statistics & numerical data , HCT116 Cells , Humans , Imidazoles/pharmacology , Piperazines/pharmacology , Receptors, Glucocorticoid/drug effects , Receptors, Glucocorticoid/metabolism , Transcription Factors/genetics , Transcription, Genetic , Tumor Suppressor Protein p53/genetics , Tumor Suppressor Protein p53/metabolism
5.
J Invest Dermatol ; 141(7): 1745-1753, 2021 07.
Article in English | MEDLINE | ID: mdl-33607116

ABSTRACT

Psoriasis is a complex, chronic inflammatory skin disease characterized by keratinocyte hyperproliferation and a disordered immune response; however, its exact etiology remains unknown. To better understand the regulatory network underlying psoriasis, we explored the landscape of chromatin accessibility by using an assay for transposase-accessible chromatin using sequencing analysis of 15 psoriatic, 9 nonpsoriatic, and 19 normal skin tissue samples, and the chromatin accessibility data were integrated with genomic, epigenomic, and transcriptomic datasets. We identified 4,915 genomic regions that displayed differential accessibility in psoriatic samples compared with both nonpsoriatic and normal samples, nearly all of which exhibited an increased accessibility in psoriatic skin tissue. These differentially accessible regions tended to be more hypomethylated and correlated with the expression of their linked genes, which comprised several psoriasis susceptibility loci. Analyses of the differentially accessible region sequences showed that they were most highly enriched with FRA1 and/or activator protein-1 transcription factor DNA-binding motifs. We also found that AIM2, which encodes an important inflammasome component that triggers skin inflammation, is a direct target of FRA1 and/or activator protein-1. Our study provided clear insights and resources for an improved understanding of the pathogenesis of psoriasis. These disease-associated accessible regions might serve as therapeutic targets for psoriasis treatment in the future.


Subjects
Chromatin/metabolism , Gene Regulatory Networks/immunology , Psoriasis/genetics , Transposases/metabolism , Case-Control Studies , Chromatin Immunoprecipitation Sequencing/statistics & numerical data , DNA Methylation , Datasets as Topic , Epigenomics , Female , Humans , Inflammasomes/genetics , Inflammasomes/immunology , Male , Psoriasis/immunology , Psoriasis/pathology , RNA-Seq/statistics & numerical data , Skin/immunology , Skin/pathology
6.
PLoS Comput Biol ; 16(11): e1008405, 2020 11.
Article in English | MEDLINE | ID: mdl-33166290

ABSTRACT

Given the complexity and diversity of cancer genomics profiles, it is challenging to identify distinct clusters from different cancer types. Numerous analyses have been conducted for this purpose. Still, the methods they used do not always directly support high-dimensional omics data across the whole genome (such as ATAC-seq profiles). In this study, based on deep adversarial learning, we present an end-to-end approach, ClusterATAC, to leverage high-dimensional features and explore the classification results. On the ATAC-seq dataset and RNA-seq dataset, ClusterATAC has achieved excellent performance. Since ATAC-seq data play a crucial role in the study of the effects of non-coding regions on the molecular classification of cancers, we explore the clustering solution obtained by ClusterATAC on the pan-cancer ATAC dataset. In this solution, more than 70% of the clusters are single-tumor-type-dominant, and the vast majority of the remaining clusters are associated with similar tumor types. We explore the representative non-coding loci and their linked genes of each cluster and verify some results through a literature search. These results suggest that a large number of non-coding loci affect the development and progression of cancer through their linked genes, which can potentially advance cancer diagnosis and therapy.


Subjects
Chromatin Immunoprecipitation Sequencing/statistics & numerical data , Deep Learning , Neoplasms/classification , Neoplasms/genetics , Chromatin/genetics , Computational Biology , Databases, Nucleic Acid/statistics & numerical data , Genomics/methods , Genomics/statistics & numerical data , Humans , Multigene Family , Normal Distribution , Oncogenes , RNA-Seq/statistics & numerical data
7.
Nat Commun ; 11(1): 2696, 2020 06 01.
Article in English | MEDLINE | ID: mdl-32483223

ABSTRACT

Conversion between cell types, e.g., by induced expression of master transcription factors, holds great promise for cellular therapy. Our ability to manipulate cell identity is constrained by incomplete information on cell identity genes (CIGs) and their expression regulation. Here, we develop CEFCIG, an artificial intelligence framework to uncover CIGs and further define their master regulators. On the basis of machine learning, CEFCIG reveals unique histone codes for transcriptional regulation of reported CIGs, and utilizes these codes to predict CIGs and their master regulators with high accuracy. Applying CEFCIG to 1,005 epigenetic profiles, our analysis uncovers the landscape of the regulatory network for identity genes in individual cell or tissue types. Together, this work provides insights into cell identity regulation, and delivers a powerful technique to facilitate regenerative medicine.


Subjects
Cells/classification , Cells/metabolism , Histone Code , Machine Learning , Algorithms , Cells/cytology , Chromatin Immunoprecipitation Sequencing/statistics & numerical data , Databases, Genetic/statistics & numerical data , Epigenesis, Genetic , Gene Expression Regulation , Gene Regulatory Networks , Human Umbilical Vein Endothelial Cells/cytology , Human Umbilical Vein Endothelial Cells/metabolism , Humans , Phenotype , Pluripotent Stem Cells/cytology , Pluripotent Stem Cells/metabolism , RNA-Seq/statistics & numerical data , Regenerative Medicine , Transcription Factors/metabolism
8.
PLoS Comput Biol ; 15(8): e1007227, 2019 08.
Article in English | MEDLINE | ID: mdl-31425505

ABSTRACT

RNA-protein interaction plays important roles in post-transcriptional regulation. Recent advancements in cross-linking and immunoprecipitation followed by sequencing (CLIP-seq) technologies make it possible to detect the binding peaks of a given RNA binding protein (RBP) at transcriptome scale. However, it is still challenging to predict the functional consequences of RBP binding peaks. In this study, we propose the Protein-RNA Association Strength (PRAS), which integrates the intensities and positions of the binding peaks of RBPs to predict functional mRNA targets. We illustrate the superiority of PRAS over existing approaches in predicting the functional targets of two related but divergent CELF (CUGBP, ELAV-like factor) RBPs in mouse brain and muscle. We also demonstrate the potential of PRAS for wide adoption by applying it to the enhanced CLIP-seq (eCLIP) datasets of 37 RNA decay-related RBPs in two human cell lines. PRAS can be utilized to investigate any RBPs with available CLIP-seq peaks. PRAS is freely available at http://ouyanglab.jax.org/pras/.
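The abstract states that the score integrates peak intensities and peak positions but does not give the formula. The sketch below shows one plausible way to combine the two, a distance-decayed sum of peak intensities per transcript. It is an illustrative assumption, not the published PRAS definition: the function name `association_scores`, the exponential decay, the `decay` parameter, and the notion of a per-transcript reference site are all hypothetical.

```python
import math
from collections import defaultdict

def association_scores(peaks, reference_sites, decay=1000.0):
    """Sum peak intensities per transcript, down-weighting distant peaks.

    peaks:           iterable of (transcript_id, position, intensity) tuples.
    reference_sites: dict mapping transcript_id to a reference coordinate
                     (hypothetical; e.g. a site where binding is expected
                     to matter most).
    decay:           hypothetical decay length in nucleotides.
    """
    scores = defaultdict(float)
    for transcript, position, intensity in peaks:
        ref = reference_sites.get(transcript)
        if ref is None:
            continue                    # skip peaks on transcripts with no reference
        distance = abs(position - ref)
        scores[transcript] += intensity * math.exp(-distance / decay)
    return dict(scores)

# Example input: two peaks on tx1, one on tx2.
example_peaks = [("tx1", 100, 2.0), ("tx1", 1100, 1.0), ("tx2", 500, 3.0)]
example_refs = {"tx1": 100, "tx2": 500}
example_scores = association_scores(example_peaks, example_refs)
```

Transcripts can then be ranked by score, so that a strong peak at the reference site outweighs an equally strong peak far from it.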


Subjects
Chromatin Immunoprecipitation Sequencing/statistics & numerical data , RNA, Messenger/genetics , RNA, Messenger/metabolism , RNA-Binding Proteins/metabolism , Software , Animals , Base Sequence , Binding Sites/genetics , Brain/metabolism , CELF Proteins/genetics , CELF Proteins/metabolism , Computational Biology , Databases, Protein , Gene Expression Profiling , Hep G2 Cells , Humans , K562 Cells , Mice , Muscles/metabolism , RNA-Binding Proteins/genetics
9.
Curr Med Chem ; 26(42): 7641-7654, 2019.
Article in English | MEDLINE | ID: mdl-29848263

ABSTRACT

BACKGROUND: Transcription factors are DNA-binding proteins that play key roles in many fundamental biological processes. Unraveling their interactions with DNA is essential to identify their target genes and understand the regulatory network. Genome-wide identification of their binding sites became feasible thanks to recent progress in experimental and computational approaches. ChIP-chip, ChIP-seq, and ChIP-exo are three widely used techniques to demarcate genome-wide transcription factor binding sites. OBJECTIVE: This review aims to provide an overview of these three techniques, including their experimental procedures, computational approaches, and popular analytic tools. CONCLUSION: ChIP-chip, ChIP-seq, and ChIP-exo have been the major techniques to study genome-wide in vivo protein-DNA interaction. Due to the rapid development of next-generation sequencing technology, array-based ChIP-chip is deprecated and ChIP-seq has become the most widely used technique to identify transcription factor binding sites genome-wide. The newly developed ChIP-exo further improves the spatial resolution to a single nucleotide. Numerous tools have been developed to analyze ChIP-chip, ChIP-seq, and ChIP-exo data. However, different programs may employ different mechanisms or underlying algorithms; thus, each will inherently include its own set of statistical assumptions and biases. Choosing the most appropriate analytic program for a given experiment therefore requires careful consideration. Moreover, most programs only have a command-line interface, so their installation and usage require basic computational expertise in Unix/Linux.


Subjects
DNA-Binding Proteins/metabolism , DNA/metabolism , Genome , Transcription Factors/metabolism , Animals , Binding Sites , Chromatin Immunoprecipitation Sequencing/methods , Chromatin Immunoprecipitation Sequencing/statistics & numerical data , Computational Biology , Humans , Protein Binding