Results 1 - 20 of 23
1.
Nat Commun ; 10(1): 4613, 2019 Oct 10.
Article in English | MEDLINE | ID: mdl-31601804

ABSTRACT

Characterizing and interpreting heterogeneous mixtures at the cellular level is a critical problem in genomics. Single-cell assays offer an opportunity to resolve cellular-level heterogeneity, e.g., scRNA-seq enables single-cell expression profiling, and scATAC-seq identifies active regulatory elements. Furthermore, while scHi-C can measure the chromatin contacts (i.e., loops) between active regulatory elements and their target genes in single cells, bulk HiChIP can measure such contacts at a higher resolution. In this work, we introduce DC3 (De-Convolution and Coupled-Clustering) as a method for the joint analysis of various bulk and single-cell data such as HiChIP, RNA-seq and ATAC-seq from the same heterogeneous cell population. DC3 can simultaneously identify distinct subpopulations, assign single cells to the subpopulations (i.e., clustering) and de-convolve the bulk data into subpopulation-specific data. The subpopulation-specific profiles of gene expression, chromatin accessibility and enhancer-promoter contact obtained by DC3 provide a comprehensive characterization of the gene regulatory system in each subpopulation.


Subjects
Algorithms , Cluster Analysis , Gene Expression Profiling/statistics & numerical data , Genomics/statistics & numerical data , Single-Cell Analysis/statistics & numerical data , Animals , Cell Line , Chromatin , Chromatin Immunoprecipitation/statistics & numerical data , Computer Simulation , Gene Expression Profiling/methods , Gene Regulatory Networks , High-Throughput Nucleotide Sequencing/methods , High-Throughput Nucleotide Sequencing/statistics & numerical data , Humans , Mice , Promoter Regions, Genetic , Single-Cell Analysis/methods
2.
Pac Symp Biocomput ; 24: 184-195, 2019.
Article in English | MEDLINE | ID: mdl-30864321

ABSTRACT

Genetic variations of the human genome are linked to many disease phenotypes. While whole-genome sequencing and genome-wide association studies (GWAS) have uncovered a number of genotype-phenotype associations, their functional interpretation remains challenging given that most single nucleotide polymorphisms (SNPs) fall into the non-coding region of the genome. Advances in chromatin immunoprecipitation sequencing (ChIP-seq) have made large-scale repositories of epigenetic data available, allowing investigation of coordinated mechanisms of epigenetic markers and transcriptional regulation and their influence on biological function. To address this, we propose SNPs2ChIP, a method to infer biological functions of non-coding variants through unsupervised statistical learning methods applied to publicly available epigenetic datasets. We systematically characterized latent factors by applying singular value decomposition to ChIP-seq tracks of lymphoblastoid cell lines, and annotated the biological function of each latent factor using the genomic region enrichment analysis tool. Using these annotated latent factors as reference, we developed SNPs2ChIP, a pipeline that takes genomic region(s) as an input, identifies the relevant latent factors with quantitative scores, and returns them along with their inferred functions. As a case study, we focused on systemic lupus erythematosus and demonstrated our method's ability to infer relevant biological function. We systematically applied SNPs2ChIP to publicly available datasets, including known GWAS associations from the GWAS catalogue and ChIP-seq peaks from a previously published study. Our approach of leveraging latent patterns across genome-wide epigenetic datasets to infer biological function will advance understanding of the genetics of human diseases by accelerating the interpretation of non-coding genomes.


Subjects
Chromatin Immunoprecipitation/statistics & numerical data , Polymorphism, Single Nucleotide , Algorithms , Cell Line , Computational Biology/methods , Databases, Nucleic Acid/statistics & numerical data , Epigenesis, Genetic , Genetic Association Studies , Genome, Human , Genome-Wide Association Study/statistics & numerical data , High-Throughput Nucleotide Sequencing/statistics & numerical data , Humans , Lupus Erythematosus, Systemic/genetics , Lymphocytes/metabolism , Receptors, Calcitriol/genetics
3.
BMC Genomics ; 20(1): 6, 2019 Jan 05.
Article in English | MEDLINE | ID: mdl-30611200

ABSTRACT

BACKGROUND: Sequencing data has become a standard measure of diverse cellular activities. For example, gene expression is accurately measured by RNA sequencing (RNA-Seq) libraries, protein-DNA interactions are captured by chromatin immunoprecipitation sequencing (ChIP-Seq), protein-RNA interactions by crosslinking immunoprecipitation sequencing (CLIP-Seq) or RNA immunoprecipitation sequencing (RIP-Seq), and DNA accessibility by assay for transposase-accessible chromatin (ATAC-Seq), DNase or MNase sequencing libraries. The processing of these sequencing techniques involves library-specific approaches. However, in all cases, once the sequencing libraries are processed, the result is a count table specifying the estimated number of reads originating from each genomic locus. Differential analysis to determine which loci have different cellular activity under different conditions starts with the count table and iterates through a cycle of data assessment, preparation and analysis. Such complex analysis often relies on multiple programs and is therefore a challenge for those without programming skills. RESULTS: We developed DEBrowser as an R Bioconductor project to interactively visualize every step of the differential analysis, without programming. The application provides a rich and interactive web-based graphical user interface built on R's Shiny infrastructure. DEBrowser allows users to visualize data with various types of graphs that can be explored further by selecting and re-plotting any desired subset of data. Using the visualization approaches provided, users can determine and correct technical variations such as batch effects and sequencing depth that affect differential analysis. We show DEBrowser's ease of use by reproducing the analysis of two previously published data sets. CONCLUSIONS: DEBrowser is a flexible, intuitive, web-based analysis platform that enables an iterative and interactive analysis of count data without requiring any programming knowledge.


Subjects
Chromatin Immunoprecipitation/statistics & numerical data , Genome, Human/genetics , Sequence Analysis, RNA/statistics & numerical data , Software , Chromatin/genetics , DNA/genetics , DNA-Binding Proteins/genetics , Data Interpretation, Statistical , Genomics/statistics & numerical data , High-Throughput Nucleotide Sequencing/statistics & numerical data , Humans , Sequence Analysis, DNA
4.
PLoS Comput Biol ; 14(4): e1006090, 2018 Apr.
Article in English | MEDLINE | ID: mdl-29684008

ABSTRACT

Genome-wide in vivo protein-DNA interactions are routinely mapped using high-throughput chromatin immunoprecipitation (ChIP). ChIP-reported regions are typically investigated for enriched sequence-motifs, which are likely to model the DNA-binding specificity of the profiled protein and/or of co-occurring proteins. However, simple enrichment analyses can miss insights into the binding-activity of the protein. Note that ChIP reports regions making direct contact with the protein as well as those binding through intermediaries. For example, consider a ChIP experiment targeting protein X, which binds DNA at its cognate sites, but simultaneously interacts with four other proteins. Each of these proteins also binds to its own specific cognate sites along distant parts of the genome, a scenario consistent with the current view of transcriptional hubs and chromatin loops. Since ChIP will pull down all X-associated regions, the final reported data will be a union of five distinct sets of regions, each containing binding sites of one of the five proteins, respectively. Characterizing all five different motifs and the corresponding sets is important to interpret the ChIP experiment and ultimately, the role of X in regulation. We present DIVERSITY, which attempts exactly this: it partitions the data so that each partition can be characterized with its own de novo motif. DIVERSITY uses a Bayesian approach to identify the optimal number of motifs and the associated partitions, which together explain the entire dataset. This is in contrast to standard motif finders, which report motifs individually enriched in the data, but do not necessarily explain all reported regions. We show that the different motifs and associated regions identified by DIVERSITY give insights into the various complexes that may be forming along the chromatin, something that has so far not been attempted from ChIP data.
Webserver at http://diversity.ncl.res.in/; standalone (Mac OS X/Linux) from https://github.com/NarlikarLab/DIVERSITY/releases/tag/v1.0.0.


Subjects
Chromatin Immunoprecipitation/statistics & numerical data , Software , Algorithms , Animals , Bayes Theorem , Binding Sites , Chromatin/genetics , Chromatin/metabolism , Computational Biology , DNA/genetics , DNA/metabolism , DNA-Binding Proteins/genetics , DNA-Binding Proteins/metabolism , Evolution, Molecular , High-Throughput Nucleotide Sequencing/statistics & numerical data , Humans , Neurons/metabolism , Nucleotide Motifs , Protein Binding , Sequence Analysis, DNA/statistics & numerical data , Transcription Factors/genetics , Transcription Factors/metabolism
5.
Brief Bioinform ; 19(5): 1069-1081, 2018 Sep 28.
Article in English | MEDLINE | ID: mdl-28334268

ABSTRACT

Transcription factors are proteins that bind to specific DNA sequences and play important roles in controlling the expression levels of their target genes. Hence, prediction of transcription factor binding sites (TFBSs) provides a solid foundation for inferring gene regulatory mechanisms and building regulatory networks for a genome. Chromatin immunoprecipitation sequencing (ChIP-seq) technology can generate large-scale experimental data for such protein-DNA interactions, providing an unprecedented opportunity to identify TFBSs (a.k.a. cis-regulatory motifs). The bottleneck, however, is the lack of robust mathematical models, as well as efficient computational methods for TFBS prediction, to make effective use of massive ChIP-seq data sets in the public domain. The purpose of this study is to review existing motif-finding methods for ChIP-seq data from an algorithmic perspective and provide new computational insight into this field. State-of-the-art methods are presented by summarizing eight representative motif-finding algorithms along with their corresponding challenges, and by introducing some important related functions that address specific biological demands, including discriminative motif finding and cofactor motif analysis. Finally, potential directions and plans for ChIP-seq-based motif-finding tools are outlined in support of future algorithm development.


Subjects
Algorithms , Gene Regulatory Networks , Software , Base Sequence , Binding Sites/genetics , Chromatin Immunoprecipitation/statistics & numerical data , Computational Biology/methods , DNA/genetics , DNA/metabolism , Humans , Sequence Analysis, DNA/statistics & numerical data , Transcription Factors/metabolism
6.
Nucleic Acids Res ; 43(6): e40, 2015 Mar 31.
Article in English | MEDLINE | ID: mdl-25564527

ABSTRACT

RNA-seq is a sensitive and accurate technique to compare steady-state levels of RNA between different cellular states. However, as it does not provide an account of transcriptional activity per se, other technologies are needed to more precisely determine acute transcriptional responses. Here, we have developed an easy, sensitive and accurate novel computational method, iRNA-seq, for genome-wide assessment of transcriptional activity based on analysis of intron coverage from total RNA-seq data. Comparison of the results derived from iRNA-seq analyses with parallel results derived using current methods for genome-wide determination of transcriptional activity, i.e. global run-on (GRO)-seq and RNA polymerase II (RNAPII) ChIP-seq, demonstrates that iRNA-seq provides similar results in terms of number of regulated genes and their fold change. However, unlike the current methods that are all very labor-intensive and demanding in terms of sample material and technologies, iRNA-seq is cheap and easy and requires very little sample material. In conclusion, iRNA-seq offers an attractive novel alternative to current methods for determination of changes in transcriptional activity at a genome-wide level.


Subjects
Gene Expression Profiling/methods , Sequence Analysis, RNA/methods , Cell Line , Chromatin Immunoprecipitation/methods , Chromatin Immunoprecipitation/statistics & numerical data , Gene Expression Profiling/statistics & numerical data , Gene Expression Regulation , Genome, Human , Humans , Introns , Sequence Analysis, RNA/statistics & numerical data
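
The idea in entry 6 of reading transcriptional change from intron coverage can be illustrated with a minimal sketch. It is not the iRNA-seq implementation: the counts-per-million scaling, the pseudocount, the function names and the gene counts below are all assumptions made for illustration.

```python
import math

def cpm(counts):
    """Scale raw read counts to counts per million of the library total."""
    total = sum(counts.values())
    return {gene: 1e6 * n / total for gene, n in counts.items()}

def intron_log2_fold_change(control, treated, pseudocount=1.0):
    """log2(treated / control) of library-size-normalised intron counts.

    The pseudocount (an assumption) avoids division by zero for genes
    with no intronic reads in one condition.
    """
    c, t = cpm(control), cpm(treated)
    return {
        gene: math.log2((t[gene] + pseudocount) / (c[gene] + pseudocount))
        for gene in control.keys() & treated.keys()
    }

# Hypothetical intronic read counts per gene (invented numbers).
control = {"GENE_A": 100, "GENE_B": 400, "GENE_C": 500}
treated = {"GENE_A": 400, "GENE_B": 400, "GENE_C": 200}

lfc = intron_log2_fold_change(control, treated)
for gene in sorted(lfc):
    print(gene, round(lfc[gene], 2))
```

A real analysis would add replicate-aware statistics; this sketch only shows the per-gene ratio on which such a comparison rests.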
7.
Nucleic Acids Res ; 43(6): e38, 2015 Mar 31.
Article in English | MEDLINE | ID: mdl-25539918

ABSTRACT

Genome-wide chromatin immunoprecipitation (ChIP) studies have brought significant insight into the genomic localization of chromatin-associated proteins and histone modifications. The large amount of data generated by these analyses, however, requires approaches that enable rapid validation and analysis of biological relevance. Furthermore, there are still protein and modification targets that are difficult to detect using standard ChIP methods. To address these issues, we developed an immediate chromatin immunoprecipitation procedure which we call ZipChIP. ZipChIP significantly reduces the time required and increases sensitivity, allowing for rapid screening of multiple loci. Here we describe how ZipChIP enables detection of histone modifications (H3K4 mono- and trimethylation) and two yeast histone demethylases, Jhd2 and Rph1, which were previously difficult to detect using standard methods. Furthermore, we demonstrate the versatility of ZipChIP by analyzing the enrichment of the histone deacetylase Sir2 at heterochromatin in yeast and enrichment of the chromatin remodeler, PICKLE, at euchromatin in Arabidopsis thaliana.


Subjects
Chromatin Immunoprecipitation/methods , Real-Time Polymerase Chain Reaction/methods , Actins/genetics , Actins/metabolism , Arabidopsis/genetics , Arabidopsis/metabolism , Arabidopsis Proteins/genetics , Arabidopsis Proteins/metabolism , Chromatin/genetics , Chromatin/metabolism , Chromatin Immunoprecipitation/statistics & numerical data , DNA Helicases/genetics , DNA Helicases/metabolism , DNA-Binding Proteins/genetics , DNA-Binding Proteins/metabolism , Genes, Fungal , Genes, Plant , Histone Demethylases/genetics , Histone Demethylases/metabolism , Histones/genetics , Histones/metabolism , Jumonji Domain-Containing Histone Demethylases/genetics , Jumonji Domain-Containing Histone Demethylases/metabolism , Open Reading Frames , Promoter Regions, Genetic , Repressor Proteins/genetics , Repressor Proteins/metabolism , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae Proteins/metabolism , Silent Information Regulator Proteins, Saccharomyces cerevisiae/genetics , Silent Information Regulator Proteins, Saccharomyces cerevisiae/metabolism , Sirtuin 2/genetics , Sirtuin 2/metabolism
8.
Pac Symp Biocomput ; : 320-31, 2013.
Article in English | MEDLINE | ID: mdl-23424137

ABSTRACT

We have developed a novel approach called ChIPModule to systematically discover transcription factors and their cofactors from ChIP-seq data. Given a ChIP-seq dataset and the binding patterns of a large number of transcription factors, ChIPModule can efficiently identify groups of transcription factors, whose binding sites significantly co-occur in the ChIP-seq peak regions. By testing ChIPModule on simulated data and experimental data, we have shown that ChIPModule identifies known cofactors of transcription factors, and predicts new cofactors that are supported by literature. ChIPModule provides a useful tool for studying gene transcriptional regulation.


Subjects
Chromatin Immunoprecipitation/statistics & numerical data , Sequence Analysis/statistics & numerical data , Transcription Factors/genetics , Transcription Factors/metabolism , Binding Sites/genetics , Computational Biology , Databases, Genetic/statistics & numerical data , Humans
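
The notion in entry 8 of binding sites that "significantly co-occur" in peak regions can be made concrete with a small sketch. The abstract does not state which statistic ChIPModule uses, so a one-sided hypergeometric test is swapped in here purely as an illustration; the function names and the peak sets are hypothetical.

```python
from math import comb

def hypergeom_tail(n_peaks, n_a, n_b, n_both):
    """P(X >= n_both) when n_b peaks are drawn at random from n_peaks,
    of which n_a contain a site for factor A."""
    upper = min(n_a, n_b)
    favourable = sum(
        comb(n_a, k) * comb(n_peaks - n_a, n_b - k)
        for k in range(n_both, upper + 1)
    )
    return favourable / comb(n_peaks, n_b)

def cooccurrence_pvalue(peaks_with_a, peaks_with_b, n_peaks):
    """One-sided p-value for the overlap of two sets of peak indices."""
    both = len(peaks_with_a & peaks_with_b)
    return hypergeom_tail(n_peaks, len(peaks_with_a), len(peaks_with_b), both)

# Invented example: 20 peaks, factor A sites in 10, factor B sites in 10,
# and 8 peaks containing sites for both factors.
peaks_with_a = set(range(10))
peaks_with_b = set(range(8)) | {15, 16}
p_value = cooccurrence_pvalue(peaks_with_a, peaks_with_b, n_peaks=20)
print(f"{p_value:.4f}")
```

Testing many factor pairs or larger groups this way would also need a multiple-testing correction, which the sketch leaves out.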
9.
Brief Bioinform ; 14(2): 225-37, 2013 Mar.
Article in English | MEDLINE | ID: mdl-22517426

ABSTRACT

Motif discovery has been one of the most widely studied problems in bioinformatics ever since genomic and protein sequences have been available. In particular, its application to the de novo prediction of putative over-represented transcription factor binding sites in nucleotide sequences has been, and still is, one of the most challenging flavors of the problem. Recently, novel experimental techniques like chromatin immunoprecipitation (ChIP) have been introduced, permitting the genome-wide identification of protein-DNA interactions. ChIP, applied to transcription factors and coupled with genome tiling arrays (ChIP on Chip) or next-generation sequencing technologies (ChIP-Seq) has opened new avenues in research, as well as posed new challenges to bioinformaticians developing algorithms and methods for motif discovery.


Subjects
High-Throughput Nucleotide Sequencing/statistics & numerical data , Regulatory Elements, Transcriptional , Transcription Factors/metabolism , Algorithms , Animals , Binding Sites/genetics , Chromatin Immunoprecipitation/statistics & numerical data , Computational Biology , Consensus Sequence , DNA/genetics , DNA/metabolism , Gene Expression Profiling/statistics & numerical data , Humans
10.
PLoS One ; 7(1): e28272, 2012.
Article in English | MEDLINE | ID: mdl-22238575

ABSTRACT

Chromatin immunoprecipitation (ChIP) profiling detects in vivo protein-DNA binding, and has revealed a large combinatorial complexity in the binding of chromatin-associated proteins and their post-translational modifications. To fully explore the spatial and combinatorial patterns in ChIP-profiling data and detect potentially meaningful patterns, the areas of enrichment must be aligned and clustered, which is an algorithmically and computationally challenging task. We have developed CATCHprofiles, a novel tool for exhaustive pattern detection in ChIP profiling data. CATCHprofiles is built upon a computationally efficient implementation for the exhaustive alignment and hierarchical clustering of ChIP profiling data. The tool features a graphical interface for examination and browsing of the clustering results. CATCHprofiles requires no prior knowledge about functional sites, detects known binding patterns "ab initio", and enables the detection of new patterns from ChIP data at a high resolution, exemplified by the detection of asymmetric histone and histone modification patterns around H2A.Z-enriched sites. CATCHprofiles' capability for exhaustive analysis combined with its ease of use makes it an invaluable tool for explorative research based on ChIP profiling data. CATCHprofiles and the CATCH algorithm run on all platforms and are available for free through the CATCH website: http://catch.cmbi.ru.nl/. User support is available by subscribing to the mailing list catch-users@bioinformatics.org.


Subjects
Chromatin Immunoprecipitation/statistics & numerical data , Data Interpretation, Statistical , Microarray Analysis/statistics & numerical data , Sequence Alignment , Software , Algorithms , Base Sequence , Cells, Cultured , Chromatin Immunoprecipitation/methods , Cluster Analysis , Computational Biology/methods , Efficiency , Gene Expression Profiling/methods , Gene Expression Profiling/statistics & numerical data , Humans , Models, Biological , Molecular Sequence Data , Promoter Regions, Genetic/genetics , Sequence Alignment/methods , Sequence Alignment/statistics & numerical data
11.
Biostatistics ; 13(1): 113-28, 2012 Jan.
Article in English | MEDLINE | ID: mdl-21914728

ABSTRACT

Chromatin immunoprecipitation followed by next generation sequencing (ChIP-seq) is a powerful technique that is being used in a wide range of biological studies including genome-wide measurements of protein-DNA interactions, DNA methylation, and histone modifications. The vast amount of data and biases introduced by sequencing and/or genome mapping pose new challenges and call for effective methods and fast computer programs for statistical analysis. To systematically model ChIP-seq data, we build a dynamic signal profile for each chromosome and then model the profile using a fully Bayesian hidden Ising model. The proposed model naturally takes into account spatial dependency and global and local distributions of sequence tags. It can be used for one-sample and two-sample analyses. Through model diagnosis, the proposed method can detect falsely enriched regions caused by sequencing and/or mapping errors, which is usually not offered by the existing hypothesis-testing-based methods. The proposed method is illustrated using 3 transcription factor (TF) ChIP-seq data sets and 2 mixed ChIP-seq data sets and compared with 4 popular and/or well-documented methods: MACS, CisGenome, BayesPeak, and SISSRs. The results indicate that the proposed method achieves equivalent or higher sensitivity and spatial resolution in detecting TF binding sites with false discovery rate at a much lower level.


Subjects
Chromatin Immunoprecipitation/statistics & numerical data , Models, Statistical , Sequence Analysis, DNA/statistics & numerical data , Algorithms , Bayes Theorem , Binding Sites/genetics , Biotechnology , DNA/genetics , DNA/metabolism , Data Interpretation, Statistical , Databases, Nucleic Acid , Humans , Markov Chains , Transcription Factors/metabolism
12.
J Bioinform Comput Biol ; 9(2): 269-82, 2011 Apr.
Article in English | MEDLINE | ID: mdl-21523932

ABSTRACT

New high-throughput sequencing technologies can generate millions of short sequences in a single experiment. As the size of the data increases, comparison of multiple experiments on different cell lines under different experimental conditions becomes a big challenge. In this paper, we investigate ways to compare multiple ChIP-sequencing experiments. We specifically studied epigenetic regulation of breast cancer and the effect of estrogen using 50 ChIP-sequencing data sets from the Illumina Genome Analyzer II. First, we evaluate the correlation among different experiments, focusing on the total number of reads in transcribed and promoter regions of the genome. Then, we adopt the method that is used to identify the most stable genes in RT-PCR experiments to understand background signal across all of the experiments and to identify the most variable transcribed and promoter regions of the genome. We observed that the most variable genes for transcribed regions and promoter regions are very distinct. Gene ontology and function enrichment analysis on these most variable genes demonstrates the biological relevance of the results. In this study, we present a method that can effectively select differential regions of the genome based on protein-binding profiles over multiple experiments using real data points without any normalization among the samples.


Subjects
Chromatin Immunoprecipitation/statistics & numerical data , Algorithms , Breast Neoplasms/genetics , Breast Neoplasms/metabolism , Cell Line , Cell Line, Tumor , Computational Biology , Epigenesis, Genetic , Female , Genome, Human , Humans , Protein Binding
13.
Hum Genomics ; 5(2): 117-23, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21296745

ABSTRACT

Chromatin immunoprecipitation followed by massively parallel next-generation sequencing (ChIP-seq) is a valuable experimental strategy for assaying protein-DNA interaction over the whole genome. Many computational tools have been designed to find the peaks of the signals corresponding to protein binding sites. In this paper, three computational methods used in ChIP-seq data analysis, ChIP-seq processing pipeline (spp), PeakSeq and CisGenome, are reviewed. Their agreement and disagreement in finding peaks is also compared using the publicly available Signal Transducers and Activators of Transcription protein 1 (STAT1) and RNA polymerase II (PolII) datasets with corresponding negative controls.


Subjects
Chromatin Immunoprecipitation/methods , Sequence Analysis, DNA , Software , Chromatin Immunoprecipitation/statistics & numerical data , Humans , Protein Binding , RNA Polymerase II/genetics , Research Design , STAT1 Transcription Factor/genetics
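
Comparing how peak callers agree, as in entry 13, is often summarised by interval overlap. The sketch below is one simple way to do that and is not taken from the reviewed paper; the half-open coordinate convention, the function names and the peak lists are assumptions for illustration.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def fraction_recovered(peaks_a, peaks_b):
    """Fraction of peaks in peaks_a overlapped by at least one peak in peaks_b."""
    if not peaks_a:
        return 0.0
    hits = sum(any(overlaps(p, q) for q in peaks_b) for p in peaks_a)
    return hits / len(peaks_a)

# Hypothetical peak calls from two tools (invented coordinates).
caller_1 = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
caller_2 = [("chr1", 150, 250), ("chr2", 300, 400)]

print(fraction_recovered(caller_1, caller_2))  # share of caller_1 peaks also found by caller_2
print(fraction_recovered(caller_2, caller_1))  # share of caller_2 peaks also found by caller_1
```

The measure is asymmetric, so both directions are reported. The pairwise scan is quadratic in the number of peaks; genome-scale peak sets would normally be sorted or indexed first.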
14.
Biometrics ; 66(4): 1284-94, 2010 Dec.
Article in English | MEDLINE | ID: mdl-20128774

ABSTRACT

ChIP-chip experiments are procedures that combine chromatin immunoprecipitation (ChIP) and DNA microarray (chip) technology to study a variety of biological problems, including protein-DNA interaction, histone modification, and DNA methylation. The most important feature of ChIP-chip data is that the intensity measurements of probes are spatially correlated because the DNA fragments are hybridized to neighboring probes in the experiments. We propose a simple, but powerful Bayesian hierarchical approach to ChIP-chip data through an Ising model with high-order interactions. The proposed method naturally takes into account the intrinsic spatial structure of the data and can be used to analyze data from multiple platforms with different genomic resolutions. The model parameters are estimated using the Gibbs sampler. The proposed method is illustrated using two publicly available data sets from Affymetrix and Agilent platforms, and compared with three alternative Bayesian methods, namely, Bayesian hierarchical model, hierarchical gamma mixture model, and Tilemap hidden Markov model. The numerical results indicate that the proposed method performs as well as the other three methods for the data from Affymetrix tiling arrays, but significantly outperforms the other three methods for the data from Agilent promoter arrays. In addition, we find that the proposed method has better operating characteristics in terms of sensitivities and false discovery rates under various scenarios.


Subjects
Bayes Theorem , Chromatin Immunoprecipitation/statistics & numerical data , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Humans , Methods , Sensitivity and Specificity
15.
Genome Biol ; 10(12): R142, 2009.
Article in English | MEDLINE | ID: mdl-20028542

ABSTRACT

We present CSDeconv, a computational method that determines locations of transcription factor binding from ChIP-seq data. CSDeconv differs from prior methods in that it uses a blind deconvolution approach that allows closely-spaced binding sites to be called accurately. We apply CSDeconv to novel ChIP-seq data for DosR binding in Mycobacterium tuberculosis and to existing data for GABP in humans and show that it can discriminate binding sites separated by as few as 40 bp.


Subjects
Chromatin Immunoprecipitation/statistics & numerical data , Computational Biology/methods , Software , Transcription Factors/metabolism , Binding Sites/genetics , Humans , Mycobacterium tuberculosis/genetics , Transcription Factors/genetics
16.
Methods Mol Biol ; 521: 255-78, 2009.
Article in English | MEDLINE | ID: mdl-19563111

ABSTRACT

Chromatin immunoprecipitation (ChIP) is a widely used method to study the interactions between proteins and discrete chromosomal loci in vivo. Originally, ChIP was developed for analysis of protein associations with DNA sequences known or suspected to bind the protein of interest. The advent of DNA microarrays has enabled the identification of all DNA sequences enriched by ChIP, providing a genomic view of protein binding. This powerful approach, termed ChIP-chip, is broadly applicable and has been particularly valuable in DNA replication studies to map replication origins in Saccharomyces cerevisiae based on the association of replication proteins with these chromosomal elements. We present a detailed ChIP-chip protocol for S. cerevisiae that uses oligonucleotide DNA microarrays printed on polylysine-coated glass slides and can also be easily adapted for commercially available high-density tiling microarrays from NimbleGen. We also outline general protocols for data analysis; however, microarray data analyses usually must be tailored specifically for individual studies, depending on experimental design, microarray format, and data quality.


Subjects
Chromatin Immunoprecipitation/methods , Chromatin/metabolism , DNA Replication , DNA-Binding Proteins/metabolism , Oligonucleotide Array Sequence Analysis/methods , Chromatin Immunoprecipitation/statistics & numerical data , Cross-Linking Reagents , DNA, Fungal/biosynthesis , DNA, Fungal/isolation & purification , Data Interpretation, Statistical , Fluorescent Dyes , Nucleic Acid Hybridization , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Replication Origin , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/metabolism
17.
Biometrics ; 65(4): 1087-95, 2009 Dec.
Article in English | MEDLINE | ID: mdl-19210737

ABSTRACT

We propose a unified framework for the analysis of chromatin (Ch) immunoprecipitation (IP) microarray (ChIP-chip) data for detecting transcription factor binding sites (TFBSs) or motifs. ChIP-chip assays are used to focus the genome-wide search for TFBSs by isolating a sample of DNA fragments with TFBSs and applying this sample to a microarray with probes corresponding to tiled segments across the genome. Present analytical methods use a two-step approach: (i) analyze array data to estimate IP-enrichment peaks then (ii) analyze the corresponding sequences independently of intensity information. The proposed model integrates peak finding and motif discovery through a unified Bayesian hidden Markov model (HMM) framework that accommodates the inherent uncertainty in both measurements. A Markov chain Monte Carlo algorithm is formulated for parameter estimation, adapting recursive techniques used for HMMs. In simulations and applications to a yeast RAP1 dataset, the proposed method has favorable TFBS discovery performance compared to currently available two-stage procedures in terms of both sensitivity and specificity.


Subjects
Biometry/methods , Chromatin Immunoprecipitation/statistics & numerical data , Genomics/statistics & numerical data , Models, Statistical , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Algorithms , Base Sequence , Bayes Theorem , Binding Sites/genetics , DNA, Fungal/genetics , DNA, Fungal/metabolism , Markov Chains , Monte Carlo Method , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/metabolism , Shelterin Complex , Telomere-Binding Proteins/metabolism , Transcription Factors/metabolism
18.
PLoS Comput Biol ; 4(10): e1000201, 2008 Oct.
Article in English | MEDLINE | ID: mdl-18927605

ABSTRACT

Computational methods to identify functional genomic elements using genetic information have been very successful in determining gene structure and in identifying a handful of cis-regulatory elements. But the vast majority of regulatory elements have yet to be discovered, and it has become increasingly apparent that their discovery will not come from using genetic information alone. Recently, high-throughput technologies have enabled the creation of information-rich epigenetic maps, most notably for histone modifications. However, tools that search for functional elements using this epigenetic information have been lacking. Here, we describe an unsupervised learning method called ChromaSig to find, in an unbiased fashion, commonly occurring chromatin signatures in both tiling microarray and sequencing data. Applying this algorithm to nine chromatin marks across a 1% sampling of the human genome in HeLa cells, we recover eight clusters of distinct chromatin signatures, five of which correspond to known patterns associated with transcriptional promoters and enhancers. Interestingly, we observe that the distinct chromatin signatures found at enhancers mark distinct functional classes of enhancers in terms of transcription factor and coactivator binding. In addition, we identify three clusters of novel chromatin signatures that contain evolutionarily conserved sequences and potential cis-regulatory elements. Applying ChromaSig to a panel of 21 chromatin marks mapped genomewide by ChIP-Seq reveals 16 classes of genomic elements marked by distinct chromatin signatures. Interestingly, four classes containing enrichment for repressive histone modifications appear to be locally heterochromatic sites and are enriched in quickly evolving regions of the genome. The utility of this approach in uncovering novel, functionally significant genomic elements will aid future efforts of genome annotation via chromatin modifications.


Subjects
Chromatin/genetics , Human Genome , Genetic Models , Statistical Models , Artificial Intelligence , Chromatin/metabolism , Chromatin Immunoprecipitation/statistics & numerical data , Computational Biology , Genetic Enhancer Elements , HeLa Cells , Histones/chemistry , Histones/genetics , Histones/metabolism , Humans , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Genetic Promoter Regions , Post-Translational Protein Processing , Transcription Initiation Site
19.
Curr Opin Biotechnol ; 19(1): 50-4, 2008 Feb.
Article in English | MEDLINE | ID: mdl-18207385

ABSTRACT

Changes in transcript levels are assessed by microarray analysis on an individual basis, essentially resulting in long lists of genes that were found to have significantly changed transcript levels. However, in biology these changes do not occur as independent events as such lists suggest, but in a highly coordinated and interdependent manner. Understanding the biological meaning of the observed changes requires elucidating such biological interdependencies. The most common way to achieve this is to project the gene lists onto distinct biological processes often represented in the form of gene-ontology (GO) categories or metabolic and regulatory pathways as derived from literature analysis. This review focuses on different approaches and tools employed for this task, starting from GO-ranking methods, covering pathway mappings, and finally converging on biological network analysis. A brief outlook on the application of such approaches to the newest microarray-based technologies (Chromatin-ImmunoPrecipitation, ChIP-on-chip) concludes the review.
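
Projecting a gene list onto GO categories is commonly done with an over-representation test. The sketch below is only an illustration of that general idea, not a method taken from the review: it ranks categories by a one-sided hypergeometric p-value. The function names, the toy category labels, and the gene identifiers are all hypothetical, and no multiple-testing correction is applied.

```python
from math import comb


def enrichment_p(universe_size, annotated, drawn, overlap):
    """One-sided hypergeometric tail P(X >= overlap).

    universe_size: genes measured on the array
    annotated:     genes in the universe that carry the category
    drawn:         genes in the changed-gene list
    overlap:       changed genes that carry the category
    """
    upper = min(annotated, drawn)
    total = comb(universe_size, drawn)
    tail = sum(
        comb(annotated, i) * comb(universe_size - annotated, drawn - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


def rank_terms(changed, annotations, universe):
    """Rank categories by over-representation in the changed-gene list."""
    universe = set(universe)
    changed = set(changed) & universe
    results = []
    for term, genes in annotations.items():
        genes = set(genes) & universe
        k = len(changed & genes)
        p = enrichment_p(len(universe), len(genes), len(changed), k)
        results.append((term, k, p))
    # Smallest p-value first; a real analysis would also correct for
    # testing many categories at once.
    return sorted(results, key=lambda r: r[2])


# Toy data (hypothetical gene and category names).
universe = [f"g{i}" for i in range(1, 11)]
annotations = {
    "termA": ["g1", "g2", "g3", "g4"],
    "termB": ["g5", "g6", "g7", "g8", "g9", "g10"],
}
ranked = rank_terms(["g1", "g2", "g3"], annotations, universe)
```

A ranking of this kind treats each category independently, which is the limitation that pathway mapping and network analysis are meant to address.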


Subjects
Oligonucleotide Array Sequence Analysis/statistics & numerical data , Biotechnology , Chromatin Immunoprecipitation/statistics & numerical data , Computational Biology , Statistical Data Interpretation , Genetic Databases
20.
Pac Symp Biocomput ; : 515-26, 2008.
Article in English | MEDLINE | ID: mdl-18229712

ABSTRACT

Whole-genome tiling arrays at a user-specified resolution are becoming a versatile tool in genomics. Chromatin immunoprecipitation on microarrays (ChIP-chip) is a powerful application of these arrays. Although there is an increasing number of methods for analyzing ChIP-chip data, perhaps the simplest and most commonly used one, owing to its computational efficiency, is testing with a moving average statistic. Current moving average methods assume exchangeability of the measurements within an array. They are not tailored to deal with issues arising from array designs, such as overlapping probes that result in correlated measurements. We investigate the correlation structure of data from such arrays and propose an extension of moving average testing via a robust and rapid method called CMARRT. We illustrate the pitfalls of ignoring the correlation structure in simulations and a case study. Our approach is implemented as an R package called CMARRT and can be used with any tiling array platform.
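
To see why ignoring probe correlation matters, consider the variance of a window mean. For w measurements with common variance sigma^2 and lag-k autocorrelation rho_k, the variance of the mean is (sigma^2 / w) * (1 + 2 * sum over k = 1..w-1 of (1 - k/w) * rho_k), which exceeds sigma^2 / w whenever the correlations are positive. The sketch below illustrates this general point only; it is not the CMARRT implementation, the function names are hypothetical, and estimating the mean and variance from the whole array is a simplifying assumption.

```python
import math


def autocorr(x, max_lag):
    """Sample autocorrelations for lags 1..max_lag."""
    n = len(x)
    mu = sum(x) / n
    var = sum((v - mu) ** 2 for v in x) / n
    return [
        sum((x[i] - mu) * (x[i + k] - mu) for i in range(n - k)) / (n * var)
        for k in range(1, max_lag + 1)
    ]


def moving_average_z(x, w, rho=None):
    """Standardized moving-average statistic over windows of w probes.

    rho holds autocorrelations for lags 1..w-1; None assumes independent
    probes, which is the assumption the abstract warns against.
    """
    n = len(x)
    mu = sum(x) / n
    sigma2 = sum((v - mu) ** 2 for v in x) / n
    if rho is None:
        rho = [0.0] * (w - 1)
    # Variance inflation of the window mean caused by correlation.
    inflation = 1 + 2 * sum((1 - k / w) * rho[k - 1] for k in range(1, w))
    if inflation <= 0:
        raise ValueError("correlations imply a non-positive variance")
    sd = math.sqrt(sigma2 / w * inflation)
    return [(sum(x[i:i + w]) / w - mu) / sd for i in range(n - w + 1)]


# Toy log-ratios with one enriched stretch (made-up values).
signal = [0.0, 0.0, 0.0, 3.0, 3.0, 3.0, 0.0, 0.0, 0.0, 0.0]
z_independent = moving_average_z(signal, 3)
z_correlated = moving_average_z(signal, 3, rho=[0.5, 0.25])
```

With positive correlations the corrected statistic is smaller in magnitude than the independence-based one, so a test that ignores correlation overstates significance.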


Subjects
Algorithms , Chromatin Immunoprecipitation/statistics & numerical data , Microarray Analysis/statistics & numerical data , Computational Biology , Statistical Data Interpretation , Markov Chains , Statistical Models , Regression Analysis , Software
...