Results 1 - 12 of 12
1.
NAR Genom Bioinform ; 5(2): lqad054, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37274120

ABSTRACT

Chromatin accessibility assays have revolutionized the field of transcription regulation by providing single-nucleotide resolution measurements of regulatory features such as promoters and transcription factor binding sites. ATAC-seq directly measures how well the Tn5 transposase accesses chromatinized DNA. Tn5 has a complex sequence bias that is not effectively scaled with traditional bias-correction methods. We model this complex bias using a rule ensemble machine learning approach that integrates information from many input k-mers proximal to the ATAC sequence reads. We effectively characterize and correct single-nucleotide sequence biases and regional sequence biases of the Tn5 enzyme. Correction of enzymatic sequence bias is an important step in interpreting chromatin accessibility assays that aim to infer transcription factor binding and regulatory activity of elements in the genome.

3.
Nat Ecol Evol ; 2(3): 537-548, 2018 03.
Article in English | MEDLINE | ID: mdl-29379187

ABSTRACT

How evolutionary changes at enhancers affect the transcription of target genes remains an important open question. Previous comparative studies of gene expression have largely measured the abundance of messenger RNA, which is affected by post-transcriptional regulatory processes, hence limiting inferences about the mechanisms underlying expression differences. Here, we directly measured nascent transcription in primate species, allowing us to separate transcription from post-transcriptional regulation. We used precision run-on and sequencing to map RNA polymerases in resting and activated CD4+ T cells in multiple human, chimpanzee and rhesus macaque individuals, with rodents as outgroups. We observed general conservation in coding and non-coding transcription, punctuated by numerous differences between species, particularly at distal enhancers and non-coding RNAs. Genes regulated by larger numbers of enhancers are more frequently transcribed at evolutionarily stable levels, despite reduced conservation at individual enhancers. Adaptive nucleotide substitutions are associated with lineage-specific transcription and at one locus, SGPP2, we predict and experimentally validate that multiple substitutions contribute to human-specific transcription. Collectively, our findings suggest a pervasive role for evolutionary compensation across ensembles of enhancers that jointly regulate target genes.


Subject(s)
Macaca mulatta/genetics , Pan troglodytes/genetics , Regulatory Elements, Transcriptional , T-Lymphocytes/metabolism , Transcription, Genetic , Animals , Gene Expression , Humans , Macaca mulatta/metabolism , Male , Pan troglodytes/metabolism
4.
Nucleic Acids Res ; 46(2): e9, 2018 01 25.
Article in English | MEDLINE | ID: mdl-29126307

ABSTRACT

Coupling molecular biology to high-throughput sequencing has revolutionized the study of biology. Molecular genomics techniques are continually refined to provide higher resolution mapping of nucleic acid interactions and structure. Sequence preferences of enzymes can interfere with the accurate interpretation of these data. We developed seqOutBias to characterize enzymatic sequence bias from experimental data and scale individual sequence reads to correct intrinsic enzymatic sequence biases. SeqOutBias efficiently corrects DNase-seq, TACh-seq, ATAC-seq, MNase-seq and PRO-seq data. We show that seqOutBias correction facilitates identification of true molecular signatures resulting from transcription factors and RNA polymerase interacting with DNA.


Subject(s)
Algorithms , Computational Biology/methods , DNA/metabolism , Deoxyribonucleases/metabolism , Genomics/methods , High-Throughput Nucleotide Sequencing/methods , Bias , DNA/chemistry , DNA/genetics , DNA-Directed RNA Polymerases/genetics , DNA-Directed RNA Polymerases/metabolism , Deoxyribonucleases/genetics , Protein Binding , Reproducibility of Results , Transcription Factors/genetics , Transcription Factors/metabolism
5.
Bioinformatics ; 32(19): 3024-6, 2016 10 01.
Article in English | MEDLINE | ID: mdl-27288497

ABSTRACT

Transcription factors (TFs) regulate complex programs of gene transcription by binding to short DNA sequence motifs. Here, we introduce rtfbsdb, a unified framework that integrates a database of more than 65 000 TF binding motifs with tools to easily and efficiently scan target genome sequences. Rtfbsdb clusters motifs with similar DNA sequence specificities and integrates RNA-seq or PRO-seq data to restrict analyses to motifs recognized by TFs expressed in the cell type of interest. Our package allows common analyses to be performed rapidly in an integrated environment. AVAILABILITY AND IMPLEMENTATION: rtfbsdb is available at (https://github.com/Danko-Lab/rtfbs_db). CONTACT: dankoc@gmail.com. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Binding Sites , Transcription Factors , Animals , Computational Biology , Genome , Humans , Nucleotide Motifs , Protein Binding , Transcription Factors/chemistry
6.
PLoS Genet ; 11(3): e1005108, 2015 Mar.
Article in English | MEDLINE | ID: mdl-25815464

ABSTRACT

Previous studies have shown that GAGA Factor (GAF) is enriched on promoters with paused RNA Polymerase II (Pol II), but its genome-wide function and mechanism of action remain largely uncharacterized. We assayed the levels of transcriptionally-engaged polymerase using global run-on sequencing (GRO-seq) in control and GAF-RNAi Drosophila S2 cells and found promoter-proximal polymerase was significantly reduced on a large subset of paused promoters where GAF occupancy was reduced by knock down. These promoters show a dramatic increase in nucleosome occupancy upon GAF depletion. These results, in conjunction with previous studies showing that GAF directly interacts with nucleosome remodelers, strongly support a model where GAF directs nucleosome displacement at the promoter and thereby allows the entry of Pol II to the promoter and pause sites. This action of GAF on nucleosomes is at least partially independent of paused Pol II because intergenic GAF binding sites with little or no Pol II also show GAF-dependent nucleosome displacement. In addition, the insulator factor BEAF, the BEAF-interacting protein Chriz, and the transcription factor M1BP are strikingly enriched on those GAF-associated genes where pausing is unaffected by knock down, suggesting insulators or the alternative promoter-associated factor M1BP protect a subset of GAF-bound paused genes from GAF knock-down effects. Thus, GAF binding at promoters can lead to the local displacement of nucleosomes, but this activity can be restricted or compensated for when insulator protein or M1BP complexes also reside at GAF-bound promoters.


Subject(s)
DNA-Binding Proteins/genetics , Drosophila Proteins/genetics , RNA Polymerase II/genetics , Transcription Factors/genetics , Transcription, Genetic , Animals , Binding Sites , DNA-Binding Proteins/metabolism , Drosophila Proteins/metabolism , Drosophila melanogaster , Eye Proteins/genetics , Eye Proteins/metabolism , Gene Knockdown Techniques , Nucleosomes/genetics , Nucleosomes/metabolism , Promoter Regions, Genetic , RNA Polymerase II/metabolism , Transcription Factors/metabolism
7.
Nat Methods ; 12(5): 433-8, 2015 May.
Article in English | MEDLINE | ID: mdl-25799441

ABSTRACT

Modifications to the global run-on and sequencing (GRO-seq) protocol that enrich for 5'-capped RNAs can be used to reveal active transcriptional regulatory elements (TREs) with high accuracy. Here, we introduce discriminative regulatory-element detection from GRO-seq (dREG), a sensitive machine learning method that uses support vector regression to identify active TREs from GRO-seq data without requiring cap-based enrichment (https://github.com/Danko-Lab/dREG/). This approach allows TREs to be assayed together with gene expression levels and other transcriptional features in a single experiment. Predicted TREs are more enriched for several marks of transcriptional activation, including expression quantitative trait loci, disease-associated polymorphisms, acetylated histone 3 lysine 27 (H3K27ac) and transcription factor binding, than those identified by alternative functional assays. Using dREG, we surveyed TREs in eight human cell types and provide new insights into global patterns of TRE function.


Subject(s)
Artificial Intelligence , Gene Expression Regulation/physiology , Regulatory Elements, Transcriptional/physiology , Cell Line , Genome-Wide Association Study , Histones , Humans , K562 Cells , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Regulatory Elements, Transcriptional/genetics , Software
8.
Nat Genet ; 46(12): 1311-20, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25383968

ABSTRACT

Despite the conventional distinction between them, promoters and enhancers share many features in mammals, including divergent transcription and similar modes of transcription factor binding. Here we examine the architecture of transcription initiation through comprehensive mapping of transcription start sites (TSSs) in human lymphoblastoid B cell (GM12878) and chronic myelogenous leukemic (K562) ENCODE Tier 1 cell lines. Using a nuclear run-on protocol called GRO-cap, which captures TSSs for both stable and unstable transcripts, we conduct detailed comparisons of thousands of promoters and enhancers in human cells. These analyses identify a common architecture of initiation, including tightly spaced (110 bp apart) divergent initiation, similar frequencies of core promoter sequence elements, highly positioned flanking nucleosomes and two modes of transcription factor binding. Post-initiation transcript stability provides a more fundamental distinction between promoters and enhancers than patterns of histone modification and association of transcription factors or co-activators. These results support a unified model of transcription initiation at promoters and enhancers.


Subject(s)
Enhancer Elements, Genetic , Promoter Regions, Genetic , RNA/genetics , B-Lymphocytes/cytology , Binding Sites , Chromatin/chemistry , Histones/chemistry , Humans , K562 Cells , Markov Chains , Models, Genetic , Nucleosomes/chemistry , RNA Splicing , Regulatory Sequences, Nucleic Acid , Transcription Initiation Site , Transcription, Genetic
9.
PLoS One ; 8(10): e77175, 2013.
Article in English | MEDLINE | ID: mdl-24194868

ABSTRACT

To gain insights into evolutionary forces that have shaped the history of Bornean and Sumatran populations of orang-utans, we compare patterns of variation across more than 11 million single nucleotide polymorphisms found by previous mitochondrial and autosomal genome sequencing of 10 wild-caught orang-utans. Our analysis of the mitochondrial data yields a far more ancient split time between the two populations (~3.4 million years ago) than estimates based on autosomal data (0.4 million years ago), suggesting a complex speciation process with moderate levels of primarily male migration. We find that the distribution of selection coefficients consistent with the observed frequency spectrum of autosomal non-synonymous polymorphisms in orang-utans is similar to the distribution in humans. Our analysis indicates that 35% of genes have evolved under detectable negative selection. Overall, our findings suggest that purifying natural selection, genetic drift, and a complex demographic history are the dominant drivers of genome evolution for the two orang-utan populations.


Subject(s)
Evolution, Molecular , Genetic Drift , Genetic Speciation , Genetic Variation , Genetics, Population , Pongo/genetics , Selection, Genetic , Animal Migration , Animals , Base Sequence , Bayes Theorem , Borneo , Indonesia , Male , Models, Genetic , Molecular Sequence Annotation , Molecular Sequence Data , Phylogeny , Polymorphism, Single Nucleotide/genetics , Sequence Analysis, DNA , Species Specificity
10.
Mol Cell ; 50(2): 212-22, 2013 Apr 25.
Article in English | MEDLINE | ID: mdl-23523369

ABSTRACT

RNA polymerase II (Pol II) transcribes hundreds of kilobases of DNA, limiting the production of mRNAs and lncRNAs. We used global run-on sequencing (GRO-seq) to measure the rates of transcription by Pol II following gene activation. Elongation rates vary as much as 4-fold at different genomic loci and in response to two distinct cellular signaling pathways (i.e., 17β-estradiol [E2] and TNF-α). The rates are slowest near the promoter and increase during the first ~15 kb transcribed. Gene body elongation rates correlate with Pol II density, resulting in systematically higher rates of transcript production at genes with higher Pol II density. Pol II dynamics following short inductions indicate that E2 stimulates gene expression by increasing Pol II initiation, whereas TNF-α reduces Pol II residence time at pause sites. Collectively, our results identify previously uncharacterized variation in the rate of transcription and highlight elongation as an important, variable, and regulated rate-limiting step during transcription.


Subject(s)
RNA Polymerase II/metabolism , RNA, Messenger/biosynthesis , Signal Transduction , Transcription Initiation, Genetic , Estradiol/pharmacology , Estradiol/physiology , Humans , Kinetics , MCF-7 Cells , Promoter Regions, Genetic , RNA Polymerase II/physiology , RNA, Messenger/genetics , Transcription Factors/metabolism , Transcription Initiation Site , Transcription, Genetic , Transcriptional Activation , Transcriptome , Tumor Necrosis Factor-alpha/pharmacology , Tumor Necrosis Factor-alpha/physiology
11.
PLoS Genet ; 8(3): e1002610, 2012.
Article in English | MEDLINE | ID: mdl-22479205

ABSTRACT

DNA sequence and local chromatin landscape act jointly to determine transcription factor (TF) binding intensity profiles. To disentangle these influences, we developed an experimental approach, called protein/DNA binding followed by high-throughput sequencing (PB-seq), that allows the binding energy landscape to be characterized genome-wide in the absence of chromatin. We applied our methods to the Drosophila Heat Shock Factor (HSF), which inducibly binds a target DNA sequence element (HSE) following heat shock stress. PB-seq involves incubating sheared naked genomic DNA with recombinant HSF, partitioning the HSF-bound and HSF-free DNA, and then detecting HSF-bound DNA by high-throughput sequencing. We compared PB-seq binding profiles with ones observed in vivo by ChIP-seq and developed statistical models to predict the observed departures from idealized binding patterns based on covariates describing the local chromatin environment. We found that DNase I hypersensitivity and tetra-acetylation of H4 were the most influential covariates in predicting changes in HSF binding affinity. We also investigated the extent to which DNA accessibility, as measured by digital DNase I footprinting data, could be predicted from MNase-seq data and the ChIP-chip profiles for many histone modifications and TFs, and found GAGA element associated factor (GAF), tetra-acetylation of H4, and H4K16 acetylation to be the most predictive covariates. Lastly, we generated an unbiased model of HSF binding sequences, which revealed distinct biophysical properties of the HSF/HSE interaction and a previously unrecognized substructure within the HSE. These findings provide new insights into the interplay between the genomic sequence and the chromatin landscape in determining transcription factor binding intensity.


Subject(s)
Chromatin , DNA-Binding Proteins , Drosophila Proteins , Drosophila melanogaster , Transcription Factors/genetics , Acetylation , Animals , Binding Sites/genetics , Chromatin/genetics , DNA-Binding Proteins/genetics , DNA-Binding Proteins/metabolism , Deoxyribonuclease I/genetics , Drosophila Proteins/genetics , Drosophila Proteins/metabolism , Drosophila melanogaster/genetics , Gene Expression Regulation , Genome, Insect , Heat Shock Transcription Factors , Heat-Shock Response/genetics , High-Throughput Nucleotide Sequencing , Histones/genetics , Histones/metabolism , Transcription Factors/metabolism , Transcriptional Activation/genetics
12.
Nature ; 478(7370): 476-82, 2011 Oct 12.
Article in English | MEDLINE | ID: mdl-21993624

ABSTRACT

The comparison of related genomes has emerged as a powerful lens for genome interpretation. Here we report the sequencing and comparative analysis of 29 eutherian genomes. We confirm that at least 5.5% of the human genome has undergone purifying selection, and locate constrained elements covering ∼4.2% of the genome. We use evolutionary signatures and comparisons with experimental data sets to suggest candidate functions for ∼60% of constrained bases. These elements reveal a small number of new coding exons, candidate stop codon readthrough events and over 10,000 regions of overlapping synonymous constraint within protein-coding exons. We find 220 candidate RNA structural families, and nearly a million elements overlapping potential promoter, enhancer and insulator regions. We report specific amino acid residues that have undergone positive selection, 280,000 non-coding elements exapted from mobile elements and more than 1,000 primate- and human-accelerated elements. Overlap with disease-associated variants indicates that our findings will be relevant for studies of human biology, health and disease.


Subject(s)
Evolution, Molecular , Genome, Human/genetics , Genome/genetics , Mammals/genetics , Animals , Disease , Exons/genetics , Genomics , Health , Humans , Molecular Sequence Annotation , Phylogeny , RNA/classification , RNA/genetics , Selection, Genetic/genetics , Sequence Alignment , Sequence Analysis, DNA