Results 1 - 7 of 7
1.
Nat Methods ; 14(4): 417-419, 2017 Apr.
Article in English | MEDLINE | ID: mdl-28263959

ABSTRACT

We introduce Salmon, a lightweight method for quantifying transcript abundance from RNA-seq reads. Salmon combines a new dual-phase parallel inference algorithm and feature-rich bias models with an ultra-fast read mapping procedure. It is the first transcriptome-wide quantifier to correct for fragment GC-content bias, which, as we demonstrate here, substantially improves the accuracy of abundance estimates and the sensitivity of subsequent differential expression analysis.


Subject(s)
Algorithms , Sequence Analysis, RNA/methods , Base Composition , Bayes Theorem , Gene Expression Profiling/methods , Gene Expression Profiling/statistics & numerical data , Sequence Analysis, RNA/statistics & numerical data
2.
J Comput Biol ; 23(6): 425-38, 2016 Jun.
Article in English | MEDLINE | ID: mdl-27267775

ABSTRACT

Chromosome conformation capture (3C) experiments provide a window into the spatial packing of a genome in three dimensions within the cell. This structure has been shown to be correlated with gene regulation, cancer mutations, and other genomic functions. However, 3C provides mixed measurements on a population of typically millions of cells, each with a different genome structure due to the fluidity of the genome and differing cell states. Here, we present several algorithms to deconvolve these measured 3C matrices into estimations of the contact matrices for each subpopulation of cells and relative densities of each subpopulation. We formulate the problem as that of choosing matrices and densities that minimize the Frobenius distance between the observed 3C matrix and the weighted sum of the estimated subpopulation matrices. Results on HeLa 5C and mouse and bacteria Hi-C data demonstrate the methods' effectiveness. We also show that domain boundaries from deconvolved matrices are often more enriched or depleted for regulatory chromatin markers when compared to boundaries from convolved matrices.
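The objective described in this abstract can be illustrated with a minimal sketch. It only evaluates the Frobenius distance between an observed matrix and a density-weighted sum of candidate subpopulation matrices; it does not perform the optimization itself. The function name `deconvolution_error` and the toy matrices are assumptions made for illustration and do not come from the paper.

```python
import numpy as np


def deconvolution_error(observed, sub_matrices, densities):
    """Frobenius distance between the observed matrix and the weighted mixture.

    Hypothetical helper: `sub_matrices` are candidate contact matrices for each
    subpopulation and `densities` are their relative weights.
    """
    mixture = sum(w * m for w, m in zip(densities, sub_matrices))
    return float(np.linalg.norm(observed - mixture, ord="fro"))


# Toy data (made up): two subpopulations mixed 70/30.
a = np.array([[0.0, 1.0], [1.0, 0.0]])
b = np.zeros((2, 2))
observed = 0.7 * a + 0.3 * b

exact = deconvolution_error(observed, [a, b], [0.7, 0.3])  # correct densities
wrong = deconvolution_error(observed, [a, b], [0.5, 0.5])  # mis-estimated densities
```

A deconvolution method in this spirit would search over both the matrices and the densities to drive this value down; the sketch only scores one candidate solution.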


Subject(s)
Bacteria/genetics , Chromatin/genetics , Computational Biology/methods , Algorithms , Animals , Chromosomes/genetics , HeLa Cells , Humans , Mice
3.
Algorithms Mol Biol ; 9: 14, 2014.
Article in English | MEDLINE | ID: mdl-24868242

ABSTRACT

Chromosome conformation capture experiments have led to the discovery of dense, contiguous, megabase-sized topological domains that are similar across cell types and conserved across species. These domains are strongly correlated with a number of chromatin markers and have since been included in a number of analyses. However, functionally-relevant domains may exist at multiple length scales. We introduce a new and efficient algorithm that is able to capture persistent domains across various resolutions by adjusting a single scale parameter. The ensemble of domains we identify allows us to quantify the degree to which the domain structure is hierarchical as opposed to overlapping, and our analysis reveals a pronounced hierarchical structure in which larger stable domains tend to completely contain smaller domains. The identified novel domains are substantially different from domains reported previously and are highly enriched for insulating factor CTCF binding and histone marks at the boundaries.

4.
Nucleic Acids Res ; 42(1): 87-96, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24089144

ABSTRACT

Distal expression quantitative trait loci (distal eQTLs) are genetic mutations that affect the expression of genes genomically far away. However, the mechanisms that cause a distal eQTL to modulate gene expression are not yet clear. Recent high-resolution chromosome conformation capture experiments along with a growing database of eQTLs provide an opportunity to understand the spatial mechanisms influencing distal eQTL associations on a genome-wide scale. We test the hypothesis that spatial proximity contributes to eQTL-gene regulation in the context of the higher-order domain structure of chromatin as determined from recent Hi-C chromosome conformation experiments. This analysis suggests that the large-scale topology of chromatin is coupled with eQTL associations by providing evidence that eQTLs are in general spatially close to their target genes, occur often around topological domain boundaries and preferentially associate with genes across domains. We also find that within-domain eQTLs that overlap with regulatory elements such as promoters and enhancers are spatially closer than the overall set of within-domain eQTLs, suggesting that spatial proximity derived from the domain structure in chromatin plays an important role in the regulation of gene expression.


Subject(s)
Chromatin/chemistry , Gene Expression , Quantitative Trait Loci , Cell Line , Humans , Regulatory Sequences, Nucleic Acid
5.
Algorithms Mol Biol ; 8(1): 8, 2013 Mar 09.
Article in English | MEDLINE | ID: mdl-23497444

ABSTRACT

BACKGROUND: Chromosome structure is closely related to its function and Chromosome Conformation Capture (3C) is a widely used technique for exploring spatial properties of chromosomes. 3C interaction frequencies are usually associated with spatial distances. However, the raw data from 3C experiments is an aggregation of interactions from many cells, and the spatial distances of any given interaction are uncertain. RESULTS: We introduce a new method for filtering 3C interactions that selects subsets of interactions that obey metric constraints of various strictness. We demonstrate that, although the problem is computationally hard, near-optimal results are often attainable in practice using well-designed heuristics and approximation algorithms. Further, we show that, compared with a standard technique, this metric filtering approach leads to (a) subgraphs with higher statistical significance, (b) lower embedding error, (c) lower sensitivity to initial conditions of the embedding algorithm, and (d) structures with better agreement with light microscopy measurements. Our filtering scheme is applicable for a strict frequency-to-distance mapping and a more relaxed mapping from frequency to a range of distances. CONCLUSIONS: Our filtering method for 3C data considers both metric consistency and statistical confidence simultaneously resulting in lower-error embeddings that are biologically more plausible.
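The metric constraint mentioned in this abstract can be made concrete with a small check for triangle-inequality violations. This is only a sketch of the constraint, not the paper's filtering method, which selects subsets of interactions and also weighs statistical confidence. It assumes distances have already been derived from interaction frequencies by some mapping; the function name and the toy distances are hypothetical.

```python
from itertools import combinations


def triangle_violations(dist):
    """Return node triples whose pairwise distances break the triangle inequality.

    `dist` maps frozenset({u, v}) -> distance (an assumed input format).
    Triples with a missing pair are skipped.
    """
    nodes = sorted({n for pair in dist for n in pair})
    bad = []
    for i, j, k in combinations(nodes, 3):
        try:
            sides = sorted(
                [dist[frozenset((i, j))], dist[frozenset((j, k))], dist[frozenset((i, k))]]
            )
        except KeyError:
            continue  # not all three interactions were measured
        if sides[2] > sides[0] + sides[1]:
            bad.append((i, j, k))
    return bad


# Toy distances (made up): a-c is too long to be consistent with a-b and b-c.
inconsistent = {
    frozenset(("a", "b")): 1.0,
    frozenset(("b", "c")): 1.0,
    frozenset(("a", "c")): 3.0,
}
consistent = dict(inconsistent)
consistent[frozenset(("a", "c"))] = 2.0
```

Here `triangle_violations(inconsistent)` flags the triple `("a", "b", "c")`, while the consistent set yields no violations.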

6.
BMC Bioinformatics ; 13: 241, 2012 Sep 21.
Article in English | MEDLINE | ID: mdl-22998471

ABSTRACT

BACKGROUND: Chromosome conformation capture experiments result in pairwise proximity measurements between chromosome locations in a genome, and they have been used to construct three-dimensional models of genomic regions, chromosomes, and entire genomes. These models can be used to understand long-range gene regulation, chromosome rearrangements, and the relationships between sequence and spatial location. However, it is unclear whether these pairwise distance constraints provide sufficient information to embed chromatin in three dimensions. A priori, it is possible that an infinite number of embeddings are consistent with the measurements due to a lack of constraints between some regions. It is therefore necessary to separate regions of the chromatin structure that are sufficiently constrained from regions with measurements that do not provide enough information to reconstruct the embedding. RESULTS: We present a new method based on graph rigidity to assess the suitability of experiments for constructing plausible three-dimensional models of chromatin structure. Underlying this analysis is a new, efficient, and accurate algorithm for finding sufficiently constrained (rigid) collections of constraints in three dimensions, a problem for which there is no known efficient algorithm. Applying the method to four recent chromosome conformation experiments, we find that, for even stringently filtered constraints, a large rigid component spans most of the measured region. Filtering highlights higher-confidence regions, and we find that the organization of these regions depends crucially on short-range interactions. CONCLUSIONS: Without performing an embedding or creating a frequency-to-distance mapping, our proposed approach establishes which substructures are supported by a sufficient framework of interactions. It also establishes that interactions from recent highly filtered genome-wide chromosome conformation experiments provide an adequate set of constraints for embedding. Pre-processing experimentally observed interactions with this method before relating chromatin structure to biological phenomena will ensure that hypothesized correlations are not driven by the arbitrary choice of a particular unconstrained embedding. The software for identifying rigid components is GPL-Licensed and available for download at http://cbcb.umd.edu/kingsford-group/starfish.
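To give a feel for what "rigid" means for a set of pairwise constraints, the sketch below applies the standard randomized rank test for infinitesimal rigidity in three dimensions: place the vertices at random positions and check whether the rigidity matrix has rank 3n - 6. This is a textbook test swapped in for illustration, not the algorithm from the paper, and it only tests a whole graph rather than finding rigid components. The function name is an assumption, and the test assumes at least three vertices.

```python
import numpy as np


def is_generically_rigid_3d(n, edges, seed=0):
    """Randomized rank test for rigidity of a graph in 3D (assumes n >= 3).

    Vertices 0..n-1 are placed at random positions, which are generic with
    probability 1. The graph is reported rigid when the rigidity matrix
    has rank 3n - 6.
    """
    rng = np.random.default_rng(seed)
    pos = rng.standard_normal((n, 3))
    rigidity = np.zeros((len(edges), 3 * n))
    for row, (u, v) in enumerate(edges):
        diff = pos[u] - pos[v]
        rigidity[row, 3 * u:3 * u + 3] = diff
        rigidity[row, 3 * v:3 * v + 3] = -diff
    return bool(np.linalg.matrix_rank(rigidity) == 3 * n - 6)


# A tetrahedron (all 6 pairs constrained) is rigid; a 4-cycle is flexible.
tetrahedron = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
four_cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
```

With these inputs, `is_generically_rigid_3d(4, tetrahedron)` returns `True` and `is_generically_rigid_3d(4, four_cycle)` returns `False`, since four constraints cannot reach the required rank of six.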


Subject(s)
Chromosomes/chemistry , Models, Molecular , Nucleic Acid Conformation , Algorithms , Chromatin/chemistry , Humans
7.
PLoS One ; 7(2): e31969, 2012.
Article in English | MEDLINE | ID: mdl-22393375

ABSTRACT

Various methods of reconstructing transcriptional regulatory networks infer transcriptional regulatory interactions (TRIs) between strongly coexpressed gene pairs (as determined from microarray experiments measuring mRNA levels). Alternatively, however, the coexpression of two genes might imply that they are coregulated by one or more transcription factors (TFs), and do not necessarily share a direct regulatory interaction. We explore whether and under what circumstances gene pairs with a high degree of coexpression are more likely to indicate TRIs, coregulation or both. Here we use established TRIs in combination with microarray expression data from both Escherichia coli (a prokaryote) and Saccharomyces cerevisiae (a eukaryote) to assess the accuracy of predictions of coregulated gene pairs and TRIs from coexpressed gene pairs. We find that coexpressed gene pairs are more likely to indicate coregulation than TRIs for Saccharomyces cerevisiae, but the incidence of TRIs in highly coexpressed gene pairs is higher for Escherichia coli. The data processing inequality (DPI) has previously been applied for the inference of TRIs. We consider the case where a transcription factor gene is known to regulate two genes (one of which is a transcription factor gene) that are known not to regulate one another. According to the DPI, the non-interacting gene pairs should have the smallest mutual information among all pairs in the triplets. While this is sometimes the case for Escherichia coli, we find that it is almost always not the case for Saccharomyces cerevisiae. This brings into question the usefulness of the DPI sometimes employed to infer TRIs from expression data. Finally, we observe that when a TF gene is known to regulate two other genes, it is rarely the case that one regulatory interaction is positively correlated and the other interaction is negatively correlated. Typically both are either positively or negatively correlated.
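The data processing inequality check described in this abstract can be sketched as follows: for a transcription factor gene and two genes it regulates, compare the mutual information of the co-regulated pair against the two regulatory pairs. The sketch assumes expression values have already been discretized into bins and uses a simple plug-in estimate; the function names and toy profiles are hypothetical and are not the paper's estimator or data.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y])) for (x, y), c in pxy.items()
    )


def dpi_holds(tf, g1, g2):
    """True if the co-regulated pair (g1, g2) has the smallest MI in the triplet."""
    indirect = mutual_information(g1, g2)
    return indirect <= min(mutual_information(tf, g1), mutual_information(tf, g2))


# Toy discretized profiles (made up): 0 = low expression, 1 = high expression.
tf = [0, 0, 1, 1, 0, 1, 0, 1]
noisy = [0, 0, 1, 1, 0, 1, 1, 0]  # follows tf except in the last two samples

follows_dpi = dpi_holds(tf, list(tf), noisy)  # one target tracks the TF exactly
breaks_dpi = dpi_holds(tf, noisy, noisy)      # targets agree with each other more than with the TF
```

The second case mirrors the situation the abstract reports as common in Saccharomyces cerevisiae: the two targets share more information with each other than either shares with the regulator, so the pair a DPI-based method would prune is not the non-interacting one.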


Subject(s)
Escherichia coli/genetics , Gene Expression Regulation, Bacterial , Gene Expression Regulation, Fungal , Gene Regulatory Networks , Saccharomyces cerevisiae/genetics , Amino Acid Motifs , Computational Biology/methods , Databases, Genetic , Gene Expression Profiling/methods , Genes, Bacterial , Genes, Fungal , Models, Genetic , Models, Statistical , Oligonucleotide Array Sequence Analysis , RNA, Messenger/metabolism , Species Specificity , Transcription Factors/metabolism