1.
Nat Methods ; 17(3): 319-327, 2020 03.
Article in English | MEDLINE | ID: mdl-32042188

ABSTRACT

Mapping open chromatin regions has emerged as a widely used tool for identifying active regulatory elements in eukaryotes. However, existing approaches, limited by reliance on DNA fragmentation and short-read sequencing, cannot provide information about large-scale chromatin states or reveal coordination between the states of distal regulatory elements. We have developed a method for profiling the accessibility of individual chromatin fibers, a single-molecule long-read accessible chromatin mapping sequencing assay (SMAC-seq), enabling the simultaneous, high-resolution, single-molecule assessment of chromatin states at multikilobase length scales. Our strategy is based on combining the preferential methylation of open chromatin regions by DNA methyltransferases with low sequence specificity, in this case EcoGII, an N6-methyladenosine (m6A) methyltransferase, and the ability of nanopore sequencing to directly read DNA modifications. We demonstrate that aggregate SMAC-seq signals match bulk-level accessibility measurements, observe single-molecule nucleosome and transcription factor protection footprints, and quantify the correlation between chromatin states of distal genomic elements.
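The two analyses named at the end of this abstract, aggregating single-molecule signals into a bulk-like profile and correlating the states of distal elements, can be illustrated with a minimal Python sketch. It assumes each molecule has already been reduced to a list of 0/1 methylation calls over the same positions (1 = methylated, read as accessible). The function names, the 0.5 window threshold, and the use of a Pearson correlation are illustrative assumptions, not the published SMAC-seq pipeline.

```python
from math import sqrt

def aggregate_accessibility(molecules):
    """Fraction of molecules called accessible (1) at each position."""
    n = len(molecules)
    width = len(molecules[0])
    return [sum(m[i] for m in molecules) / n for i in range(width)]

def element_state(molecule, start, end, threshold=0.5):
    """Call an element open (1) if mean methylation in [start, end) exceeds threshold.

    The threshold is a hypothetical choice made for this sketch.
    """
    window = molecule[start:end]
    return 1 if sum(window) / len(window) > threshold else 0

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (nan if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / sqrt(vx * vy)

def coaccessibility(molecules, elem_a, elem_b):
    """Correlate the open/closed states of two elements across single molecules."""
    a = [element_state(m, *elem_a) for m in molecules]
    b = [element_state(m, *elem_b) for m in molecules]
    return pearson(a, b)

# Toy data: four molecules, six positions each (entirely made up).
toy_molecules = [
    [1, 1, 0, 0, 1, 1],
    [1, 1, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
]
print(aggregate_accessibility(toy_molecules))
print(coaccessibility(toy_molecules, (0, 2), (4, 6)))
```

The per-molecule representation is what makes the second function possible: a bulk profile alone cannot say whether two elements are open on the same fiber.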


Subject(s)
Chromatin/chemistry , DNA Fragmentation , Saccharomyces cerevisiae/chemistry , Adenosine/analogs & derivatives , Adenosine/chemistry , Cell Line , Chromatin Immunoprecipitation , CpG Islands , DNA Methylation , High-Throughput Nucleotide Sequencing , Humans , Methylation , Methyltransferases/genetics , Nucleosomes/chemistry , Promoter Regions, Genetic , Protein Binding
2.
Neuron ; 103(3): 412-422.e4, 2019 08 07.
Article in English | MEDLINE | ID: mdl-31221560

ABSTRACT

Selective synaptic and axonal degeneration are critical aspects of both brain development and neurodegenerative disease. Inhibition of caspase signaling in neurons is a potential therapeutic strategy for neurodegenerative disease, but no neuron-specific modulators of caspase signaling have been described. Using a mass spectrometry approach, we discovered that RUFY3, a neuronally enriched protein, is essential for caspase-mediated degeneration of TRKA+ sensory axons in vitro and in vivo. Deletion of Rufy3 protects axons from degeneration, even in the presence of activated CASP3 that is competent to cleave endogenous substrates. Dephosphorylation of RUFY3 at residue S34 appears required for axon degeneration, providing a potential mechanism for neurons to locally control caspase-driven degeneration. Neuronally enriched RUFY3 thus provides an entry point for understanding non-apoptotic functions of CASP3 and a potential target to modulate caspase signaling specifically in neurons for neurodegenerative disease.


Subject(s)
Axons/pathology , Nerve Degeneration/pathology , Nerve Tissue Proteins/physiology , Animals , Axons/enzymology , Caspase 3/physiology , Cells, Cultured , Cytoskeletal Proteins , Enzyme Activation , Ganglia, Spinal/cytology , Ganglia, Spinal/embryology , Mice , Mice, Knockout , Nerve Degeneration/enzymology , Nerve Tissue Proteins/chemistry , Nerve Tissue Proteins/deficiency , Phosphorylation , Protein Processing, Post-Translational , Receptor, trkA/physiology , Sensory Receptor Cells/physiology , Structure-Activity Relationship
4.
Science ; 362(6413), 2018 10 26.
Article in English | MEDLINE | ID: mdl-30361340

ABSTRACT

The spatial organization of chromatin is pivotal for regulating genome functions. We report an imaging method for tracing chromatin organization with kilobase- and nanometer-scale resolution, unveiling chromatin conformation across topologically associating domains (TADs) in thousands of individual cells. Our imaging data revealed TAD-like structures with globular conformation and sharp domain boundaries in single cells. The boundaries varied from cell to cell, occurring with nonzero probabilities at all genomic positions but preferentially at CCCTC-binding factor (CTCF)- and cohesin-binding sites. Notably, cohesin depletion, which abolished TADs at the population-average level, did not diminish TAD-like structures in single cells but eliminated preferential domain boundary positions. Moreover, we observed widespread, cooperative, multiway chromatin interactions, which remained after cohesin depletion. These results provide critical insight into the mechanisms underlying chromatin domain and hub formation.


Subject(s)
Chromatin/chemistry , Single-Cell Analysis/methods , CCCTC-Binding Factor/chemistry , Cell Cycle Proteins/chemistry , Chromatin/ultrastructure , Chromosomal Proteins, Non-Histone/chemistry , Genome, Human , HCT116 Cells , Humans , In Situ Hybridization, Fluorescence , Protein Binding , Protein Domains , Cohesins
5.
Hum Mol Genet ; 27(24): 4194-4203, 2018 12 15.
Article in English | MEDLINE | ID: mdl-30169630

ABSTRACT

Great strides in gene discovery have been made using a multitude of methods to associate phenotypes with genetic variants, but there still remains a substantial gap between observed symptoms and identified genetic defects. Herein, we use the convergence of various genetic and genomic techniques to investigate the underpinnings of a constellation of phenotypes that include prostate cancer (PCa) and sensorineural hearing loss (SNHL) in a human subject. Through interrogation of the subject's de novo, germline, balanced chromosomal translocation, we first identify a correlation between his disorders and a poorly annotated gene known as lipid droplet associated hydrolase (LDAH). Using data repositories of both germline and somatic variants, we identify convergent genomic evidence that substantiates a correlation between loss of LDAH and PCa. This correlation is validated through both in vitro and in vivo models that show loss of LDAH results in increased risk of PCa and, to a lesser extent, SNHL. By leveraging convergent evidence in emerging genomic data, we hypothesize that loss of LDAH is involved in PCa and other phenotypes observed in support of a genotype-phenotype association in an n-of-one human subject.


Subject(s)
Hearing Loss, Sensorineural/genetics , Prostatic Neoplasms/genetics , Serine Proteases/genetics , Translocation, Genetic/genetics , Adult , Aged , Animals , Genome-Wide Association Study , Germ Cells/pathology , Hearing Loss, Sensorineural/pathology , Humans , Male , Mice , Mice, Knockout , Phenotype , Prostatic Neoplasms/pathology
6.
Nucleic Acids Res ; 46(7): e42, 2018 04 20.
Article in English | MEDLINE | ID: mdl-29361139

ABSTRACT

Much of the within-species genetic variation is in the form of single nucleotide polymorphisms (SNPs), typically detected by whole genome sequencing (WGS) or microarray-based technologies. However, WGS produces mostly uninformative reads that perfectly match the reference, while microarrays require genome-specific reagents. We have developed Diff-seq, a sequencing-based mismatch detection assay for SNP discovery without the requirement for specialized nucleic-acid reagents. Diff-seq leverages the Surveyor endonuclease to cleave mismatched DNA molecules that are generated after cross-annealing of a complex pool of DNA fragments. Sequencing libraries enriched for Surveyor-cleaved molecules result in increased coverage at the variant sites. Diff-seq detected all mismatches present in an initial test substrate, with specific enrichment dependent on the identity and context of the variation. Application to viral sequences resulted in increased observation of variant alleles in a biologically relevant context. Diff-seq has the potential to increase the sensitivity and efficiency of high-throughput sequencing in the detection of variation.


Subject(s)
Base Pair Mismatch/genetics , High-Throughput Nucleotide Sequencing/methods , Polymorphism, Single Nucleotide/genetics , Sequence Analysis, DNA/methods , Alleles , DNA Fragmentation , Genome/genetics , Genome, Viral/genetics , HIV/genetics , HIV Integrase/genetics , HIV Reverse Transcriptase/genetics , Humans , Whole Genome Sequencing , pol Gene Products, Human Immunodeficiency Virus/genetics
7.
NPJ Genom Med ; 3: 1, 2018.
Article in English | MEDLINE | ID: mdl-29354286

ABSTRACT

Cancer develops by accumulation of somatic driver mutations, which impact cellular function. Mutations in non-coding regulatory regions can now be studied genome-wide and further characterized by correlation with gene expression and clinical outcome to identify driver candidates. Using a new two-stage procedure, called ncDriver, we first screened 507 ICGC whole-genomes from 10 cancer types for non-coding elements, in which mutations are both recurrent and have elevated conservation or cancer specificity. This identified 160 significant non-coding elements, including the TERT promoter, a well-known non-coding driver element, as well as elements associated with known cancer genes and regulatory genes (e.g., PAX5, TOX3, PCF11, MAPRE3). However, in some significant elements, mutations appear to stem from localized mutational processes rather than recurrent positive selection. To further characterize the driver potential of the identified elements and shortlist candidates, we identified elements where presence of mutations correlated significantly with expression levels (e.g., TERT and CDH10) and survival (e.g., CDH9 and CDH10) in an independent set of 505 TCGA whole-genome samples. In a larger pan-cancer set of 4128 TCGA exomes with expression profiling, we identified mutational correlation with expression for additional elements (e.g., near GATA3, CDC6, ZNF217, and CTCF transcription factor binding sites). Survival analysis further pointed to MIR122, a known marker of poor prognosis in liver cancer. In conclusion, the screen for significant mutation patterns coupled with correlative mutational analysis identified new individual driver candidates and suggests that some non-coding mutations recurrently affect expression and play a role in cancer development.

8.
Nat Methods ; 14(10): 959-962, 2017 Oct.
Article in English | MEDLINE | ID: mdl-28846090

ABSTRACT

We present Omni-ATAC, an improved ATAC-seq protocol for chromatin accessibility profiling that works across multiple applications with substantial improvement of signal-to-background ratio and information content. The Omni-ATAC protocol generates chromatin accessibility profiles from archival frozen tissue samples and 50-µm sections, revealing the activities of disease-associated DNA elements in distinct human brain structures. The Omni-ATAC protocol enables the interrogation of personal regulomes in tissue context and translational studies.


Subject(s)
DNA/genetics , Freezing , Genome , Specimen Handling/methods , Animals , Brain , Cell Line , Erythrocytes , Gene Expression Regulation, Enzymologic , Genome-Wide Association Study , Humans , Keratinocytes , Mice , Self-Sustained Sequence Replication , Thyroid Neoplasms , Transposases/metabolism
9.
Hum Mutat ; 38(9): 1240-1250, 2017 09.
Article in English | MEDLINE | ID: mdl-28220625

ABSTRACT

In many human diseases, associated genetic changes tend to occur within noncoding regions, whose effect might be related to transcriptional control. A central goal in human genetics is to understand the function of such noncoding regions: given a region that is statistically associated with changes in gene expression (expression quantitative trait locus [eQTL]), does it in fact play a regulatory role? And if so, how is this role "coded" in its sequence? These questions were the subject of the Critical Assessment of Genome Interpretation eQTL challenge. Participants were given a set of sequences that flank eQTLs in humans and were asked to predict whether these are capable of regulating transcription (as evaluated by massively parallel reporter assays), and whether this capability changes between alternative alleles. Here, we report lessons learned from this community effort. By inspecting predictive properties in isolation, and conducting meta-analysis over the competing methods, we find that using chromatin accessibility and transcription factor binding as features in an ensemble of classifiers or regression models leads to the most accurate results. We then characterize the loci that are harder to predict, putting the spotlight on areas of weakness, which we expect to be the subject of future studies.


Subject(s)
Computational Biology/methods , Gene Expression , Gene Expression Regulation , Genetic Predisposition to Disease , Humans , Quantitative Trait Loci
10.
Nat Genet ; 48(10): 1260-6, 2016 10.
Article in English | MEDLINE | ID: mdl-27571262

ABSTRACT

Sustained expression of the estrogen receptor-α (ESR1) drives two-thirds of breast cancer and defines the ESR1-positive subtype. ESR1 engages enhancers upon estrogen stimulation to establish an oncogenic expression program. Somatic copy number alterations involving the ESR1 gene occur in approximately 1% of ESR1-positive breast cancers, suggesting that other mechanisms underlie the persistent expression of ESR1. We report significant enrichment of somatic mutations within the set of regulatory elements (SRE) regulating ESR1 in 7% of ESR1-positive breast cancers. These mutations regulate ESR1 expression by modulating transcription factor binding to the DNA. The SRE includes a recurrently mutated enhancer whose activity is also affected by rs9383590, a functional inherited single-nucleotide variant (SNV) that accounts for several breast cancer risk-associated loci. Our work highlights the importance of considering the combinatorial activity of regulatory elements as a single unit to delineate the impact of noncoding genetic alterations on single genes in cancer.


Subject(s)
Breast Neoplasms/genetics , Estrogen Receptor alpha/genetics , Mutation , Polymorphism, Single Nucleotide , CRISPR-Cas Systems , Cell Line, Tumor , Female , Gene Expression Regulation, Neoplastic , Humans , MCF-7 Cells , Regulatory Sequences, Nucleic Acid , Transcription Factors/metabolism
11.
Nat Genet ; 48(4): 374-86, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26928228

ABSTRACT

We analyzed 3,872 common genetic variants across the ESR1 locus (encoding estrogen receptor α) in 118,816 subjects from three international consortia. We found evidence for at least five independent causal variants, each associated with different phenotype sets, including estrogen receptor (ER(+) or ER(-)) and human ERBB2 (HER2(+) or HER2(-)) tumor subtypes, mammographic density and tumor grade. The best candidate causal variants for ER(-) tumors lie in four separate enhancer elements, and their risk alleles reduce expression of ESR1, RMND1 and CCDC170, whereas the risk alleles of the strongest candidates for the remaining independent causal variant disrupt a silencer element and putatively increase ESR1 and RMND1 expression.


Subject(s)
Breast Neoplasms/genetics , Carrier Proteins/genetics , Cell Cycle Proteins/genetics , Chromosomes, Human, Pair 6/genetics , Estrogen Receptor alpha/genetics , Base Sequence , Breast Neoplasms/metabolism , Carrier Proteins/metabolism , Cell Cycle Proteins/metabolism , Estrogen Receptor alpha/metabolism , Female , Gene Expression , Gene Expression Regulation, Neoplastic , Genetic Association Studies , Genetic Predisposition to Disease , Humans , Phenotype , Polymorphism, Single Nucleotide , Protein Binding , Risk Factors
12.
BioData Min ; 5(1): 16, 2012 Oct 01.
Article in English | MEDLINE | ID: mdl-23025260

ABSTRACT

BACKGROUND: Geneticists who look beyond single locus disease associations require additional strategies for the detection of complex multi-locus effects. Epistasis, a multi-locus masking effect, presents a particular challenge, and has been the target of bioinformatic development. Thorough evaluation of new algorithms calls for simulation studies in which known disease models are sought. To date, the best methods for generating simulated multi-locus epistatic models rely on genetic algorithms. However, such methods are computationally expensive, difficult to adapt to multiple objectives, and unlikely to yield models with a precise form of epistasis, which we refer to as pure and strict. Purely and strictly epistatic models constitute the worst case in terms of detecting disease associations, since such associations may only be observed if all n loci are included in the disease model. This makes them an attractive gold standard for simulation studies considering complex multi-locus effects. RESULTS: We introduce GAMETES, a user-friendly software package and algorithm which generates complex biallelic single nucleotide polymorphism (SNP) disease models for simulation studies. GAMETES rapidly and precisely generates random, pure, strict n-locus models with specified genetic constraints. These constraints include heritability, minor allele frequencies of the SNPs, and population prevalence. GAMETES also includes a simple dataset simulation strategy which may be utilized to rapidly generate an archive of simulated datasets for given genetic models. We highlight the utility and limitations of GAMETES with an example simulation study using MDR, an algorithm designed to detect epistasis. CONCLUSIONS: GAMETES is a fast, flexible, and precise tool for generating complex n-locus models with random architectures. While GAMETES has a limited ability to generate models with higher heritabilities, it is proficient at generating the lower heritability models typically used in simulation studies evaluating new algorithms. In addition, the GAMETES modeling strategy may be flexibly combined with any dataset simulation strategy. Beyond dataset simulation, GAMETES could be employed to pursue theoretical characterization of genetic models and epistasis.
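To make the idea of a purely epistatic two-locus model concrete, here is a minimal Python sketch that checks a hand-written penetrance table rather than generating one. It is not the GAMETES algorithm. It assumes Hardy-Weinberg genotype frequencies, reads "pure" as "every single-locus marginal penetrance equals the population prevalence", and uses the heritability formula h² = Σ p_ij (f_ij − K)² / (K(1 − K)); the function names and the toy table are hypothetical.

```python
def genotype_freqs(maf):
    """Hardy-Weinberg frequencies for 0, 1, and 2 copies of the minor allele."""
    p = 1.0 - maf
    return [p * p, 2 * p * maf, maf * maf]

def summarize_model(penetrance, maf_a, maf_b):
    """Return prevalence, both sets of marginal penetrances, and heritability.

    penetrance[i][j] is the disease probability for genotype i at locus A
    and genotype j at locus B.
    """
    fa, fb = genotype_freqs(maf_a), genotype_freqs(maf_b)
    cells = [(i, j) for i in range(3) for j in range(3)]
    prevalence = sum(fa[i] * fb[j] * penetrance[i][j] for i, j in cells)
    marginal_a = [sum(fb[j] * penetrance[i][j] for j in range(3)) for i in range(3)]
    marginal_b = [sum(fa[i] * penetrance[i][j] for i in range(3)) for j in range(3)]
    variance = sum(fa[i] * fb[j] * (penetrance[i][j] - prevalence) ** 2 for i, j in cells)
    heritability = variance / (prevalence * (1.0 - prevalence))
    return prevalence, marginal_a, marginal_b, heritability

def is_pure(penetrance, maf_a, maf_b, tol=1e-9):
    """True if neither locus shows a main effect (all marginals equal prevalence)."""
    prevalence, marginal_a, marginal_b, _ = summarize_model(penetrance, maf_a, maf_b)
    return all(abs(m - prevalence) < tol for m in marginal_a + marginal_b)

# Toy XOR-like table (made up for illustration), both minor allele frequencies 0.5.
toy_table = [
    [0.0, 0.2, 0.0],
    [0.2, 0.0, 0.2],
    [0.0, 0.2, 0.0],
]
prevalence, marg_a, marg_b, h2 = summarize_model(toy_table, 0.5, 0.5)
print(prevalence, marg_a, marg_b, h2, is_pure(toy_table, 0.5, 0.5))
```

In the toy table neither SNP alone changes disease risk, so a single-locus test would see nothing; the association appears only when both loci are considered together.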

13.
BMC Bioinformatics ; 12: 364, 2011 Sep 12.
Article in English | MEDLINE | ID: mdl-21910885

ABSTRACT

BACKGROUND: Epistasis is recognized as ubiquitous in the genetic architecture of complex traits such as disease susceptibility. Experimental studies in model organisms have revealed extensive evidence of biological interactions among genes. Meanwhile, statistical and computational studies in human populations have suggested non-additive effects of genetic variation on complex traits. Although these studies form a baseline for understanding the genetic architecture of complex traits, to date they have only considered interactions among a small number of genetic variants. Our goal here is to use network science to determine the extent to which non-additive interactions exist beyond small subsets of genetic variants. We infer statistical epistasis networks to characterize the global space of pairwise interactions among approximately 1500 Single Nucleotide Polymorphisms (SNPs) spanning nearly 500 cancer susceptibility genes in a large population-based study of bladder cancer. RESULTS: The statistical epistasis network was built by linking pairs of SNPs if their pairwise interactions were stronger than a systematically derived threshold. Its topology clearly differentiated this real-data network from networks obtained from permutations of the same data under the null hypothesis that no association exists between genotype and phenotype. The network had a significantly higher number of hub SNPs and, interestingly, these hub SNPs did not necessarily have high main effects. The network had a largest connected component of 39 SNPs that was absent from all of the permuted-data networks. In addition, the vertex degrees of this network were found to follow an approximate power-law distribution, and its topology appeared scale-free. CONCLUSIONS: In contrast to many existing techniques focusing on high main-effect SNPs or models of several interacting SNPs, our network approach characterized a global picture of gene-gene interactions in a population-based genetic dataset. The network was built using pairwise interactions, and its distinctive network topology and large connected components indicated joint effects in a large set of SNPs. Our observations suggested that this particular statistical epistasis network captured important features of the genetic architecture of bladder cancer that have not been described previously.
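The network construction described in this abstract (link two SNPs when their pairwise interaction exceeds a threshold, then inspect vertex degrees and the largest connected component) can be sketched in a few lines of Python. The SNP identifiers, interaction scores, and threshold below are invented for illustration. The paper's interaction measure, its systematic threshold derivation, and the permutation testing are not reproduced here.

```python
from collections import defaultdict, deque

def build_network(pair_scores, threshold):
    """Link two SNPs when their pairwise interaction score exceeds the threshold."""
    adjacency = defaultdict(set)
    for (snp_a, snp_b), score in pair_scores.items():
        if score > threshold:
            adjacency[snp_a].add(snp_b)
            adjacency[snp_b].add(snp_a)
    return dict(adjacency)

def vertex_degrees(adjacency):
    """Number of interaction partners per SNP."""
    return {snp: len(neighbors) for snp, neighbors in adjacency.items()}

def largest_component(adjacency):
    """Largest connected set of SNPs, found by breadth-first search."""
    seen, best = set(), set()
    for start in adjacency:
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in component:
                    component.add(neighbor)
                    queue.append(neighbor)
        seen |= component
        if len(component) > len(best):
            best = component
    return best

# Hypothetical pairwise interaction scores (made-up SNP names and values).
toy_scores = {
    ("snp1", "snp2"): 0.030,
    ("snp2", "snp3"): 0.025,
    ("snp4", "snp5"): 0.040,
    ("snp1", "snp5"): 0.001,
}
network = build_network(toy_scores, threshold=0.02)
print(vertex_degrees(network))
print(sorted(largest_component(network)))
```

Running the same two summaries on networks built from permuted phenotype labels would give the null comparison the abstract refers to.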


Subject(s)
Epistasis, Genetic , Polymorphism, Single Nucleotide , Urinary Bladder Neoplasms/genetics , Adult , Aged , Genotype , Humans , Middle Aged , New Hampshire
14.
Interdiscip Sci ; 2(3): 213-20, 2010 Sep.
Article in English | MEDLINE | ID: mdl-20658333

ABSTRACT

Advances in the video gaming industry have led to the production of low-cost, high-performance graphics processing units (GPUs) that possess more memory bandwidth and computational capability than central processing units (CPUs), the standard workhorses of scientific computing. With the recent release of general-purpose GPUs and NVIDIA's GPU programming language, CUDA, graphics engines are being adopted widely in scientific computing applications, particularly in the fields of computational biology and bioinformatics. The goal of this article is to concisely present an introduction to GPU hardware and programming, aimed at the computational biologist or bioinformaticist. To this end, we discuss the primary differences between GPU and CPU architecture, introduce the basics of the CUDA programming language, and discuss important CUDA programming practices, such as the proper use of coalesced reads, data types, and memory hierarchies. We highlight each of these topics in the context of computing the all-pairs distance between instances in a dataset, a common procedure in numerous disciplines of scientific computing. We conclude with a runtime analysis of the GPU and CPU implementations of the all-pairs distance calculation. We show our final GPU implementation to outperform the CPU implementation by a factor of 1700.
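For readers unfamiliar with the all-pairs distance calculation used as the running example, a plain CPU reference in Python looks roughly like the sketch below. Euclidean distance is an assumption, since the abstract does not name the metric, and this is not the article's CUDA code. Each (i, j) pair is computed independently of every other pair, which is why the problem maps naturally onto many GPU threads.

```python
from math import sqrt

def all_pairs_distance(instances):
    """Return the symmetric matrix of Euclidean distances between all instances.

    Each (i, j) entry depends only on rows i and j, so on a GPU every pair
    could be handed to its own thread.
    """
    n = len(instances)
    distances = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = sqrt(sum((a - b) ** 2 for a, b in zip(instances[i], instances[j])))
            distances[i][j] = d
            distances[j][i] = d  # the matrix is symmetric
    return distances

# Toy dataset: three 2-D instances (made up for illustration).
toy_instances = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
toy_distances = all_pairs_distance(toy_instances)
for row in toy_distances:
    print(row)
```

The work grows with the square of the number of instances, so the serial double loop is the part a GPU implementation would parallelize.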


Subject(s)
Computational Biology/methods , Computer Graphics , Computers , Software , Language
15.
Bioinformatics ; 26(5): 694-5, 2010 Mar 01.
Article in English | MEDLINE | ID: mdl-20081222

ABSTRACT

MOTIVATION: Epistasis, the presence of gene-gene interactions, has been hypothesized to be at the root of many common human diseases, but current genome-wide association studies largely ignore its role. Multifactor dimensionality reduction (MDR) is a powerful model-free method for detecting epistatic relationships between genes, but computational costs have made its application to genome-wide data difficult. Graphics processing units (GPUs), the hardware responsible for rendering computer games, are powerful parallel processors. Using GPUs to run MDR on a genome-wide dataset allows for statistically rigorous testing of epistasis. RESULTS: The implementation of MDR for GPUs (MDRGPU) includes core features of the widely used Java software package, MDR. This GPU implementation allows for large-scale analysis of epistasis at a dramatically lower cost than the standard CPU-based implementations. As a proof-of-concept, we applied this software to a genome-wide study of sporadic amyotrophic lateral sclerosis (ALS). We discovered a statistically significant two-SNP classifier and subsequently replicated the significance of these two SNPs in an independent study of ALS. MDRGPU makes the large-scale analysis of epistasis tractable and opens the door to statistically rigorous testing of interactions in genome-wide datasets. AVAILABILITY: MDRGPU is open source and available free of charge from http://www.sourceforge.net/projects/mdr.
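The core MDR step for a single pair of SNPs can be sketched in Python as follows: pool individuals by their two-locus genotype, label each genotype cell high risk when its case:control ratio exceeds the overall ratio, and score the resulting classifier. This is a simplified illustration, not MDRGPU or the Java MDR package. It omits cross-validation and the exhaustive search over SNP combinations, the use of balanced accuracy as the score is an assumption, and it assumes both cases and controls are present.

```python
from collections import Counter

def mdr_pair(genotypes_a, genotypes_b, status):
    """Evaluate one SNP pair: return (high-risk cells, balanced accuracy).

    status uses 1 for cases and 0 for controls.
    """
    cases, controls = Counter(), Counter()
    for ga, gb, s in zip(genotypes_a, genotypes_b, status):
        (cases if s == 1 else controls)[(ga, gb)] += 1

    n_cases, n_controls = sum(cases.values()), sum(controls.values())
    overall_ratio = n_cases / n_controls

    # A cell is high risk when its case:control ratio exceeds the overall ratio.
    cells = set(cases) | set(controls)
    high_risk = {c for c in cells if cases[c] > overall_ratio * controls[c]}

    true_pos = sum(cases[c] for c in high_risk)
    true_neg = sum(controls[c] for c in cells if c not in high_risk)
    balanced_accuracy = (true_pos / n_cases + true_neg / n_controls) / 2
    return high_risk, balanced_accuracy

# Toy XOR-like data (made up): disease occurs only when the two genotypes differ.
toy_a      = [0, 0, 1, 1, 0, 0, 1, 1]
toy_b      = [0, 1, 0, 1, 0, 1, 0, 1]
toy_status = [0, 1, 1, 0, 0, 1, 1, 0]
toy_high_risk, toy_accuracy = mdr_pair(toy_a, toy_b, toy_status)
print(sorted(toy_high_risk), toy_accuracy)
```

A genome-wide run repeats this evaluation for every SNP pair, and each evaluation is independent of the others, which is the property that makes the method suitable for parallel hardware.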


Subject(s)
Amyotrophic Lateral Sclerosis/genetics , Epistasis, Genetic , Genome-Wide Association Study/methods , Databases, Genetic , Genome, Human , Genomics/methods , Humans , Polymorphism, Single Nucleotide
16.
BMC Res Notes ; 2: 149, 2009 Jul 24.
Article in English | MEDLINE | ID: mdl-19630950

ABSTRACT

BACKGROUND: Human geneticists are now capable of measuring more than one million DNA sequence variations from across the human genome. The new challenge is to develop computationally feasible methods capable of analyzing these data for associations with common human disease, particularly in the context of epistasis. Epistasis describes the situation where multiple genes interact in a complex non-linear manner to determine an individual's disease risk and is thought to be ubiquitous for common diseases. Multifactor Dimensionality Reduction (MDR) is an algorithm capable of detecting epistasis. An exhaustive analysis with MDR is often computationally expensive, particularly for high order interactions. This challenge has previously been met with parallel computation and expensive hardware. The option we examine here exploits commodity hardware designed for computer graphics. In modern computers, Graphics Processing Units (GPUs) have more memory bandwidth and computational capability than Central Processing Units (CPUs) and are well suited to this problem. Advances in the video game industry have led to an economy of scale, creating a situation where these powerful components are readily available at very low cost. Here we implement and evaluate the performance of the MDR algorithm on GPUs. Of primary interest are the time required for an epistasis analysis and the price-to-performance ratio of available solutions. FINDINGS: We found that using MDR on GPUs consistently increased performance per machine over both a feature-rich Java software package and a C++ cluster implementation. A GPU workstation running a GPU implementation reduces computation time by a factor of 160 compared to an 8-core workstation running the Java implementation on CPUs. This GPU workstation performs similarly to 150 cores running an optimized C++ implementation on a Beowulf cluster. Furthermore, this GPU system provides extremely cost-effective performance while leaving the CPU available for other tasks. The GPU workstation containing three GPUs costs $2000, while obtaining similar performance on a Beowulf cluster requires 150 CPU cores which, including the added infrastructure and support cost of the cluster system, cost approximately $82,500. CONCLUSION: Graphics-hardware-based computing provides a cost-effective means to perform genetic analysis of epistasis using MDR on large datasets without the infrastructure of a computing cluster.
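The cost comparison in the findings reduces to simple arithmetic, worked through in the short Python sketch below. The dollar figures and core count are taken from the abstract. Treating the two systems as delivering equal performance, so that the cost ratio alone measures price-to-performance, is a simplifying assumption based on the abstract's statement that they perform similarly.

```python
# Figures quoted in the abstract.
gpu_workstation_cost = 2000   # USD, workstation with three GPUs
cluster_cost = 82500          # USD, 150-core Beowulf cluster incl. infrastructure and support
cluster_cores = 150

def cost_ratio(reference_cost, alternative_cost):
    """How many times more expensive the reference system is than the alternative."""
    return reference_cost / alternative_cost

ratio = cost_ratio(cluster_cost, gpu_workstation_cost)
cost_per_core = cluster_cost / cluster_cores

print(f"Cluster costs {ratio:.2f}x the GPU workstation for similar performance")
print(f"Cluster cost per CPU core: ${cost_per_core:.0f}")
```

On these numbers the cluster costs roughly 41 times as much as the GPU workstation for comparable MDR throughput.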
