Results 1 - 20 of 52
1.
Cell Rep Methods ; 3(9): 100578, 2023 09 25.
Article in English | MEDLINE | ID: mdl-37673071

ABSTRACT

Regulatory networks containing enhancer-gene edges define cellular states. Multiple efforts have revealed these networks for reference tissues and cell lines by integrating multi-omics data. However, the methods developed cannot be applied for large patient cohorts due to the infeasibility of chromatin immunoprecipitation sequencing (ChIP-seq) for limited biopsy material. We trained machine-learning models using chromatin interaction analysis with paired-end tag sequencing (ChIA-PET) and high-throughput chromosome conformation capture combined with chromatin immunoprecipitation (HiChIP) data that can predict connections using only assay for transposase-accessible chromatin using sequencing (ATAC-seq) and RNA-seq data as input, which can be generated from biopsies. Our method overcomes limitations of correlation-based approaches that cannot distinguish between distinct target genes of given enhancers or between active vs. poised states in different samples, a hallmark of network rewiring in cancer. Application of our model on 371 samples across 22 cancer types revealed 1,780 enhancer-gene connections for 602 cancer genes. Using CRISPR interference (CRISPRi), we validated enhancers predicted to regulate ESR1 in estrogen receptor (ER)+ breast cancer and A1CF in liver hepatocellular carcinoma.


Subject(s)
Chromatin Immunoprecipitation Sequencing , Chromatin , Humans , Chromatin/genetics , Regulatory Sequences, Nucleic Acid , RNA-Seq , Cell Line
2.
Cureus ; 15(6): e40728, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37485185

ABSTRACT

Introduction: Impulsivity (or impulsiveness) and risk-taking behavior are significant concerns as the adolescent population is at a higher risk of injuries and violence, unhealthy sexual behaviors, and drug- and alcohol-related problems. The early identification of these traits in adolescents can prove beneficial through timely interventions. This study was conducted to assess impulsive behavior and risk-taking behavior among school-going adolescents in New Delhi, India, and to study the association, if any, between the two. Methodology: A cross-sectional study was conducted among 571 students of classes 9th-10th in three randomly selected schools in a part of Delhi, India. Barratt Impulsiveness Scale - Brief (BIS-Brief) was used to evaluate impulsivity, and risk-taking behavior was assessed using the RT-18 tool. Results: The majority (72.3%) of the 571 students were aged 14-15 years. Among the students, 56.0% were males. The impulsivity score obtained ranged from 8 to 30, with a mean score of 15.7 (SD ±4.1). The risk-taking score ranged from 2 to 18, with a mean score of 9.9 (SD ±2.9). Impulsivity was seen to be significantly higher among the female students (p=0.004). The risk-taking behavior was significantly higher among the students from government schools, among the females, and among those who used the internet more. There was a significant direct association between impulsivity and risk-taking behavior among the students (correlation coefficient 0.301, p<0.001). Conclusion: The study results showed that the mean impulsivity and risk-taking scores were comparable to other studies in adolescent age groups done internationally using the same tools. Impulsivity and risk-taking behavior were both found to be higher among females. There was a significant direct association between impulsivity and risk-taking.

3.
Sci Adv ; 9(14): eadc9446, 2023 04 05.
Article in English | MEDLINE | ID: mdl-37018402

ABSTRACT

The mechanisms underlying ETS-driven prostate cancer initiation and progression remain poorly understood due to a lack of model systems that recapitulate this phenotype. We generated a genetically engineered mouse with prostate-specific expression of the ETS factor, ETV4, at lower and higher protein dosage through mutation of its degron. Lower-level expression of ETV4 caused mild luminal cell expansion without histologic abnormalities, and higher-level expression of stabilized ETV4 caused prostatic intraepithelial neoplasia (mPIN) with 100% penetrance within 1 week. Tumor progression was limited by p53-mediated senescence, and Trp53 deletion cooperated with stabilized ETV4. The neoplastic cells expressed differentiation markers such as Nkx3.1, recapitulating luminal gene expression features of untreated human prostate cancer. Single-cell and bulk RNA sequencing showed that stabilized ETV4 induced a previously unidentified luminal-derived expression cluster with signatures of cell cycle, senescence, and epithelial-to-mesenchymal transition. These data suggest that ETS overexpression alone, at sufficient dosage, can initiate prostate neoplasia.


Subject(s)
Prostatic Intraepithelial Neoplasia , Prostatic Neoplasms , Male , Mice , Animals , Humans , Prostate/metabolism , Prostate/pathology , Tumor Suppressor Protein p53/metabolism , Prostatic Neoplasms/genetics , Transcription Factors/metabolism , Prostatic Intraepithelial Neoplasia/genetics , Cell Transformation, Neoplastic/genetics , Gene Expression Regulation, Neoplastic , Proto-Oncogene Proteins c-ets/genetics
6.
Nat Commun ; 13(1): 5640, 2022 09 26.
Article in English | MEDLINE | ID: mdl-36163358

ABSTRACT

Structural variations (SVs) in cancer cells often impact large genomic regions with functional consequences. However, identification of SVs under positive selection is a challenging task because little is known about the genomic features related to the background breakpoint distribution in different cancers. We report a method that uses a generalized additive model to investigate the breakpoint proximity curves from 2,382 whole genomes of 32 cancer types. We find that a multivariate model, which includes linear and nonlinear partial contributions of various tissue-specific features and their interaction terms, can explain up to 57% of the observed deviance of breakpoint proximity. In particular, three-dimensional genomic features such as topologically associating domains (TADs), TAD boundaries and their interaction with other features show significant contributions. The model was validated by its identification of known cancer genes, and it revealed putative drivers in cancers different from those with previous evidence of positive selection.


Subject(s)
Chromatin , Neoplasms , Genome , Genomics , Humans , Neoplasms/genetics
7.
J Clin Invest ; 132(17), 2022 09 01.
Article in English | MEDLINE | ID: mdl-35852856

ABSTRACT

Immune checkpoint blockade (ICB) has demonstrated clinical success in "inflamed" tumors with substantial T cell infiltrates, but tumors with an immune-desert tumor microenvironment (TME) fail to benefit. The tumor cell-intrinsic molecular mechanisms of the immune-desert phenotype remain poorly understood. Here, we demonstrated that inactivation of the polycomb-repressive complex 2 (PRC2) core components embryonic ectoderm development (EED) or suppressor of zeste 12 homolog (SUZ12), a prevalent genetic event in malignant peripheral nerve sheath tumors (MPNSTs) and sporadically in other cancers, drove a context-dependent immune-desert TME. PRC2 inactivation reprogrammed the chromatin landscape, which led to a cell-autonomous shift from primed baseline signaling-dependent cellular responses (e.g., IFN-γ signaling) to PRC2-regulated developmental and cellular differentiation transcriptional programs. Further, PRC2 inactivation led to diminished tumor immune infiltrates through reduced chemokine production and impaired antigen presentation and T cell priming, resulting in primary resistance to ICB. Intratumoral delivery of inactivated modified vaccinia virus Ankara (MVA) enhanced tumor immune infiltrates and sensitized PRC2-loss tumors to ICB. Our results identify molecular mechanisms of PRC2 inactivation-mediated, context-dependent epigenetic reprogramming that underlie the immune-desert phenotype in cancer. Our studies also point to intratumoral delivery of immunogenic viruses as an initial therapeutic strategy to modulate the immune-desert TME and capitalize on the clinical benefit of ICB.


Subject(s)
Neoplasms , Viruses , Chromatin , Humans , Polycomb Repressive Complex 2/genetics , Tumor Microenvironment , Viruses/genetics
8.
Science ; 376(6596): eabe1505, 2022 05 27.
Article in English | MEDLINE | ID: mdl-35617398

ABSTRACT

In castration-resistant prostate cancer (CRPC), the loss of androgen receptor (AR) dependence leads to clinically aggressive tumors with few therapeutic options. We used ATAC-seq (assay for transposase-accessible chromatin sequencing), RNA-seq, and DNA sequencing to investigate 22 organoids, six patient-derived xenografts, and 12 cell lines. We identified the well-characterized AR-dependent and neuroendocrine subtypes, as well as two AR-negative/low groups: a Wnt-dependent subtype, and a stem cell-like (SCL) subtype driven by activator protein-1 (AP-1) transcription factors. We used transcriptomic signatures to classify 366 patients, which showed that SCL is the second most common subtype of CRPC after AR-dependent. Our data suggest that AP-1 interacts with the YAP/TAZ and TEAD proteins to maintain subtype-specific chromatin accessibility and transcriptomic landscapes in this group. Together, this molecular classification reveals drug targets and can potentially guide therapeutic decisions.


Subject(s)
Chromatin , Molecular Targeted Therapy , Prostatic Neoplasms, Castration-Resistant , Cell Line, Tumor , Chromatin/genetics , Gene Expression Profiling , Humans , Male , Neoplastic Stem Cells/classification , Neoplastic Stem Cells/metabolism , Organoids/metabolism , Organoids/pathology , Prostatic Neoplasms, Castration-Resistant/classification , Prostatic Neoplasms, Castration-Resistant/drug therapy , Prostatic Neoplasms, Castration-Resistant/genetics , Receptors, Androgen/genetics , Receptors, Androgen/metabolism , Transcription Factor AP-1/genetics , Transcription Factor AP-1/metabolism
9.
Proc Natl Acad Sci U S A ; 118(51), 2021 12 21.
Article in English | MEDLINE | ID: mdl-34916285

ABSTRACT

Spina bifida (SB) is a debilitating birth defect caused by multiple gene and environment interactions. Though SB shows non-Mendelian inheritance, genetic factors contribute to an estimated 70% of cases. Nevertheless, identifying human mutations conferring SB risk is challenging due to its relative rarity, genetic heterogeneity, incomplete penetrance, and environmental influences that hamper genome-wide association study approaches to untargeted discovery. Thus, SB genetic studies may suffer from population substructure and/or selection bias introduced by typical candidate gene searches. We report a population-based, ancestry-matched whole-genome sequence analysis of SB genetic predisposition using a systems biology strategy to interrogate 298 case-control subject genomes (149 pairs). Genes that were enriched in likely gene-disrupting (LGD), rare protein-coding variants were subjected to machine learning analysis to identify genes in which LGD variants occur with a different frequency in cases versus controls and so discriminate between these groups. Those genes with high discriminatory potential for SB significantly enriched pathways pertaining to carbon metabolism, inflammation, innate immunity, cytoskeletal regulation, and essential transcriptional regulation, consistent with their having an impact on the pathogenesis of human SB. Additionally, an interrogation of conserved noncoding sequences identified robust variant enrichment in regulatory regions of several transcription factors critical to embryonic development. This genome-wide perspective offers an effective approach to the interrogation of coding and noncoding sequence variant contributions to rare complex genetic disorders.


Subject(s)
Genome, Human , Spinal Dysraphism/genetics , Case-Control Studies , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Systems Biology , Transcription Factors/genetics
10.
Science ; 373(6559): eabc1048, 2021 Sep 03.
Article in English | MEDLINE | ID: mdl-34516843

ABSTRACT

Oncogenes only transform cells under certain cellular contexts, a phenomenon called oncogenic competence. Using a combination of a human pluripotent stem cell-derived cancer model along with zebrafish transgenesis, we demonstrate that the transforming ability of BRAFV600E along with additional mutations depends on the intrinsic transcriptional program present in the cell of origin. In both systems, melanocytes are less responsive to mutations, whereas both neural crest and melanoblast populations are readily transformed. Profiling reveals that progenitors have higher expression of chromatin-modifying enzymes such as ATAD2, a melanoma competence factor that forms a complex with SOX10 and allows for expression of downstream oncogenic and neural crest programs. These data suggest that oncogenic competence is mediated by regulation of developmental chromatin factors, which then allow for proper response to those oncogenes.


Subject(s)
Carcinogenesis/genetics , Carcinogenesis/pathology , Chromatin/metabolism , Melanoma/genetics , Melanoma/pathology , Neural Crest/pathology , ATPases Associated with Diverse Cellular Activities/genetics , ATPases Associated with Diverse Cellular Activities/metabolism , Animals , Animals, Genetically Modified , Chromatin/genetics , DNA-Binding Proteins/genetics , DNA-Binding Proteins/metabolism , Humans , Melanocytes/metabolism , Melanocytes/pathology , Mice , Neoplasms, Experimental , Neoplastic Stem Cells/pathology , Neural Crest/metabolism , Pluripotent Stem Cells/pathology , Proto-Oncogene Proteins B-raf/genetics , Proto-Oncogene Proteins B-raf/metabolism , SOXE Transcription Factors/genetics , SOXE Transcription Factors/metabolism , Transcription, Genetic , Zebrafish
11.
Nucleic Acids Res ; 49(D1): D1094-D1101, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33095860

ABSTRACT

Most mutations in cancer genomes occur in the non-coding regions with unknown impact on tumor development. Although the increase in the number of cancer whole-genome sequences has revealed numerous putative non-coding cancer drivers, their information is dispersed across multiple studies making it difficult to understand their roles in tumorigenesis of different cancer types. We have developed CNCDatabase, Cornell Non-coding Cancer driver Database (https://cncdatabase.med.cornell.edu/) that contains detailed information about predicted non-coding drivers at gene promoters, 5' and 3' UTRs (untranslated regions), enhancers, CTCF insulators and non-coding RNAs. CNCDatabase documents 1111 protein-coding genes and 90 non-coding RNAs with reported drivers in their non-coding regions from 32 cancer types by computational predictions of positive selection using whole-genome sequences; differential gene expression in samples with and without mutations; or another set of experimental validations including luciferase reporter assays and genome editing. The database can be easily modified and scaled as lists of non-coding drivers are revised in the community with larger whole-genome sequencing studies, CRISPR screens and further experimental validations. Overall, CNCDatabase provides a helpful resource for researchers to explore the pathological role of non-coding alterations in human cancers.


Subject(s)
Carcinogenesis/genetics , Databases, Genetic , Gene Expression Regulation, Neoplastic , Genome, Human , Neoplasms/genetics , 3' Untranslated Regions , 5' Untranslated Regions , Carcinogenesis/metabolism , Carcinogenesis/pathology , Clustered Regularly Interspaced Short Palindromic Repeats , Enhancer Elements, Genetic , Genes, Reporter , Humans , Insulator Elements , Luciferases/genetics , Luciferases/metabolism , Mutation , Neoplasms/metabolism , Neoplasms/pathology , Open Reading Frames , Promoter Regions, Genetic , RNA, Untranslated/classification , RNA, Untranslated/genetics , RNA, Untranslated/metabolism , Untranslated Regions , Whole Genome Sequencing
12.
Cancer Discov ; 10(10): 1590-1609, 2020 10.
Article in English | MEDLINE | ID: mdl-32546576

ABSTRACT

The WNT pathway is a fundamental regulator of intestinal homeostasis, and hyperactivation of WNT signaling is the major oncogenic driver in colorectal cancer. To date, there are no described mechanisms that bypass WNT dependence in intestinal tumors. Here, we show that although WNT suppression blocks tumor growth in most organoid and in vivo colorectal cancer models, the accumulation of colorectal cancer-associated genetic alterations enables drug resistance and WNT-independent growth. In intestinal epithelial cells harboring mutations in KRAS or BRAF, together with disruption of TP53 and SMAD4, transient TGFβ exposure drives YAP/TAZ-dependent transcriptional reprogramming and lineage reversion. Acquisition of embryonic intestinal identity is accompanied by a permanent loss of adult intestinal lineages, and long-term WNT-independent growth. This work identifies genetic and microenvironmental factors that drive WNT inhibitor resistance, defines a new mechanism for WNT-independent colorectal cancer growth, and reveals how integration of associated genetic alterations and extracellular signals can overcome lineage-dependent oncogenic programs. SIGNIFICANCE: Colorectal and intestinal cancers are driven by mutations in the WNT pathway, and drugs aimed at suppressing WNT signaling are in active clinical development. Our study identifies a mechanism of acquired resistance to WNT inhibition and highlights a potential strategy to target those drug-resistant cells. This article is highlighted in the In This Issue feature, p. 1426.


Subject(s)
Intestinal Neoplasms/genetics , Wnt Signaling Pathway/genetics , Animals , Cell Line, Tumor , Humans , Mice
13.
PLoS Genet ; 16(4): e1008663, 2020 04.
Article in English | MEDLINE | ID: mdl-32243438

ABSTRACT

Previous studies have surveyed the potential impact of loss-of-function (LoF) variants and identified LoF-tolerant protein-coding genes. However, the tolerance of human genomes to losing enhancers has not yet been evaluated. Here we present the catalog of LoF-tolerant enhancers using structural variants from whole-genome sequences. Using a conservative approach, we estimate that individual human genomes possess at least 28 LoF-tolerant enhancers on average. We assessed the properties of LoF-tolerant enhancers in a unified regulatory network constructed by integrating tissue-specific enhancers and gene-gene interactions. We find that LoF-tolerant enhancers tend to be more tissue-specific and regulate fewer and more dispensable genes relative to other enhancers. They are enriched in immune-related cells while enhancers with low LoF-tolerance are enriched in kidney and brain/neuronal stem cells. We developed a supervised learning approach to predict the LoF-tolerance of all enhancers, which achieved an area under the receiver operating characteristic curve (AUROC) of 98%. We predict that 3,519 more enhancers are likely tolerant to LoF and that 129 enhancers have low LoF-tolerance. Our predictions are supported by a known set of disease enhancers and novel deletions from PacBio sequencing. The LoF-tolerance scores provided here will serve as an important reference for disease studies.


Subject(s)
Enhancer Elements, Genetic/genetics , Genome, Human/genetics , Loss of Function Mutation , Conserved Sequence , Disease/genetics , Gene Expression Regulation , Genetic Predisposition to Disease , Humans , Organ Specificity/genetics , ROC Curve , Reproducibility of Results , Supervised Machine Learning
14.
Genome Biol ; 21(1): 79, 2020 03 26.
Article in English | MEDLINE | ID: mdl-32216817

ABSTRACT

Non-coding variants have been shown to be related to disease by alteration of 3D genome structures. We propose a deep learning method, DeepMILO, to predict the effects of variants on CTCF/cohesin-mediated insulator loops. Application of DeepMILO to variants from whole-genome sequences of 1834 patients of twelve cancer types revealed 672 insulator loops disrupted in at least 10% of patients. Our results show that mutations at loop anchors are associated with upregulation of the cancer driver genes BCL2 and MYC in malignant lymphoma, thus pointing to a possible new mechanism for their dysregulation via alteration of insulator loops.


Subject(s)
Chromatin/chemistry , Deep Learning , Insulator Elements , Neoplasms/genetics , CCCTC-Binding Factor/metabolism , Cell Cycle Proteins/metabolism , Cell Line, Tumor , Chromosomal Proteins, Non-Histone/metabolism , Humans , Mutation , Whole Genome Sequencing , Cohesins
15.
Nat Commun ; 11(1): 729, 2020 02 05.
Article in English | MEDLINE | ID: mdl-32024854

ABSTRACT

The catalog of cancer driver mutations in protein-coding genes has greatly expanded in the past decade. However, non-coding cancer driver mutations are less well-characterized and only a handful of recurrent non-coding mutations, most notably TERT promoter mutations, have been reported. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2658 cancers across 38 tumor types, we perform multi-faceted pathway and network analyses of non-coding mutations across 2583 whole cancer genomes from 27 tumor types compiled by the ICGC/TCGA PCAWG project. This work was motivated by the success of pathway and network analyses in prioritizing rare mutations in protein-coding genes. While few non-coding genomic elements are recurrently mutated in this cohort, we identify 93 genes harboring non-coding mutations that cluster into several modules of interacting proteins. Among these are promoter mutations associated with reduced mRNA expression in TP53, TLE4, and TCF4. We find that biological processes had variable proportions of coding and non-coding mutations, with chromatin remodeling and proliferation pathways altered primarily by coding mutations, while developmental pathways, including Wnt and Notch, are altered by both coding and non-coding mutations. RNA splicing is primarily altered by non-coding mutations in this cohort, and samples containing non-coding mutations in well-known RNA splicing factors exhibit gene expression signatures similar to those of samples with coding mutations in these genes. These analyses contribute a new repertoire of possible cancer genes and mechanisms that are altered by non-coding mutations and offer insights into additional cancer vulnerabilities that can be investigated for potential therapeutic treatments.


Subject(s)
Gene Expression Regulation, Neoplastic , Mutation , Neoplasms/genetics , RNA Splicing , Chromatin Assembly and Disassembly , Computational Biology/methods , Databases, Genetic , Genome, Human , Humans , Metabolic Networks and Pathways/genetics , Neoplasms/metabolism , Promoter Regions, Genetic
16.
Nature ; 578(7793): 112-121, 2020 02.
Article in English | MEDLINE | ID: mdl-32025012

ABSTRACT

A key mutational process in cancer is structural variation, in which rearrangements delete, amplify or reorder genomic segments that range in size from kilobases to whole chromosomes. Here we develop methods to group, classify and describe somatic structural variants, using data from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA), which aggregated whole-genome sequencing data from 2,658 cancers across 38 tumour types. Sixteen signatures of structural variation emerged. Deletions have a multimodal size distribution, assort unevenly across tumour types and patients, are enriched in late-replicating regions and correlate with inversions. Tandem duplications also have a multimodal size distribution, but are enriched in early-replicating regions, as are unbalanced translocations. Replication-based mechanisms of rearrangement generate varied chromosomal structures with low-level copy-number gains and frequent inverted rearrangements. One prominent structure consists of 2-7 templates copied from distinct regions of the genome strung together within one locus. Such cycles of templated insertions correlate with tandem duplications, and, in liver cancer, frequently activate the telomerase gene TERT. A wide variety of rearrangement processes are active in cancer, which generate complex configurations of the genome upon which selection can act.


Subject(s)
Genetic Variation , Genome, Human/genetics , Neoplasms/genetics , Gene Rearrangement/genetics , Genomics , Humans , Mutagenesis, Insertional , Telomerase/genetics
17.
Cell ; 180(5): 915-927.e16, 2020 03 05.
Article in English | MEDLINE | ID: mdl-32084333

ABSTRACT

The dichotomous model of "drivers" and "passengers" in cancer posits that only a few mutations in a tumor strongly affect its progression, with the remaining ones being inconsequential. Here, we leveraged the comprehensive variant dataset from the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) project to demonstrate that, in addition to the dichotomy of high- and low-impact variants, there is a third group of medium-impact putative passengers. Moreover, we also found that molecular impact correlates with subclonal architecture (i.e., early versus late mutations), and different signatures encode for mutations with divergent impact. Furthermore, we adapted an additive-effects model from complex-trait studies to show that the aggregated effect of putative passengers, including undetected weak drivers, provides significant additional power (∼12% additive variance) for predicting cancerous phenotypes, beyond PCAWG-identified driver mutations. Finally, this framework allowed us to estimate the frequency of potential weak-driver mutations in PCAWG samples lacking any well-characterized driver alterations.


Subject(s)
Genome, Human/genetics , Genomics/methods , Mutation/genetics , Neoplasms/genetics , DNA Mutational Analysis/methods , Disease Progression , Humans , Neoplasms/pathology , Whole Genome Sequencing
18.
Cell Syst ; 8(5): 446-455.e8, 2019 05 22.
Article in English | MEDLINE | ID: mdl-31078526

ABSTRACT

Recent studies have shown that mutations at non-coding elements, such as promoters and enhancers, can act as cancer drivers. However, an important class of non-coding elements, namely CTCF insulators, has been overlooked in the previous driver analyses. We used insulator annotations from CTCF and cohesin ChIA-PET and analyzed somatic mutations in 1,962 whole genomes from 21 cancer types. Using the heterogeneous patterns of transcription-factor-motif disruption, functional impact, and recurrence of mutations, we developed a computational method that revealed 21 insulators showing signals of positive selection. In particular, mutations in an insulator in multiple cancer types, including 16% of melanoma samples, are associated with TGFB1 up-regulation. Using CRISPR-Cas9, we find that alterations at two of the most frequently mutated regions in this insulator increase cell growth by 40%-50%, supporting the role of this boundary element as a cancer driver. Thus, our study reveals several CTCF insulators as putative cancer drivers.


Subject(s)
CCCTC-Binding Factor/genetics , CCCTC-Binding Factor/metabolism , Animals , Cell Cycle Proteins/genetics , Chromosomal Proteins, Non-Histone/genetics , DNA-Binding Proteins/genetics , Gene Expression Regulation/genetics , Gene Expression Regulation, Neoplastic/genetics , Genome, Human , Humans , Mutation , Neoplasms/genetics , Neoplasms/metabolism , Promoter Regions, Genetic/genetics , Repressor Proteins/genetics , Cohesins
20.
Am J Hum Genet ; 102(5): 920-942, 2018 05 03.
Article in English | MEDLINE | ID: mdl-29727691

ABSTRACT

We describe a method based on a latent Dirichlet allocation model for predicting functional effects of noncoding genetic variants in a cell-type- and/or tissue-specific way (FUN-LDA). Using this unsupervised approach, we predict tissue-specific functional effects for every position in the human genome in 127 different tissues and cell types. We demonstrate the usefulness of our predictions by using several validation experiments. Using eQTL data from several sources, including the GTEx project, Geuvadis project, and TwinsUK cohort, we show that eQTLs in specific tissues tend to be most enriched among the predicted functional variants in relevant tissues in Roadmap. We further show how these integrated functional scores can be used for (1) deriving the most likely cell or tissue type causally implicated for a complex trait by using summary statistics from genome-wide association studies and (2) estimating a tissue-based correlation matrix of various complex traits. We found large enrichment of heritability in functional components of relevant tissues for various complex traits, and FUN-LDA yielded higher enrichment estimates than existing methods. Finally, using experimentally validated functional variants from the literature and variants possibly implicated in disease by previous studies, we rigorously compare FUN-LDA with state-of-the-art functional annotation methods and show that FUN-LDA has better prediction accuracy and higher resolution than these methods. In particular, our results suggest that tissue- and cell-type-specific functional prediction methods tend to have substantially better prediction accuracy than organism-level prediction methods. Scores for each position in the human genome and for each ENCODE and Roadmap tissue are available online (see Web Resources).


Subject(s)
Algorithms , DNA, Intergenic/genetics , Genetic Variation , Models, Genetic , Organ Specificity/genetics , Genome-Wide Association Study , Humans , Linkage Disequilibrium/genetics , Molecular Sequence Annotation , Polymorphism, Single Nucleotide/genetics , Probability , Quantitative Trait Loci/genetics , Reproducibility of Results , Twins/genetics