Results 1 - 20 of 78
1.
Am J Hum Genet ; 2024 May 23.
Article in English | MEDLINE | ID: mdl-38821058

ABSTRACT

Both trio and population designs are popular study designs for identifying risk genetic variants in genome-wide association studies (GWASs). The trio design, as a family-based design, is robust to confounding due to population structure, whereas the population design is often more powerful due to larger sample sizes. Here, we propose KnockoffHybrid, a knockoff-based statistical method for hybrid analysis of both the trio and population designs. KnockoffHybrid provides a unified framework that brings together the advantages of both designs and produces powerful hybrid analysis while controlling the false discovery rate (FDR) in the presence of linkage disequilibrium and population structure. Furthermore, KnockoffHybrid has the flexibility to leverage different types of summary statistics for hybrid analyses, including expression quantitative trait loci (eQTL) and GWAS summary statistics. We demonstrate in simulations that KnockoffHybrid offers power gains over non-hybrid methods for the trio and population designs with the same number of cases while controlling the FDR with complex correlation among variants and population structure among subjects. In hybrid analyses of three trio cohorts for autism spectrum disorders (ASDs) from the Autism Speaks MSSNG, Autism Sequencing Consortium, and Autism Genome Project with GWAS summary statistics from the iPSYCH project and eQTL summary statistics from the MetaBrain project, KnockoffHybrid outperforms conventional methods by replicating several known risk genes for ASDs and identifying additional associations with variants in other genes, including the PRAME family genes involved in axon guidance and which may act as common targets for human speech/language evolution and related disorders.

2.
Cell Rep ; 43(5): 114240, 2024 May 28.
Article in English | MEDLINE | ID: mdl-38753486

ABSTRACT

Adipose tissue remodeling and dysfunction, characterized by elevated inflammation and insulin resistance, play a central role in obesity-related development of type 2 diabetes (T2D) and cardiovascular diseases. Long intergenic non-coding RNAs (lincRNAs) are important regulators of cellular functions. Here, we describe the functions of linc-ADAIN (adipose anti-inflammatory), an adipose lincRNA that is downregulated in white adipose tissue of obese humans. We demonstrate that linc-ADAIN knockdown (KD) increases KLF5 and interleukin-8 (IL-8) mRNA stability and translation by interacting with IGF2BP2. Upregulation of KLF5 and IL-8, via linc-ADAIN KD, leads to an enhanced adipogenic program and adipose tissue inflammation, mirroring the obese state, in vitro and in vivo. KD of linc-ADAIN in human adipose stromal cell (ASC) hTERT adipocytes implanted into mice increases adipocyte size and macrophage infiltration compared to implanted control adipocytes, mimicking hallmark features of obesity-induced adipose tissue remodeling. linc-ADAIN is an anti-inflammatory lincRNA that limits adipose tissue expansion and lipid storage.


Subject(s)
Adipogenesis , Interleukin-8 , Kruppel-Like Transcription Factors , RNA Stability , RNA, Long Noncoding , Humans , RNA, Long Noncoding/genetics , RNA, Long Noncoding/metabolism , Kruppel-Like Transcription Factors/metabolism , Kruppel-Like Transcription Factors/genetics , Adipogenesis/genetics , Animals , RNA Stability/genetics , Interleukin-8/metabolism , Interleukin-8/genetics , Mice , RNA-Binding Proteins/metabolism , RNA-Binding Proteins/genetics , Adipocytes/metabolism , Adipose Tissue/metabolism , Obesity/metabolism , Obesity/genetics , Obesity/pathology , RNA, Messenger/metabolism , RNA, Messenger/genetics , Male , Inflammation/pathology , Inflammation/genetics , Inflammation/metabolism
3.
bioRxiv ; 2024 May 02.
Article in English | MEDLINE | ID: mdl-38464202

ABSTRACT

Understanding the causal genetic architecture of complex phenotypes is essential for future research into disease mechanisms and potential therapies. Here, we present a novel framework for genome-wide detection of sets of variants that carry non-redundant information on the phenotypes and are therefore more likely to be causal in a biological sense. Crucially, our framework requires only summary statistics obtained from standard genome-wide marginal association testing. The described approach, implemented in open-source software, is also computationally efficient, requiring less than 15 minutes on a single CPU to perform genome-wide analysis. Through extensive genome-wide simulation studies, we show that the method can substantially outperform usual two-stage marginal association testing and fine-mapping procedures in precision and recall. In applications to a meta-analysis of ten large-scale genetic studies of Alzheimer's disease (AD), we identified 82 loci associated with AD, including 37 additional loci missed by the conventional GWAS pipeline. The identified putative causal variants achieve state-of-the-art agreement with massively parallel reporter assays and CRISPR-Cas9 experiments. Additionally, we applied the method to a retrospective analysis of 67 large-scale GWAS summary statistics since 2013 for a variety of phenotypes. Results reveal the method's capacity to robustly discover additional loci for polygenic traits and pinpoint potential causal variants underpinning each locus beyond the conventional GWAS pipeline, contributing to a deeper understanding of complex genetic architectures in post-GWAS analyses.

4.
Genet Med ; 25(12): 100983, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37746849

ABSTRACT

PURPOSE: Previous work identified rare variants in DSTYK associated with human congenital anomalies of the kidney and urinary tract (CAKUT). Here, we present a series of mouse and human studies to clarify the association, penetrance, and expressivity of DSTYK variants. METHODS: We phenotypically characterized Dstyk knockout mice of 3 separate inbred backgrounds and re-analyzed the original family segregating the DSTYK c.654+1G>A splice-site variant (referred to as "SSV" below). DSTYK loss of function (LOF) and SSVs were annotated in individuals with CAKUT, epilepsy, or amyotrophic lateral sclerosis vs controls. A phenome-wide association study analysis was also performed using United Kingdom Biobank (UKBB) data. RESULTS: Results demonstrate ∼20% to 25% penetrance of obstructive uropathy, at least, in C57BL/6J and FVB/NJ Dstyk-/- mice. Phenotypic penetrance increased to ∼40% in C3H/HeJ mutants, with mild-to-moderate severity. Re-analysis of the original family segregating the rare SSV showed low penetrance (43.8%) and no alternative genetic causes for CAKUT. The burden of LOF DSTYK variants showed a significant excess in CAKUT and epilepsy vs controls, and an exploratory phenome-wide association study supported an association with neurological disorders. CONCLUSION: These data support causality for DSTYK LOF variants and highlight the need for large-scale sequencing studies (here >200,000 cases) to accurately assess causality for genes and variants to lowly penetrant traits with common population prevalence.


Subject(s)
Epilepsy , Urinary Tract , Urogenital Abnormalities , Animals , Mice , Humans , Penetrance , Mice, Inbred C3H , Mice, Inbred C57BL , Urogenital Abnormalities/genetics , Kidney/abnormalities , Risk Factors , Epilepsy/genetics , Receptor-Interacting Protein Serine-Threonine Kinases/genetics
5.
bioRxiv ; 2023 Jun 07.
Article in English | MEDLINE | ID: mdl-37333162

ABSTRACT

Genome-wide association studies (GWAS) for biomarkers important for clinical phenotypes can lead to clinically relevant discoveries. GWAS for quantitative traits are based on simplified regression models modeling the conditional mean of a phenotype as a linear function of genotype. An alternative and easy to apply approach is quantile regression that naturally extends linear regression to the analysis of the entire conditional distribution of a phenotype of interest by modeling conditional quantiles within a regression framework. Quantile regression can be applied efficiently at biobank scale using standard statistical packages in much the same way as linear regression, while having some unique advantages such as identifying variants with heterogeneous effects across different quantiles, including non-additive effects and variants involved in gene-environment interactions; accommodating a wide range of phenotype distributions with invariance to trait transformation; and overall providing more detailed information about the underlying genotype-phenotype associations. Here, we demonstrate the value of quantile regression in the context of GWAS by applying it to 39 quantitative traits in the UK Biobank (n>300,000 individuals). Across these 39 traits we identify 7,297 significant loci, including 259 loci only detected by quantile regression. We show that quantile regression can help uncover replicable but unmodelled gene-environment interactions, and can provide additional key insights into poorly understood genotype-phenotype correlations for clinically relevant biomarkers at minimal additional cost.
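The idea of modeling conditional quantiles rather than the conditional mean can be illustrated with the check (pinball) loss that quantile regression minimizes. The following is a minimal sketch with an intercept-only model and toy data, not the authors' pipeline; the function names `pinball_loss` and `best_constant` and the example values are assumptions made for illustration only.

```python
def pinball_loss(y, pred, tau):
    """Mean check (pinball) loss of a constant prediction at quantile level tau."""
    return sum(max(tau * (yi - pred), (tau - 1) * (yi - pred)) for yi in y) / len(y)


def best_constant(y, tau):
    """Return the observed value that minimizes the pinball loss.

    For an intercept-only model, this minimizer is a sample tau-quantile.
    """
    return min(sorted(set(y)), key=lambda c: pinball_loss(y, c, tau))


# Toy phenotype values with one extreme observation (hypothetical data).
y = [1, 2, 3, 4, 100]
median_fit = best_constant(y, 0.5)  # robust to the extreme value
upper_fit = best_constant(y, 0.9)   # tracks the upper tail instead
print(median_fit, upper_fit)
```

Fitting the same loss with a genotype covariate at several values of `tau` is what lets a variant show different effects in different parts of the phenotype distribution.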

6.
Nat Genet ; 55(6): 1057-1065, 2023 06.
Article in English | MEDLINE | ID: mdl-37169873

ABSTRACT

Fine-mapping is commonly used to identify putative causal variants at genome-wide significant loci. Here we propose a Bayesian model for fine-mapping that has several advantages over existing methods, including flexible specification of the prior distribution of effect sizes, joint modeling of summary statistics and functional annotations and accounting for discrepancies between summary statistics and external linkage disequilibrium in meta-analyses. Using simulations, we compare performance with commonly used fine-mapping methods and show that the proposed model has higher power and lower false discovery rate (FDR) when including functional annotations, and higher power, lower FDR and higher coverage for credible sets in meta-analyses. We further illustrate our approach by applying it to a meta-analysis of Alzheimer's disease genome-wide association studies where we prioritize putatively causal variants and genes.


Subject(s)
Genome-Wide Association Study , Polymorphism, Single Nucleotide , Genome-Wide Association Study/methods , Bayes Theorem , Linkage Disequilibrium
7.
EClinicalMedicine ; 57: 101864, 2023 Mar.
Article in English | MEDLINE | ID: mdl-36820096

ABSTRACT

Background: Osteoporosis heavily affects postmenopausal women and is influenced by environmental exposures. Determining the impact of criteria air pollutants and their mixtures on bone mineral density (BMD) in postmenopausal women is an urgent priority. Methods: We conducted a prospective observational study using data from the ethnically diverse Women's Health Initiative Study (WHI) (enrollment, September 1994-December 1998; data analysis, January 2020 to August 2022). We used log-normal, ordinary kriging to estimate daily mean concentrations of PM10, NO, NO2, and SO2 at participants' geocoded addresses (1-, 3-, and 5-year averages before BMD assessments). We measured whole-body, total hip, femoral neck, and lumbar spine BMD at enrollment and follow-up (Y1, Y3, Y6) via dual-energy X-ray absorptiometry. We estimated associations using multivariable linear and linear mixed-effects models and mixture effects using Bayesian kernel machine regression (BKMR) models. Findings: In cross-sectional and longitudinal analyses, mean PM10, NO, NO2, and SO2 averaged over 1, 3, and 5 years before the visit were negatively associated with whole-body, total hip, femoral neck, and lumbar spine BMD. For example, lumbar spine BMD decreased 0.026 (95% CI: 0.016, 0.036) g/cm2/year per a 10% increase in 3-year mean NO2 concentration. BKMR suggested that nitrogen oxides exposure was inversely associated with whole-body and lumbar spine BMD. Interpretation: In this cohort study, higher levels of air pollutants were associated with bone damage, particularly on lumbar spine, among postmenopausal women. These findings highlight nitrogen oxides exposure as a leading contributor to bone loss in postmenopausal women, expanding previous findings of air pollution-related bone damage. Funding: US National Institutes of Health.

8.
Genome Biol ; 24(1): 24, 2023 02 13.
Article in English | MEDLINE | ID: mdl-36782330

ABSTRACT

We propose BIGKnock (BIobank-scale Gene-based association test via Knockoffs), a computationally efficient gene-based testing approach for biobank-scale data, that leverages long-range chromatin interaction data, and performs conditional genome-wide testing via knockoffs. BIGKnock can prioritize causal genes over proxy associations at a locus. We apply BIGKnock to the UK Biobank data with 405,296 participants for multiple binary and quantitative traits, and show that relative to conventional gene-based tests, BIGKnock produces smaller sets of significant genes that contain the causal gene(s) with high probability. We further illustrate its ability to pinpoint potential causal genes at [Formula: see text] of the associated loci.


Subject(s)
Biological Specimen Banks , Genetic Testing , Humans , Chromosome Mapping , Phenotype , Chromatin , Genome-Wide Association Study , Polymorphism, Single Nucleotide
10.
J Am Soc Nephrol ; 34(4): 607-618, 2023 04 01.
Article in English | MEDLINE | ID: mdl-36302597

ABSTRACT

SIGNIFICANCE STATEMENT: Pathogenic structural genetic variants, also known as genomic disorders, have been associated with pediatric CKD. This study extends those results across the lifespan, with genomic disorders enriched in both pediatric and adult patients compared with controls. In the Chronic Renal Insufficiency Cohort study, genomic disorders were also associated with lower serum Mg, lower educational performance, and a higher risk of death. A phenome-wide association study confirmed the link between kidney disease and genomic disorders in an unbiased way. Systematic detection of genomic disorders can provide a molecular diagnosis and refine prediction of risk and prognosis. BACKGROUND: Genomic disorders (GDs) are associated with many comorbid outcomes, including CKD. Identification of GDs has diagnostic utility. METHODS: We examined the prevalence of GDs among participants in the Chronic Kidney Disease in Children (CKiD) cohort II (n=248), Chronic Renal Insufficiency Cohort (CRIC) study (n=3375), Columbia University CKD Biobank (CU-CKD; n=1986), and the Family Investigation of Nephropathy and Diabetes (FIND; n=1318) compared with 30,746 controls. We also performed a phenome-wide association analysis (PheWAS) of GDs in the electronic MEdical Records and GEnomics (eMERGE; n=11,146) cohort. RESULTS: We found nine out of 248 (3.6%) CKiD II participants carried a GD, replicating prior findings in pediatric CKD. We also identified GDs in 72 out of 6679 (1.1%) adult patients with CKD in the CRIC, CU-CKD, and FIND cohorts, compared with 199 out of 30,746 (0.65%) GDs in controls (OR, 1.7; 95% CI, 1.3 to 2.2). Among adults with CKD, we found recurrent GDs at the 1q21.1, 16p11.2, 17q12, and 22q11.2 loci. The 17q12 GD (diagnostic of renal cyst and diabetes syndrome) was most frequent, present in 1:252 patients with CKD and diabetes. In the PheWAS, dialysis and neuropsychiatric phenotypes were the top associations with GDs. In CRIC participants, GDs were associated with lower serum magnesium, lower educational achievement, and higher mortality risk. CONCLUSION: Undiagnosed GDs are detected both in children and adults with CKD. Identification of GDs in these patients can enable a precise genetic diagnosis, inform prognosis, and help stratify risk in clinical studies. GDs could also provide a molecular explanation for nephropathy and comorbidities, such as poorer neurocognition for a subset of patients.
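The adult odds ratio quoted in this abstract can be approximately reproduced from the carrier counts it reports (72 of 6679 cases, 199 of 30,746 controls). The sketch below assumes a plain unadjusted 2x2 odds ratio with a Wald confidence interval on the log scale; the abstract does not say how the published estimate was computed, so this is a consistency check only, and the function name `odds_ratio_wald` is illustrative.

```python
import math


def odds_ratio_wald(a, n_cases, c, n_controls, z=1.96):
    """Unadjusted odds ratio and Wald 95% CI from carrier counts.

    a / n_cases: carriers and total among cases.
    c / n_controls: carriers and total among controls.
    """
    b = n_cases - a        # non-carrier cases
    d = n_controls - c     # non-carrier controls
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
    lower = math.exp(math.log(or_) - z * se)
    upper = math.exp(math.log(or_) + z * se)
    return or_, lower, upper


or_, lower, upper = odds_ratio_wald(72, 6679, 199, 30746)
print(f"OR = {or_:.1f} (95% CI {lower:.1f} to {upper:.1f})")
# prints "OR = 1.7 (95% CI 1.3 to 2.2)"
```

Under these assumptions the unadjusted estimate rounds to the same values as the reported OR and interval.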


Subject(s)
Longevity , Renal Insufficiency, Chronic , Humans , Cohort Studies , Prospective Studies , Renal Insufficiency, Chronic/epidemiology , Renal Insufficiency, Chronic/genetics , Renal Insufficiency, Chronic/complications , Genomics , Disease Progression , Risk Factors
11.
Nat Commun ; 13(1): 7209, 2022 11 23.
Article in English | MEDLINE | ID: mdl-36418338

ABSTRACT

Recent advances in genome sequencing and imputation technologies provide an exciting opportunity to comprehensively study the contribution of genetic variants to complex phenotypes. However, our ability to translate genetic discoveries into mechanistic insights remains limited at this point. In this paper, we propose an efficient knockoff-based method, GhostKnockoff, for genome-wide association studies (GWAS) that leads to improved power and ability to prioritize putative causal variants relative to conventional GWAS approaches. The method requires only Z-scores from conventional GWAS and hence can be easily applied to enhance existing and future studies. The method can also be applied to meta-analysis of multiple GWAS allowing for arbitrary sample overlap. We demonstrate its performance using empirical simulations and two applications: (1) a meta-analysis for Alzheimer's disease comprising nine overlapping large-scale GWAS, whole-exome and whole-genome sequencing studies and (2) analysis of 1403 binary phenotypes from the UK Biobank data in 408,961 samples of European ancestry. Our results demonstrate that GhostKnockoff can identify putatively functional variants with weaker statistical effects that are missed by conventional association tests.


Subject(s)
Genome-Wide Association Study , Polymorphism, Single Nucleotide , Genome-Wide Association Study/methods , Phenotype , Causality , Chromosome Mapping
12.
Nat Commun ; 13(1): 6859, 2022 11 11.
Article in English | MEDLINE | ID: mdl-36369178

ABSTRACT

Immunoglobulin A (IgA) mediates mucosal responses to food antigens and the intestinal microbiome and is involved in susceptibility to mucosal pathogens, celiac disease, inflammatory bowel disease, and IgA nephropathy. We performed a genome-wide association study of serum IgA levels in 41,263 individuals of diverse ancestries and identified 20 genome-wide significant loci, including 9 known and 11 novel loci. Co-localization analyses with expression QTLs prioritized candidate genes for 14 of 20 significant loci. Most loci encoded genes that produced immune defects and IgA abnormalities when genetically manipulated in mice. We also observed positive genetic correlations of serum IgA levels with IgA nephropathy, type 2 diabetes, and body mass index, and negative correlations with celiac disease, inflammatory bowel disease, and several infections. Mendelian randomization supported elevated serum IgA as a causal factor in IgA nephropathy. African ancestry was consistently associated with higher serum IgA levels and greater frequency of IgA-increasing alleles compared to other ancestries. Our findings provide novel insights into the genetic regulation of IgA levels and its potential role in human disease.


Subject(s)
Celiac Disease , Diabetes Mellitus, Type 2 , Glomerulonephritis, IGA , Inflammatory Bowel Diseases , Humans , Mice , Animals , Glomerulonephritis, IGA/genetics , Glomerulonephritis, IGA/complications , Genome-Wide Association Study , Celiac Disease/genetics , Genetic Predisposition to Disease , Diabetes Mellitus, Type 2/complications , Immunoglobulin A/genetics , Kidney/metabolism
13.
Am J Hum Genet ; 109(10): 1761-1776, 2022 10 06.
Article in English | MEDLINE | ID: mdl-36150388

ABSTRACT

Family-based designs can eliminate confounding due to population substructure and can distinguish direct from indirect genetic effects, but these designs are underpowered due to limited sample sizes. Here, we propose KnockoffTrio, a statistical method to identify putative causal genetic variants for father-mother-child trio design built upon a recently developed knockoff framework in statistics. KnockoffTrio controls the false discovery rate (FDR) in the presence of arbitrary correlations among tests and is less conservative and thus more powerful than the conventional methods that control the family-wise error rate via Bonferroni correction. Furthermore, KnockoffTrio is not restricted to family-based association tests and can be used in conjunction with more powerful, potentially nonlinear models to improve the power of standard family-based tests. We show, using empirical simulations, that KnockoffTrio can prioritize causal variants over associations due to linkage disequilibrium and can provide protection against confounding due to population stratification. In applications to 14,200 trios from three study cohorts for autism spectrum disorders (ASDs), including AGP, SPARK, and SSC, we show that KnockoffTrio can identify multiple significant associations that are missed by conventional tests applied to the same data. In particular, we replicate known ASD association signals with variants in several genes such as MACROD2, NRXN1, PRKAR1B, CADM2, PCDH9, and DOCK4 and identify additional associations with variants in other genes including ARHGEF10, SLC28A1, ZNF589, and HINT1 at FDR 10%.
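For readers unfamiliar with how knockoff methods control the FDR, the selection step can be sketched with the generic knockoff+ threshold of Barber and Candès. This is a minimal sketch of the single-knockoff filter, not KnockoffTrio's own procedure, whose feature statistics and knockoff construction are not described in the abstract. The statistics `w` (positive when a variant looks stronger than its knockoff copy) are made-up values.

```python
def knockoff_plus_threshold(w, q):
    """Smallest t with (1 + #{w_j <= -t}) / max(1, #{w_j >= t}) <= q."""
    candidates = sorted({abs(x) for x in w if x != 0})
    for t in candidates:
        neg = sum(1 for x in w if x <= -t)  # estimates the number of false positives
        pos = sum(1 for x in w if x >= t)   # number of selections at this threshold
        if (1 + neg) / max(1, pos) <= q:
            return t
    return float("inf")  # no threshold meets the target; select nothing


def knockoff_select(w, q):
    """Indices of features whose statistic reaches the knockoff+ threshold."""
    t = knockoff_plus_threshold(w, q)
    return [j for j, x in enumerate(w) if x >= t]


# Hypothetical feature statistics: six strong signals and four near-null values.
w = [5.0, 4.2, 3.9, 3.1, 2.8, 2.5, -0.4, 0.3, -0.2, 0.1]
selected = knockoff_select(w, q=0.2)
print(selected)
```

Because the threshold adapts to the estimated proportion of false discoveries rather than dividing a fixed error budget across all tests, the filter is typically less conservative than a Bonferroni correction.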


Subject(s)
Autism Spectrum Disorder , Genome-Wide Association Study , Autism Spectrum Disorder/genetics , Causality , Genome-Wide Association Study/methods , Humans , Linkage Disequilibrium , Nerve Tissue Proteins/genetics
14.
Nat Med ; 28(7): 1412-1420, 2022 07.
Article in English | MEDLINE | ID: mdl-35710995

ABSTRACT

Chronic kidney disease (CKD) is a common complex condition associated with high morbidity and mortality. Polygenic prediction could enhance CKD screening and prevention; however, this approach has not been optimized for ancestrally diverse populations. By combining APOL1 risk genotypes with genome-wide association studies (GWAS) of kidney function, we designed, optimized and validated a genome-wide polygenic score (GPS) for CKD. The new GPS was tested in 15 independent cohorts, including 3 cohorts of European ancestry (n = 97,050), 6 cohorts of African ancestry (n = 14,544), 4 cohorts of Asian ancestry (n = 8,625) and 2 admixed Latinx cohorts (n = 3,625). We demonstrated score transferability with reproducible performance across all tested cohorts. The top 2% of the GPS was associated with nearly threefold increased risk of CKD across ancestries. In African ancestry cohorts, the APOL1 risk genotype and polygenic component of the GPS had additive effects on the risk of CKD.


Subject(s)
Apolipoprotein L1 , Renal Insufficiency, Chronic , Apolipoprotein L1/genetics , Genetic Predisposition to Disease/genetics , Genome-Wide Association Study , Genotype , Humans , Multifactorial Inheritance/genetics , Polymorphism, Single Nucleotide/genetics , Renal Insufficiency, Chronic/diagnosis , Renal Insufficiency, Chronic/genetics
15.
Genetics ; 221(2), 2022 05 31.
Article in English | MEDLINE | ID: mdl-35385101

ABSTRACT

Genomic regions subject to purifying selection are more likely to carry disease-causing mutations than regions not under selection. Cross species conservation is often used to identify such regions but with limited resolution to detect selection on short evolutionary timescales such as that occurring in only one species. In contrast, genetic intolerance looks for depletion of variation relative to expectation within a species, allowing species-specific features to be identified. When estimating the intolerance of noncoding sequence, methods strongly leverage variant frequency distributions. As the expected distributions depend on ancestry, if not properly controlled for, ancestral population source may obfuscate signals of selection. We demonstrate that properly incorporating ancestry in intolerance estimation greatly improved variant classification. We provide a genome-wide intolerance map that is conditional on ancestry and likely to be particularly valuable for variant prioritization.


Subject(s)
Genome, Human , Genomics , Biological Evolution , Genetics, Population , Humans , Selection, Genetic
16.
Nat Commun ; 13(1): 800, 2022 02 10.
Article in English | MEDLINE | ID: mdl-35145093

ABSTRACT

Alopecia areata is a complex genetic disease that results in hair loss due to the autoimmune-mediated attack of the hair follicle. We previously defined a role for both rare and common variants in our earlier GWAS and linkage studies. Here, we identify rare variants contributing to Alopecia Areata using a whole exome sequencing and gene-level burden analyses approach on 849 Alopecia Areata patients compared to 15,640 controls. KRT82 is identified as an Alopecia Areata risk gene with rare damaging variants in 51 heterozygous Alopecia Areata individuals (6.01%), achieving genome-wide significance (p = 2.18E-07). KRT82 encodes a hair-specific type II keratin that is exclusively expressed in the hair shaft cuticle during anagen phase, and its expression is decreased in Alopecia Areata patient skin and hair follicles. Finally, we find that cases with an identified damaging KRT82 variant and reduced KRT82 expression have elevated perifollicular CD8 infiltrates. In this work, we utilize whole exome sequencing to successfully identify a significant Alopecia Areata disease-relevant gene, KRT82, and reveal a proposed mechanism for rare variant predisposition leading to disrupted hair shaft integrity.


Subject(s)
Alopecia Areata/genetics , Alopecia Areata/metabolism , Exome Sequencing , Keratins, Hair-Specific/genetics , Keratins, Type II/genetics , Genetic Predisposition to Disease , Genetic Variation , Hair/metabolism , Hair Follicle/metabolism , Humans , Skin/metabolism
17.
Am J Hum Genet ; 109(3): 446-456, 2022 03 03.
Article in English | MEDLINE | ID: mdl-35216679

ABSTRACT

Attempts to identify and prioritize functional DNA elements in coding and non-coding regions, particularly through use of in silico functional annotation data, continue to increase in popularity. However, specific functional roles can vary widely from one variant to another, making it challenging to summarize different aspects of variant function with a one-dimensional rating. Here we propose multi-dimensional annotation-class integrative estimation (MACIE), an unsupervised multivariate mixed-model framework capable of integrating annotations of diverse origin to assess multi-dimensional functional roles for both coding and non-coding variants. Unlike existing one-dimensional scoring methods, MACIE views variant functionality as a composite attribute encompassing multiple characteristics and estimates the joint posterior functional probabilities of each genomic position. This estimate offers more comprehensive and interpretable information in the presence of multiple aspects of functionality. Applied to a variety of independent coding and non-coding datasets, MACIE demonstrates powerful and robust performance in discriminating between functional and non-functional variants. We also show an application of MACIE to fine-mapping and heritability enrichment analysis by using the lipids GWAS summary statistics data from the European Network for Genetic and Genomic Epidemiology Consortium.


Subject(s)
Genome, Human , Genome-Wide Association Study , Genome, Human/genetics , Genome-Wide Association Study/methods , Genomics , Humans , Molecular Sequence Annotation , Polymorphism, Single Nucleotide/genetics , Probability
18.
Proc Natl Acad Sci U S A ; 118(47), 2021 11 23.
Article in English | MEDLINE | ID: mdl-34799441

ABSTRACT

Gene-based tests are valuable techniques for identifying genetic factors in complex traits. Here, we propose a gene-based testing framework that incorporates data on long-range chromatin interactions, several recent technical advances for region-based tests, and leverages the knockoff framework for synthetic genotype generation for improved gene discovery. Through simulations and applications to genome-wide association studies (GWAS) and whole-genome sequencing data for multiple diseases and traits, we show that the proposed test increases the power over state-of-the-art gene-based tests in the literature, identifies genes that replicate in larger studies, and can provide a more narrow focus on the possible causal genes at a locus by reducing the confounding effect of linkage disequilibrium. Furthermore, our results show that incorporating genetic variation in distal regulatory elements tends to improve power over conventional tests. Results for UK Biobank and BioBank Japan traits are also available in a publicly accessible database that allows researchers to query gene-based results in an easy fashion.


Subject(s)
Chromatin , Genetic Testing/methods , Genotype , Genome-Wide Association Study/methods , Humans , Japan , Linkage Disequilibrium , Lung , Models, Genetic , Phenotype , Quantitative Trait Loci , Whole Genome Sequencing/methods
19.
Am J Hum Genet ; 108(12): 2336-2353, 2021 12 02.
Article in English | MEDLINE | ID: mdl-34767756

ABSTRACT

Knockoff-based methods have become increasingly popular due to their enhanced power for locus discovery and their ability to prioritize putative causal variants in a genome-wide analysis. However, because of the substantial computational cost for generating knockoffs, existing knockoff approaches cannot analyze millions of rare genetic variants in biobank-scale whole-genome sequencing and whole-genome imputed datasets. We propose a scalable knockoff-based method for the analysis of common and rare variants across the genome, KnockoffScreen-AL, that is applicable to biobank-scale studies with hundreds of thousands of samples and millions of genetic variants. The application of KnockoffScreen-AL to the analysis of Alzheimer disease (AD) in 388,051 WG-imputed samples from the UK Biobank resulted in 31 significant loci, including 14 loci that are missed by conventional association tests on these data. We perform replication studies in an independent meta-analysis of clinically diagnosed AD with 94,437 samples, and additionally leverage single-cell RNA-sequencing data with 143,793 single-nucleus transcriptomes from 17 control subjects and AD-affected individuals, and proteomics data from 735 control subjects and affected individuals with AD and related disorders to validate the genes at these significant loci. These multi-omics analyses show that 79.1% of the proximal genes at these loci and 76.2% of the genes at loci identified only by KnockoffScreen-AL exhibit at least suggestive signal (p < 0.05) in the scRNA-seq or proteomics analyses. We highlight a potentially causal gene in AD progression, EGFR, that shows significant differences in expression and protein levels between AD-affected individuals and healthy control subjects.


Subject(s)
Alzheimer Disease/genetics , Biological Specimen Banks , Gene Knockout Techniques , Genes, erbB-1 , Genetic Variation , Genome-Wide Association Study , Humans , RNA-Seq , Transcriptome , Whole Genome Sequencing
20.
PLoS Genet ; 17(8): e1009713, 2021 08.
Article in English | MEDLINE | ID: mdl-34460823

ABSTRACT

Genome-wide association studies (GWASs) have uncovered a wealth of associations between common variants and human phenotypes. Here, we present an integrative analysis of GWAS summary statistics from 36 phenotypes to decipher multitrait genetic architecture and its link with biological mechanisms. Our framework incorporates multitrait association mapping along with an investigation of the breakdown of genetic associations into clusters of variants harboring similar multitrait association profiles. Focusing on two subsets of immunity and metabolism phenotypes, we then demonstrate how genetic variants within clusters can be mapped to biological pathways and disease mechanisms. Finally, for the metabolism set, we investigate the link between gene cluster assignment and the success of drug targets in randomized controlled trials.


Subject(s)
Computational Biology/methods , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Cluster Analysis , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Phenotype