Results 1 - 20 of 1,666
1.
Transl Psychiatry ; 14(1): 235, 2024 Jun 03.
Article in English | MEDLINE | ID: mdl-38830892

ABSTRACT

There is a lack of knowledge regarding the relationship between proneness to dimensional psychopathological syndromes and the underlying pathogenesis across major psychiatric disorders, i.e., Major Depressive Disorder (MDD), Bipolar Disorder (BD), Schizoaffective Disorder (SZA), and Schizophrenia (SZ). Lifetime psychopathology was assessed using the OPerational CRITeria (OPCRIT) system in 1,038 patients meeting DSM-IV-TR criteria for MDD, BD, SZ, or SZA. The cohort was split into two samples for exploratory and confirmatory factor analyses. All patients were scanned with 3-T MRI, and data was analyzed with the CAT-12 toolbox in SPM12. Psychopathological factor scores were correlated with gray matter volume (GMV) and cortical thickness (CT). Finally, factor scores were used for exploratory genetic analyses including genome-wide association studies (GWAS) and polygenic risk score (PRS) association analyses. Three factors (paranoid-hallucinatory syndrome, PHS; mania, MA; depression, DEP) were identified and cross-validated. PHS was negatively correlated with four GMV clusters comprising parts of the hippocampus, amygdala, angular, middle occipital, and middle frontal gyri. PHS was also negatively associated with the bilateral superior temporal, left parietal operculum, and right angular gyrus CT. No significant brain correlates were observed for the two other psychopathological factors. We identified genome-wide significant associations for MA and DEP. PRS for MDD and SZ showed a positive effect on PHS, while PRS for BD showed a positive effect on all three factors. This study investigated the relationship of lifetime psychopathological factors and brain morphometric and genetic markers. Results highlight the need for dimensional approaches, overcoming the limitations of the current psychiatric nosology.


Subject(s)
Bipolar Disorder , Depressive Disorder, Major , Genome-Wide Association Study , Gray Matter , Magnetic Resonance Imaging , Psychotic Disorders , Schizophrenia , Humans , Male , Female , Adult , Bipolar Disorder/genetics , Bipolar Disorder/pathology , Bipolar Disorder/diagnostic imaging , Depressive Disorder, Major/genetics , Depressive Disorder, Major/diagnostic imaging , Depressive Disorder, Major/pathology , Schizophrenia/genetics , Schizophrenia/pathology , Schizophrenia/diagnostic imaging , Psychotic Disorders/genetics , Psychotic Disorders/diagnostic imaging , Psychotic Disorders/pathology , Gray Matter/pathology , Gray Matter/diagnostic imaging , Middle Aged , Factor Analysis, Statistical , Brain/pathology , Brain/diagnostic imaging , Psychopathology , Multifactorial Inheritance/genetics , Cerebral Cortex/pathology , Cerebral Cortex/diagnostic imaging
2.
Am J Hum Genet ; 111(5): 833-840, 2024 May 02.
Article in English | MEDLINE | ID: mdl-38701744

ABSTRACT

Some commercial firms currently sell polygenic indexes (PGIs) to individual consumers, despite their relatively low predictive power. It might be tempting to assume that because the predictive power of many PGIs is so modest, other sorts of firms, such as those selling insurance and financial services, will not be interested in using PGIs for their own purposes. We argue to the contrary. We build this argument in two ways. First, we offer a very simple model, rooted in economic theory, of a profit-maximizing firm that can gain information about a single consumer's genome. We use the model to show that, depending on the specific economic environment, a firm would be willing to pay for statistically noisy PGIs, even if they allow for only a small reduction in uncertainty. Second, we describe two plausible scenarios in which these different kinds of firms could conceivably use PGIs to maximize profits. Finally, we briefly discuss some of the associated ethics and policy issues. They deserve more attention, which is unlikely to be given until it is first recognized that firms whose services affect a large swath of the public will indeed have incentives to use PGIs.


Subject(s)
Multifactorial Inheritance , Humans , Multifactorial Inheritance/genetics , Genetic Testing/ethics , Genetic Testing/economics
3.
Transl Vis Sci Technol ; 13(5): 13, 2024 May 01.
Article in English | MEDLINE | ID: mdl-38767906

ABSTRACT

Purpose: The purpose of this study was to conduct a large-scale genome-wide association study (GWAS) and construct a polygenic risk score (PRS) for risk stratification in patients with dry eye disease (DED) using the Taiwan Biobank (TWB) databases. Methods: This retrospective case-control study involved 40,112 subjects of Han Chinese ancestry, sourced from the publicly available TWB. Cases were patients with DED (n = 14,185), and controls were individuals without DED (n = 25,927). The patients with DED were further divided into 8072 young (<60 years old) and 6113 old participants (≥60 years old). Using PLINK (version 1.9) software, quality control was carried out, followed by logistic regression analysis with adjustments for sex, age, body mass index, depression, and manic episodes as covariates. We also built PRS prediction models using the standard clumping and thresholding method and evaluated their performance (area under the curve [AUC]) through five-fold cross-validation. Results: Eleven independent risk loci were identified for these patients with DED at the genome-wide significance levels, including DNAJB6, MAML3, LINC02267, DCHS1, SIRPB3P, HULC, MUC16, GAS2L3, and ZFPM2. Among these, MUC16 encodes mucin family protein. The PRS model incorporated 932 and 740 genetic loci for young and old populations, respectively. A higher PRS score indicated a greater DED risk, with the top 5% of PRS individuals having a 10-fold higher risk. After integrating these covariates into the PRS model, the area under the receiver operating curve (AUROC) increased from 0.509 and 0.537 to 0.600 and 0.648 for young and old populations, respectively, demonstrating the genetic-environmental interaction. Conclusions: Our study prompts potential candidates for the mechanism of DED and paves the way for more personalized medication in the future. Translational Relevance: Our study identified genes related to DED and constructed a PRS model to improve DED prediction.
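The abstract above reports model performance as an area under the curve (AUC). As a rough illustration only, and not the authors' PLINK-based pipeline, the AUC of a risk score can be computed as the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half. The function name and the toy scores below are hypothetical.

```python
def auc(case_scores, control_scores):
    """Rank-based AUC: fraction of (case, control) pairs in which the case
    has the higher score; tied pairs contribute 0.5."""
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for s in case_scores:
        for t in control_scores:
            if s > t:
                wins += 1.0
            elif s == t:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical PRS values, for illustration only.
separated = auc([0.9, 0.8], [0.1, 0.2])   # every case outranks every control
uninformative = auc([0.3, 0.7], [0.5])    # one pair won, one pair lost
```

An AUC near 0.5, such as the PRS-only values quoted in the abstract, means the score ranks cases above controls only slightly more often than chance.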


Subject(s)
Dry Eye Syndromes , Genetic Predisposition to Disease , Genome-Wide Association Study , Multifactorial Inheritance , Humans , Female , Male , Middle Aged , Retrospective Studies , Dry Eye Syndromes/genetics , Dry Eye Syndromes/epidemiology , Case-Control Studies , Genetic Predisposition to Disease/genetics , Adult , Multifactorial Inheritance/genetics , Aged , Risk Factors , Risk Assessment/methods , Polymorphism, Single Nucleotide , Taiwan/epidemiology , Genetic Risk Score
4.
Hum Genomics ; 18(1): 49, 2024 May 22.
Article in English | MEDLINE | ID: mdl-38778357

ABSTRACT

BACKGROUND: Given the high prevalence of benign prostatic hyperplasia (BPH) among elderly men, pinpointing those at elevated risk can aid in early intervention and effective management. This study aimed to explore whether a polygenic risk score (PRS) is effective in predicting BPH incidence, prognosis and risk of operation in Han Chinese. METHODS: A retrospective cohort study included 12,474 male participants (6,237 with BPH and 6,237 non-BPH controls) from the Taiwan Precision Medicine Initiative (TPMI). Genotyping was performed using the Affymetrix Genome-Wide TWB 2.0 SNP Array. PRS was calculated using PGS001865, comprising 1,712 single nucleotide polymorphisms. Logistic regression models assessed the association between PRS and BPH incidence, adjusting for age and prostate-specific antigen (PSA) levels. The study also examined the relationship between PSA, prostate volume, and response to 5-α-reductase inhibitor (5ARI) treatment, as well as the association between PRS and the risk of transurethral resection of the prostate (TURP). RESULTS: Individuals in the highest PRS quartile (Q4) had a significantly higher risk of BPH compared to the lowest quartile (Q1) (OR = 1.51, 95% CI = 1.274-1.783, p < 0.0001), after adjusting for PSA level. The Q4 group exhibited larger prostate volumes and a smaller volume reduction after 5ARI treatment. The Q1 group had a lower cumulative TURP probability at 3, 5, and 10 years compared to the Q4 group. PRS Q4 was an independent risk factor for TURP. CONCLUSIONS: In this Han Chinese cohort, higher PRS was associated with an increased susceptibility to BPH, larger prostate volumes, poorer response to 5ARI treatment, and a higher risk of TURP. Larger prospective studies with longer follow-up are warranted to further validate these findings.
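To make the quartile comparison concrete, here is a minimal sketch of an unadjusted odds ratio with a Wald (Woolf) 95% confidence interval from a 2x2 table. This is an assumption-laden simplification: the study reports an odds ratio from logistic regression adjusted for PSA, which this sketch does not reproduce, and the counts below are invented for illustration.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval.

    a: cases in the high-PRS group      b: controls in the high-PRS group
    c: cases in the low-PRS group       d: controls in the low-PRS group
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) by Woolf's method.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from the study.
or_q4_vs_q1, ci_low, ci_high = odds_ratio_ci(60, 40, 40, 60)
```

An interval that excludes 1 corresponds to a statistically significant association at roughly the 5% level.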


Subject(s)
Genetic Predisposition to Disease , Multifactorial Inheritance , Polymorphism, Single Nucleotide , Prostatic Hyperplasia , Humans , Male , Prostatic Hyperplasia/genetics , Prostatic Hyperplasia/pathology , Aged , Middle Aged , Polymorphism, Single Nucleotide/genetics , Retrospective Studies , Multifactorial Inheritance/genetics , Asian People/genetics , Risk Factors , 5-alpha Reductase Inhibitors/therapeutic use , Prostate-Specific Antigen/blood , Prostate-Specific Antigen/genetics , Taiwan/epidemiology , Prognosis , Prostate/pathology , Genetic Risk Score , East Asian People
5.
Elife ; 12, 2024 May 24.
Article in English | MEDLINE | ID: mdl-38787369

ABSTRACT

Rich data from large biobanks, coupled with increasingly accessible association statistics from genome-wide association studies (GWAS), provide great opportunities to dissect the complex relationships among human traits and diseases. We introduce BADGERS, a powerful method to perform polygenic score-based biobank-wide association scans. Compared to traditional approaches, BADGERS uses GWAS summary statistics as input and does not require multiple traits to be measured in the same cohort. We applied BADGERS to two independent datasets for late-onset Alzheimer's disease (AD; n=61,212). Among 1738 traits in the UK biobank, we identified 48 significant associations for AD. Family history, high cholesterol, and numerous traits related to intelligence and education showed strong and independent associations with AD. Furthermore, we identified 41 significant associations for a variety of AD endophenotypes. While family history and high cholesterol were strongly associated with AD subgroups and pathologies, only intelligence and education-related traits predicted pre-clinical cognitive phenotypes. These results provide novel insights into the distinct biological processes underlying various risk factors for AD.


Subject(s)
Alzheimer Disease , Biological Specimen Banks , Endophenotypes , Genome-Wide Association Study , Alzheimer Disease/genetics , Humans , Risk Factors , Male , Female , United Kingdom/epidemiology , Aged , Genetic Predisposition to Disease , Multifactorial Inheritance/genetics , Aged, 80 and over
6.
Nat Commun ; 15(1): 4260, 2024 May 20.
Article in English | MEDLINE | ID: mdl-38769300

ABSTRACT

Transcriptome-wide association study (TWAS) is a popular approach to dissect the functional consequence of disease associated non-coding variants. Most existing TWAS use bulk tissues and may not have the resolution to reveal cell-type specific target genes. Single-cell expression quantitative trait loci (sc-eQTL) datasets are emerging. The largest bulk- and sc-eQTL datasets are most conveniently available as summary statistics, but have not been broadly utilized in TWAS. Here, we present a new method EXPRESSO (EXpression PREdiction with Summary Statistics Only), to analyze sc-eQTL summary statistics, which also integrates 3D genomic data and epigenomic annotation to prioritize causal variants. EXPRESSO substantially improves existing methods. We apply EXPRESSO to analyze multi-ancestry GWAS datasets for 14 autoimmune diseases. EXPRESSO uniquely identifies 958 novel gene x trait associations, which is 26% more than the second-best method. Among them, 492 are unique to cell type level analysis and missed by TWAS using whole blood. We also develop a cell type aware drug repurposing pipeline, which leverages EXPRESSO results to identify drug compounds that can reverse disease gene expressions in relevant cell types. Our results point to multiple drugs with therapeutic potentials, including metformin for type 1 diabetes, and vitamin K for ulcerative colitis.


Subject(s)
Genome-Wide Association Study , Quantitative Trait Loci , Single-Cell Analysis , Humans , Single-Cell Analysis/methods , Genome-Wide Association Study/methods , Genetic Predisposition to Disease/genetics , Transcriptome/genetics , Autoimmune Diseases/genetics , Polymorphism, Single Nucleotide , Multifactorial Inheritance/genetics , Gene Expression Profiling/methods
7.
Nat Genet ; 56(5): 838-845, 2024 May.
Article in English | MEDLINE | ID: mdl-38741015

ABSTRACT

Autoimmune and inflammatory diseases are polygenic disorders of the immune system. Many genomic loci harbor risk alleles for several diseases, but the limited resolution of genetic mapping prevents determining whether the same allele is responsible, indicating a shared underlying mechanism. Here, using a collection of 129,058 cases and controls across 6 diseases, we show that ~40% of overlapping associations are due to the same allele. We improve fine-mapping resolution for shared alleles twofold by combining cases and controls across diseases, allowing us to identify more expression quantitative trait loci driven by the shared alleles. The patterns indicate widespread sharing of pathogenic mechanisms but not a single global autoimmune mechanism. Our approach can be applied to any set of traits and is particularly valuable as sample collections become depleted.


Subject(s)
Alleles , Autoimmune Diseases , Chromosome Mapping , Genetic Predisposition to Disease , Quantitative Trait Loci , Humans , Autoimmune Diseases/genetics , Polymorphism, Single Nucleotide , Genome-Wide Association Study , Case-Control Studies , Multifactorial Inheritance/genetics
8.
Nat Commun ; 15(1): 4230, 2024 May 18.
Article in English | MEDLINE | ID: mdl-38762475

ABSTRACT

Type 2 diabetes (T2D) presents a formidable global health challenge, highlighted by its escalating prevalence, underscoring the critical need for precision health strategies and early detection initiatives. Leveraging artificial intelligence, particularly eXtreme Gradient Boosting (XGBoost), we devise robust risk assessment models for T2D. Drawing upon comprehensive genetic and medical imaging datasets from 68,911 individuals in the Taiwan Biobank, our models integrate Polygenic Risk Scores (PRS), Multi-image Risk Scores (MRS), and demographic variables, such as age, sex, and T2D family history. Here, we show that our model achieves an Area Under the Receiver Operating Curve (AUC) of 0.94, effectively identifying high-risk T2D subgroups. A streamlined model featuring eight key variables also maintains a high AUC of 0.939. This high accuracy for T2D risk assessment promises to catalyze early detection and preventive strategies. Moreover, we introduce an accessible online risk assessment tool for T2D, facilitating broader applicability and dissemination of our findings.


Subject(s)
Artificial Intelligence , Diabetes Mellitus, Type 2 , Diabetes Mellitus, Type 2/genetics , Humans , Risk Assessment/methods , Female , Male , Middle Aged , Taiwan/epidemiology , Genetic Predisposition to Disease , Adult , Diagnostic Imaging/methods , Aged , Risk Factors , ROC Curve , Multifactorial Inheritance/genetics
9.
Sci Rep ; 14(1): 11632, 2024 05 21.
Article in English | MEDLINE | ID: mdl-38773257

ABSTRACT

In recent years, the utility of polygenic risk scores (PRS) in forecasting disease susceptibility from genome-wide association studies (GWAS) results has been widely recognised. Yet, these models face limitations due to overfitting and the potential overestimation of effect sizes in correlated variants. To surmount these obstacles, we devised the Stacked Neural Network Polygenic Risk Score (SNPRS). This novel approach synthesises outputs from multiple neural network models, each calibrated using genetic variants chosen based on diverse p-value thresholds. By doing so, SNPRS captures a broader array of genetic variants, enabling a more nuanced interpretation of the combined effects of these variants. We assessed the efficacy of SNPRS using the UK Biobank data, focusing on the genetic risks associated with breast and prostate cancers, as well as quantitative traits like height and BMI. We also extended our analysis to the Korea Genome and Epidemiology Study (KoGES) dataset. Impressively, our results indicate that SNPRS surpasses traditional PRS models and an isolated deep neural network in terms of accuracy, highlighting its promise in refining the efficacy and relevance of PRS in genetic studies.


Subject(s)
Genetic Predisposition to Disease , Genome-Wide Association Study , Multifactorial Inheritance , Neural Networks, Computer , Polymorphism, Single Nucleotide , Humans , Multifactorial Inheritance/genetics , Genome-Wide Association Study/methods , Female , Male , Prostatic Neoplasms/genetics , Breast Neoplasms/genetics , Risk Factors , Genetic Risk Score
10.
Int J Mol Sci ; 25(9), 2024 Apr 23.
Article in English | MEDLINE | ID: mdl-38731822

ABSTRACT

Our understanding of rare disease genetics has been shaped by a monogenic disease model. While the traditional monogenic disease model has been successful in identifying numerous disease-associated genes and significantly enlarged our knowledge in the field of human genetics, it has limitations in explaining phenomena like phenotypic variability and reduced penetrance. Widening the perspective beyond Mendelian inheritance has the potential to enable a better understanding of disease complexity in rare disorders. Digenic inheritance is the simplest instance of a non-Mendelian disorder, characterized by the functional interplay of variants in two disease-contributing genes. Known digenic disease causes show a range of pathomechanisms underlying digenic interplay, including direct and indirect gene product interactions as well as epigenetic modifications. This review aims to systematically explore the background of digenic inheritance in rare disorders, the approaches and challenges when investigating digenic inheritance, and the current evidence for digenic inheritance in mitochondrial disorders.


Subject(s)
Mitochondrial Diseases , Rare Diseases , Humans , Mitochondrial Diseases/genetics , Rare Diseases/genetics , Genetic Predisposition to Disease , Epigenesis, Genetic , Multifactorial Inheritance/genetics , Animals
11.
PLoS One ; 19(5): e0303610, 2024.
Article in English | MEDLINE | ID: mdl-38758931

ABSTRACT

We have previously shown that polygenic risk scores (PRS) can improve risk stratification of peripheral artery disease (PAD) in a large, retrospective cohort. Here, we evaluate the potential of PRS in improving the detection of PAD and prediction of major adverse cardiovascular and cerebrovascular events (MACCE) and adverse events (AE) in an institutional patient cohort. We created a cohort of 278 patients (52 cases and 226 controls) and fit a PAD-specific PRS based on the weighted sum of risk alleles. We built traditional clinical risk models and machine learning (ML) models using clinical and genetic variables to detect PAD, MACCE, and AE. The models' performances were measured using the area under the curve (AUC), net reclassification index (NRI), integrated discrimination improvement (IDI), and Brier score. We also evaluated the clinical utility of our PAD model using decision curve analysis (DCA). We found a modest, but not statistically significant, improvement in the PAD detection model's AUC with the inclusion of PRS, from 0.902 (95% CI: 0.846-0.957) (clinical variables only) to 0.909 (95% CI: 0.856-0.961) (clinical variables with PRS). The PRS inclusion significantly improved risk re-classification of PAD with an NRI of 0.07 (95% CI: 0.002-0.137), p = 0.04. For our ML model predicting MACCE, the addition of PRS did not significantly improve the AUC; however, NRI analysis demonstrated significant improvement in risk re-classification (p = 2e-05). Decision curve analysis showed higher net benefit of our combined PRS-clinical model across all thresholds of PAD detection. Adding PRS to a clinical PAD-risk model was associated with improvement in risk stratification and clinical utility, although we did not see a significant change in AUC. This result underscores the potential clinical utility of incorporating PRS data into clinical risk models for prevalent PAD and the need for evaluation metrics that can discern the clinical impact of using new biomarkers in smaller populations.
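The abstract describes a PRS "based on the weighted sum of risk alleles". A minimal sketch of that general formula follows, assuming each variant contributes its risk-allele count (0, 1, or 2) multiplied by an effect-size weight. The function name, the weights, and the genotype below are hypothetical and are not taken from the study.

```python
def polygenic_risk_score(allele_counts, weights):
    """Weighted sum of risk-allele counts, one (count, weight) pair per variant."""
    if len(allele_counts) != len(weights):
        raise ValueError("need exactly one weight per variant")
    for count in allele_counts:
        if count not in (0, 1, 2):
            raise ValueError("risk-allele counts must be 0, 1, or 2")
    return sum(count * weight for count, weight in zip(allele_counts, weights))


# Hypothetical per-variant weights (for example, log odds ratios from a GWAS).
example_weights = [0.10, 0.25, 0.05]
# Hypothetical genotype: two risk alleles, none, and one at the three variants.
example_genotype = [2, 0, 1]
example_score = polygenic_risk_score(example_genotype, example_weights)
```

In practice the raw score is usually standardized against a reference population before it is entered into a clinical model.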


Subject(s)
Peripheral Arterial Disease , Humans , Peripheral Arterial Disease/genetics , Peripheral Arterial Disease/diagnosis , Female , Male , Aged , Middle Aged , Risk Assessment/methods , Risk Factors , Machine Learning , Cardiovascular Diseases/genetics , Cardiovascular Diseases/diagnosis , Retrospective Studies , Multifactorial Inheritance/genetics , Case-Control Studies , Area Under Curve , Genetic Risk Score
12.
Nat Commun ; 15(1): 3346, 2024 May 01.
Article in English | MEDLINE | ID: mdl-38693125

ABSTRACT

Endurance exercise training is known to reduce risk for a range of complex diseases. However, the molecular basis of this effect has been challenging to study and largely restricted to analyses of either few or easily biopsied tissues. Extensive transcriptome data collected across 15 tissues during exercise training in rats as part of the Molecular Transducers of Physical Activity Consortium has provided a unique opportunity to clarify how exercise can affect tissue-specific gene expression and further suggest how exercise adaptation may impact complex disease-associated genes. To build this map, we integrate this multi-tissue atlas of gene expression changes with gene-disease targets, genetic regulation of expression, and trait relationship data in humans. Consensus from multiple approaches prioritizes specific tissues and genes where endurance exercise impacts disease-relevant gene expression. Specifically, we identify a total of 5523 trait-tissue-gene triplets to serve as a valuable starting point for future investigations [Exercise; Transcription; Human Phenotypic Variation].


Subject(s)
Gene Expression Regulation , Physical Conditioning, Animal , Animals , Humans , Rats , Transcriptome/genetics , Multifactorial Inheritance/genetics , Exercise/physiology , Male , Phenotype , Quantitative Trait Loci , Gene Expression Profiling
13.
Sci Rep ; 14(1): 12436, 2024 05 30.
Article in English | MEDLINE | ID: mdl-38816422

ABSTRACT

We construct non-linear machine learning (ML) prediction models for systolic and diastolic blood pressure (SBP, DBP) using demographic and clinical variables and polygenic risk scores (PRSs). We developed a two-model ensemble, consisting of a baseline model, where prediction is based on demographic and clinical variables only, and a genetic model, where we also include PRSs. We evaluate the use of a linear versus a non-linear model at both the baseline and the genetic model levels and assess the improvement in performance when incorporating multiple PRSs. We report the ensemble model's performance as percentage variance explained (PVE) on a held-out test dataset. A non-linear baseline model improved the PVEs from 28.1% to 30.1% (SBP) and 14.3% to 17.4% (DBP) compared with a linear baseline model. Including seven PRSs in the genetic model computed based on the largest available GWAS of SBP/DBP improved the genetic model PVE from 4.8% to 5.1% (SBP) and 4.7% to 5% (DBP) compared to using a single PRS. Adding 14 additional PRSs computed based on two independent GWASs further increased the genetic model PVE to 6.3% (SBP) and 5.7% (DBP). PVE differed across self-reported race/ethnicity groups, with primarily all non-White groups benefitting from the inclusion of additional PRSs. In summary, non-linear ML models improve BP prediction in models incorporating diverse populations.
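Percentage variance explained on a held-out set is commonly computed as 100 times one minus the ratio of residual to total sum of squares. The sketch below assumes that definition; the paper's exact PVE formula is not given in the abstract, and the blood-pressure values are made up.

```python
def percent_variance_explained(observed, predicted):
    """100 * (1 - SS_residual / SS_total), evaluated on held-out data."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    mean_obs = sum(observed) / len(observed)
    ss_total = sum((y - mean_obs) ** 2 for y in observed)
    if ss_total == 0:
        raise ValueError("observed values have zero variance")
    ss_residual = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 100.0 * (1.0 - ss_residual / ss_total)


# Hypothetical held-out systolic blood pressures (mmHg).
sbp_observed = [120.0, 130.0, 140.0]
pve_perfect = percent_variance_explained(sbp_observed, [120.0, 130.0, 140.0])
pve_mean_only = percent_variance_explained(sbp_observed, [130.0, 130.0, 130.0])
```

Under this definition a model that always predicts the test-set mean scores 0%, and a model can score below 0% if it predicts worse than that mean.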


Subject(s)
Blood Pressure , Genome-Wide Association Study , Machine Learning , Multifactorial Inheritance , Phenotype , Humans , Blood Pressure/genetics , Multifactorial Inheritance/genetics , Genome-Wide Association Study/methods , Risk Factors , Male , Female , Genetic Predisposition to Disease , Models, Genetic , Hypertension/genetics , Hypertension/physiopathology , Middle Aged , Genetic Risk Score
14.
Nat Commun ; 15(1): 4433, 2024 May 29.
Article in English | MEDLINE | ID: mdl-38811555

ABSTRACT

Dominance heritability in complex traits has received increasing recognition. However, most polygenic score (PGS) approaches do not incorporate non-additive effects. Here, we present GenoBoost, a flexible PGS modeling framework capable of considering both additive and non-additive effects, specifically focusing on genetic dominance. Building on statistical boosting theory, we derive provably optimal GenoBoost scores and provide its efficient implementation for analyzing large-scale cohorts. We benchmark it against seven commonly used PGS methods and demonstrate its competitive predictive performance. GenoBoost is ranked the best for four traits and second-best for three traits among twelve tested disease outcomes in UK Biobank. We reveal that GenoBoost improves prediction for autoimmune diseases by incorporating non-additive effects localized in the MHC locus and, more broadly, works best in less polygenic traits. We further demonstrate that GenoBoost can infer the mode of genetic inheritance without requiring prior knowledge. For example, GenoBoost finds non-zero genetic dominance effects for 602 of 900 selected genetic variants, resulting in 2.5% improvements in predicting psoriasis cases. Lastly, we show that GenoBoost can prioritize genetic loci with genetic dominance not previously reported in the GWAS catalog. Our results highlight the increased accuracy and biological insights from incorporating non-additive effects in PGS models.
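For readers unfamiliar with non-additive effects, the sketch below shows how a single variant's contribution to a score differs under the textbook additive, dominant, and recessive inheritance models. It is a generic illustration and not the GenoBoost algorithm; the function name and effect size are hypothetical.

```python
def genotype_scores(mode, effect):
    """Per-genotype score contributions for 0, 1, and 2 copies of the effect allele."""
    if mode == "additive":
        # Each copy of the allele adds the same amount.
        return (0.0, effect, 2 * effect)
    if mode == "dominant":
        # One copy is enough for the full effect.
        return (0.0, effect, effect)
    if mode == "recessive":
        # Only two copies produce the effect.
        return (0.0, 0.0, effect)
    raise ValueError(f"unknown inheritance mode: {mode}")


# Hypothetical effect size for one variant.
additive = genotype_scores("additive", 0.3)
dominant = genotype_scores("dominant", 0.3)
recessive = genotype_scores("recessive", 0.3)
```

A purely additive score forces the heterozygote to sit exactly halfway between the two homozygotes, which is the constraint that dominance-aware methods relax.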


Subject(s)
Genome-Wide Association Study , Models, Genetic , Multifactorial Inheritance , Multifactorial Inheritance/genetics , Humans , Genome-Wide Association Study/methods , Polymorphism, Single Nucleotide , Genetic Predisposition to Disease , Autoimmune Diseases/genetics , Genes, Dominant , Psoriasis/genetics
15.
Nat Genet ; 56(5): 819-826, 2024 May.
Article in English | MEDLINE | ID: mdl-38741014

ABSTRACT

We performed genome-wide association studies of breast cancer including 18,034 cases and 22,104 controls of African ancestry. Genetic variants at 12 loci were associated with breast cancer risk (P < 5 × 10⁻⁸), including associations of a low-frequency missense variant rs61751053 in ARHGEF38 with overall breast cancer (odds ratio (OR) = 1.48) and a common variant rs76664032 at chromosome 2q14.2 with triple-negative breast cancer (TNBC) (OR = 1.30). Approximately 15.4% of cases with TNBC carried six risk alleles in three genome-wide association study-identified TNBC risk variants, with an OR of 4.21 (95% confidence interval = 2.66-7.03) compared with those carrying fewer than two risk alleles. A polygenic risk score (PRS) showed an area under the receiver operating characteristic curve of 0.60 for the prediction of breast cancer risk, which outperformed PRS derived using data from females of European ancestry. Our study markedly increases the population diversity in genetic studies for breast cancer and demonstrates the utility of PRS for risk prediction in females of African ancestry.


Subject(s)
Black People , Breast Neoplasms , Genetic Predisposition to Disease , Genome-Wide Association Study , Polymorphism, Single Nucleotide , Humans , Female , Genome-Wide Association Study/methods , Breast Neoplasms/genetics , Black People/genetics , Case-Control Studies , Risk Factors , Triple Negative Breast Neoplasms/genetics , Alleles , Multifactorial Inheritance/genetics , Middle Aged , Genetic Loci , White People/genetics
16.
Sci Rep ; 14(1): 12586, 2024 06 01.
Article in English | MEDLINE | ID: mdl-38822050

ABSTRACT

Frailty is a complex trait. Twin studies and high-powered genome-wide association studies conducted in the UK Biobank have demonstrated a strong genetic basis of frailty. The present study utilized summary statistics from a genome-wide association study on the Frailty Index (FI) to create and test the predictive power of frailty polygenic risk scores (PRS) in two independent samples of participants aged 67-84 years: the Lothian Birth Cohort 1936 (LBC1936) and the English Longitudinal Study of Ageing (ELSA). Multiple regression models were built to test the predictive power of frailty PRS at five time points. Frailty PRS significantly predicted frailty, measured via the FI, at all time points in LBC1936 and ELSA, explaining 2.1% (β = 0.15, 95% CI, 0.085-0.21) and 1.8% (β = 0.14, 95% CI, 0.10-0.17) of the variance, respectively, at age ~68/~70 years (p < 0.001). This work demonstrates that frailty PRS can predict frailty in two independent cohorts, particularly at early ages (~68/~70). PRS have the potential to be valuable instruments for identifying those at risk for frailty and could be important for controlling for genetic confounders in epidemiological studies.


Subject(s)
Aging , Frailty , Genome-Wide Association Study , Multifactorial Inheritance , Humans , Aged , Frailty/genetics , Longitudinal Studies , Aged, 80 and over , Female , Male , Multifactorial Inheritance/genetics , Aging/genetics , Birth Cohort , Risk Factors , England/epidemiology , Genetic Risk Score
18.
PLoS One ; 19(4): e0298906, 2024.
Article in English | MEDLINE | ID: mdl-38625909

ABSTRACT

Detecting epistatic drivers of human phenotypes is a considerable challenge. Traditional approaches use regression to sequentially test multiplicative interaction terms involving pairs of genetic variants. For higher-order interactions and genome-wide large-scale data, this strategy is computationally intractable. Moreover, multiplicative terms used in regression modeling may not capture the form of biological interactions. Building on the Predictability, Computability, Stability (PCS) framework, we introduce the epiTree pipeline to extract higher-order interactions from genomic data using tree-based models. The epiTree pipeline first selects a set of variants derived from tissue-specific estimates of gene expression. Next, it uses iterative random forests (iRF) to search training data for candidate Boolean interactions (pairwise and higher-order). We derive significance tests for interactions, based on a stabilized likelihood ratio test, by simulating Boolean tree-structured null (no epistasis) and alternative (epistasis) distributions on hold-out test data. Finally, our pipeline computes PCS epistasis p-values that probabilistically quantify improvement in prediction accuracy via bootstrap sampling on the test set. We validate the epiTree pipeline in two case studies using data from the UK Biobank: predicting red hair and multiple sclerosis (MS). In the case of predicting red hair, epiTree recovers known epistatic interactions surrounding MC1R and novel interactions, representing non-linearities not captured by logistic regression models. In the case of predicting MS, a more complex phenotype than red hair, epiTree rankings prioritize novel interactions surrounding HLA-DRB1, a variant previously associated with MS in several populations. Taken together, these results highlight the potential for epiTree rankings to help reduce the design space for follow-up experiments.


Subject(s)
Epistasis, Genetic , Genome-Wide Association Study , Humans , Genome-Wide Association Study/methods , Phenotype , Multifactorial Inheritance/genetics , Logistic Models , Polymorphism, Single Nucleotide
19.
Elife ; 12, 2024 04 19.
Article in English | MEDLINE | ID: mdl-38639992

ABSTRACT

We propose a new framework for human genetic association studies: at each locus, a deep learning model (in this study, Sei) is used to calculate the functional genomic activity score for two haplotypes per individual. This score, defined as the Haplotype Function Score (HFS), replaces the original genotype in association studies. Applying the HFS framework to 14 complex traits in the UK Biobank, we identified 3619 independent HFS-trait associations with a significance of p < 5 × 10⁻⁸. Fine-mapping revealed 2699 causal associations, corresponding to a median increase of 63 causal findings per trait compared with single-nucleotide polymorphism (SNP)-based analysis. HFS-based enrichment analysis uncovered 727 pathway-trait associations and 153 tissue-trait associations with strong biological interpretability, including 'circadian pathway-chronotype' and 'arachidonic acid-intelligence'. Lastly, we applied least absolute shrinkage and selection operator (LASSO) regression to integrate HFS prediction score with SNP-based polygenic risk scores, which showed an improvement of 16.1-39.8% in cross-ancestry polygenic prediction. We concluded that HFS is a promising strategy for understanding the genetic basis of human complex traits.


Scattered throughout the human genome are variations in the genetic code that make individuals more or less likely to develop certain traits. To identify these variants, scientists carry out genome-wide association studies (GWAS), which compare the DNA variants of large groups of people with and without the trait of interest. This method has been able to find the underlying genes for many human diseases, but it has limitations. For instance, some variations are linked together due to where they are positioned within DNA, which can result in GWAS falsely reporting associations between genetic variants and traits. This phenomenon, known as linkage disequilibrium, can be avoided by analyzing functional genomics, which looks at the multiple ways a gene's activity can be influenced by a variation. These include how the gene is copied and decoded into proteins and RNA molecules, and the rate at which these products are generated. Researchers can now use an artificial intelligence technique called deep learning to generate functional genomic data from a particular DNA sequence. Here, Song et al. used one of these deep learning models to calculate the functional genomics of haplotypes, groups of genetic variants inherited from one parent. The approach was applied to DNA samples from over 350 thousand individuals included in the UK Biobank. An activity score, defined as the haplotype function score (or HFS for short), was calculated for at least two haplotypes per individual, and then compared to various complex traits like height or bone density. Song et al. found that the HFS framework was better at finding links between genes and specific traits than existing methods. It also provided more information on the biology that may be underpinning these outcomes. Although more work is needed to reduce the computer processing times required to calculate the HFS, Song et al. believe that their new method has the potential to improve the way researchers identify links between genes and human traits.


Subject(s)
Multifactorial Inheritance , Quantitative Trait Loci , Humans , Haplotypes , Multifactorial Inheritance/genetics , Genome-Wide Association Study , Polymorphism, Single Nucleotide , Phenotype
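
The LASSO step mentioned in the abstract above (combining an HFS-based prediction score with a SNP-based polygenic risk score) can be sketched as a small coordinate-descent solver. This is a generic textbook LASSO under stated assumptions, not the authors' code: the column names `prs` and `hfs`, the toy data, and the choice to omit an intercept (predictors and outcome assumed centered) are hypothetical.

```python
def soft_threshold(value, lam):
    """Shrink value toward zero by lam; values within [-lam, lam] become 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic
    coordinate descent. X is a list of rows; no intercept is fitted."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # covariance of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                partial = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
                z += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (z / n) if z > 0 else 0.0
    return beta


# Hypothetical, centered toy scores for four individuals.
prs = [1.0, -1.0, 1.0, -1.0]   # SNP-based polygenic risk score
hfs = [1.0, 1.0, -1.0, -1.0]   # HFS-based prediction score
X = [[a, b] for a, b in zip(prs, hfs)]
y = [a + 0.5 * b for a, b in zip(prs, hfs)]  # noiseless toy outcome

weights = lasso_coordinate_descent(X, y, lam=0.25)
```

As the penalty `lam` grows, the weaker predictor is dropped first: its weight is shrunk to exactly zero, which is the selection behaviour that distinguishes LASSO from ridge regression.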
20.
Sci Rep ; 14(1): 9642, 2024 04 26.
Article in English | MEDLINE | ID: mdl-38671065

ABSTRACT

Chronic kidney disease (CKD) is a complex disorder that causes a gradual loss of kidney function, affecting approximately 9.1% of the world's population. Here, we use a soft-clustering algorithm to deconstruct its genetic heterogeneity. First, we selected 322 CKD-associated independent genetic variants from published genome-wide association studies (GWAS) and added association results for 229 traits from the GWAS catalog. We then applied nonnegative matrix factorization (NMF) to discover overlapping clusters of related traits and variants. We computed cluster-specific polygenic scores and validated each cluster with a phenome-wide association study (PheWAS) on the BioMe biobank (n = 31,701). NMF identified nine clusters that reflect different aspects of CKD, with the top-weighted traits signifying areas such as kidney function, type 2 diabetes (T2D), and body weight. For most clusters, the top-weighted traits were confirmed in the PheWAS analysis. Results were more significant in the cross-ancestry analysis, although significant ancestry-specific associations were also identified. While all alleles were associated with decreased kidney function, associations with CKD-related diseases (e.g., T2D) were found only for a smaller subset of variants and differed across genetic ancestry groups. Our findings leverage genetics to gain insights into the underlying biology of CKD and investigate population-specific associations.


Subject(s)
Genome-Wide Association Study , Phenotype , Renal Insufficiency, Chronic , Humans , Renal Insufficiency, Chronic/genetics , Renal Insufficiency, Chronic/pathology , Cluster Analysis , Multifactorial Inheritance/genetics , Genetic Predisposition to Disease , Polymorphism, Single Nucleotide , Algorithms , Diabetes Mellitus, Type 2/genetics , Male , Female
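
The soft-clustering idea in the abstract above (NMF producing overlapping clusters of variants and traits) can be illustrated with the standard multiplicative-update algorithm. This is a minimal sketch, assuming a small non-negative variant-by-trait matrix; the toy data, the function names, and the choice of the Frobenius-norm objective are assumptions for illustration and are not taken from the study.

```python
import random


def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def frobenius_error(V, W, H):
    """Squared Frobenius norm of V - W H."""
    WH = matmul(W, H)
    return sum((v - wh) ** 2 for rv, rwh in zip(V, WH) for v, wh in zip(rv, rwh))


def nmf(V, k, n_iter=200, seed=0, eps=1e-9):
    """Factorise non-negative V (n x m) as W (n x k) times H (k x m)
    using multiplicative updates for the Frobenius objective."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(n_iter):
        # Update H: H <- H * (W^T V) / (W^T W H)
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # Update W: W <- W * (V H^T) / (W H H^T)
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return W, H


# Hypothetical toy matrix: rows are variants, columns are traits.
# The last variant contributes to both trait groups (overlapping membership).
V = [
    [2.0, 2.0, 0.0, 0.0],
    [3.0, 3.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 4.0, 4.0],
    [1.0, 1.0, 1.0, 1.0],
]
W, H = nmf(V, k=2)
```

Each row of `W` gives a variant's weight in every cluster and each row of `H` gives the trait weights for one cluster. Because a variant can carry non-zero weight in several clusters at once, the clustering is "soft" rather than a hard partition.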