Results 1 - 3 of 3
1.
Sci Rep; 10(1): 12055, 2020 Jul 21.
Article in English | MEDLINE | ID: mdl-32694572

ABSTRACT

Genomic prediction of complex human traits (e.g., height, cognitive ability, bone density) and disease risks (e.g., breast cancer, diabetes, heart disease, atrial fibrillation) has advanced considerably in recent years. Using data from the UK Biobank, predictors have been constructed using penalized algorithms that favor sparsity, i.e., that use as few genetic variants as possible. We analyze the specific genetic variants (SNPs) utilized in these predictors, whose number can vary from dozens to as many as thirty thousand. We find that the fraction of SNPs in or near genic regions varies widely by phenotype. For the majority of disease conditions studied, a large amount of the variance is accounted for by SNPs outside of coding regions. The state of these SNPs cannot be determined from exome-sequencing data. This suggests that exome data alone will miss much of the heritability for these traits; i.e., existing polygenic risk scores (PRS) cannot be computed from exome data alone. We also study the fraction of SNPs and of variance that is in common between pairs of predictors. The DNA regions used in the disease risk predictors constructed so far seem to be largely disjoint (with a few interesting exceptions), suggesting that individual genetic disease risks are largely uncorrelated. It seems possible in theory for an individual to be a low-risk outlier in all conditions simultaneously.
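
As a rough illustration of the pairwise comparison described in the abstract, the Python sketch below computes the fraction of one predictor's SNPs that also appear in another, and the fraction of its weight carried by those shared SNPs. The SNP identifiers, effect sizes, and the function name `overlap_fractions` are hypothetical. Squared effect size is used only as a crude stand-in for variance accounted for, since the abstract does not state the paper's own variance measure.

```python
def overlap_fractions(pred_a, pred_b):
    """Return (shared SNP fraction of A, shared weight fraction of A).

    pred_a, pred_b: dicts mapping SNP id -> effect size (hypothetical inputs).
    Squared effect size is an assumed proxy for variance accounted for.
    """
    shared = set(pred_a) & set(pred_b)
    snp_fraction = len(shared) / len(pred_a)
    total_weight = sum(beta ** 2 for beta in pred_a.values())
    shared_weight = sum(pred_a[snp] ** 2 for snp in shared)
    return snp_fraction, shared_weight / total_weight


# Made-up sparse predictors for two conditions.
predictor_a = {"rs1": 1.0, "rs2": 1.0, "rs3": 2.0, "rs4": 1.0, "rs5": 1.0}
predictor_b = {"rs3": 0.5, "rs9": 0.7}

snp_frac, weight_frac = overlap_fractions(predictor_a, predictor_b)
print(snp_frac, weight_frac)
```

In this toy case only one of five SNPs is shared, yet it carries half of the first predictor's weight, which is why the abstract reports overlap in both SNPs and variance.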


Subject(s)
Genetic Association Studies; Genetic Predisposition to Disease; Models, Genetic; Multifactorial Inheritance; Quantitative Trait, Heritable; Algorithms; Cluster Analysis; Humans; Polymorphism, Single Nucleotide; Exome Sequencing
2.
Sci Rep; 9(1): 17515, 2019 Nov 20.
Article in English | MEDLINE | ID: mdl-31748697

ABSTRACT

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

3.
Sci Rep; 9(1): 15286, 2019 Oct 25.
Article in English | MEDLINE | ID: mdl-31653892

ABSTRACT

We construct risk predictors using polygenic scores (PGS) computed from common single nucleotide polymorphisms (SNPs) for a number of complex disease conditions, using L1-penalized regression (also known as LASSO) on case-control data from UK Biobank. Among the disease conditions studied are Hypothyroidism, (Resistant) Hypertension, Type 1 and 2 Diabetes, Breast Cancer, Prostate Cancer, Testicular Cancer, Gallstones, Glaucoma, Gout, Atrial Fibrillation, High Cholesterol, Asthma, Basal Cell Carcinoma, Malignant Melanoma, and Heart Attack. We obtain values for the area under the receiver operating characteristic curve (AUC) in the range ~0.58-0.71 using SNP data alone. Substantially higher predictor AUCs are obtained when incorporating additional variables such as age and sex. Some SNP predictors alone are sufficient to identify outliers (e.g., in the 99th percentile of PGS) with 3-8 times higher risk than typical individuals. We validate predictors out-of-sample using the eMERGE dataset, and also with different ancestry subgroups within the UK Biobank population. Our results indicate that substantial improvements in predictive power are attainable using training sets with larger case populations. We anticipate rapid improvement in genomic prediction as more case-control data become available for analysis.
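
To make the quantities in the abstract concrete, the Python sketch below computes a polygenic score as a weighted sum of allele counts and estimates the AUC as the probability that a randomly chosen case scores higher than a randomly chosen control. The effect sizes, genotypes, and function names are invented for illustration. Fitting the weights (the L1-penalized regression step) is not shown, and nothing here reproduces the paper's data or results.

```python
def polygenic_score(genotype, weights):
    """Weighted sum of allele counts (0, 1, or 2) over the predictor's SNPs."""
    return sum(w * g for w, g in zip(weights, genotype))


def auc(case_scores, control_scores):
    """Probability that a random case outscores a random control.

    Ties count as half a win (the rank-based definition of AUC).
    """
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical effect sizes for a three-SNP predictor.
weights = [1.0, 2.0, -1.0]

# Made-up genotypes: allele counts per SNP for each individual.
cases = [[2, 1, 0], [1, 1, 1], [0, 0, 0]]
controls = [[0, 0, 1], [1, 0, 0], [0, 1, 2]]

case_scores = [polygenic_score(g, weights) for g in cases]
control_scores = [polygenic_score(g, weights) for g in controls]
print(auc(case_scores, control_scores))
```

The pairwise loop is quadratic in sample size and is written for readability; at biobank scale a rank-based computation would be the more practical choice.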


Subject(s)
Breast Neoplasms/genetics; Diabetes Mellitus, Type 1/genetics; Diabetes Mellitus, Type 2/genetics; Genomics/methods; Myocardial Infarction/genetics; Prostatic Neoplasms/genetics; Algorithms; Breast Neoplasms/diagnosis; Case-Control Studies; Diabetes Mellitus, Type 1/diagnosis; Diabetes Mellitus, Type 2/diagnosis; Female; Genetic Predisposition to Disease/genetics; Humans; Male; Models, Genetic; Multifactorial Inheritance; Myocardial Infarction/diagnosis; Polymorphism, Single Nucleotide; Prognosis; Prostatic Neoplasms/diagnosis; ROC Curve; Risk Assessment/methods; Risk Assessment/statistics & numerical data; Risk Factors