Results 1 - 7 of 7
2.
Sci Rep ; 13(1): 11662, 2023 07 19.
Article in English | MEDLINE | ID: mdl-37468507

ABSTRACT

In this paper we characterize the performance of linear models trained via widely-used sparse machine learning algorithms. We build polygenic scores and examine performance as a function of training set size, genetic ancestral background, and training method. We show that predictor performance is most strongly dependent on size of training data, with smaller gains from algorithmic improvements. We find that LASSO generally performs as well as the best methods, judged by a variety of metrics. We also investigate performance characteristics of predictors trained on one genetic ancestry group when applied to another. Using LASSO, we develop a novel method for projecting AUC and correlation as a function of data size (i.e., for new biobanks) and characterize the asymptotic limit of performance. Additionally, for LASSO (compressed sensing) we show that performance metrics and predictor sparsity are in agreement with theoretical predictions from the Donoho-Tanner phase transition. Specifically, a future predictor trained in the Taiwan Precision Medicine Initiative for asthma can achieve an AUC of [Formula: see text] and for height a correlation of [Formula: see text] for a Taiwanese population. This is above the measured values of [Formula: see text] and [Formula: see text], respectively, for UK Biobank trained predictors applied to a European population.
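As a rough illustration of the kind of sparse linear predictor the abstract describes, the sketch below fits a LASSO by cyclic coordinate descent on simulated data. It is a minimal toy under stated assumptions, not the authors' pipeline: the genotypes, effect sizes, noise level, penalty `alpha`, and all function names are hypothetical.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero; anything within the band becomes exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_sweeps=200):
    """Minimize (1/2n)*||y - X b||^2 + alpha*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X b while b is all zeros
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of column j with the residual that excludes j's own term.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)) / n
            updated = soft_threshold(rho, alpha) / col_sq[j]
            delta = updated - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = updated
    return beta


# Simulated data (hypothetical): 200 individuals, 30 SNPs coded 0/1/2, 3 causal SNPs.
rng = random.Random(0)
n_samples, n_snps = 200, 30
causal_effects = {2: 1.0, 11: -0.8, 25: 0.6}

genotypes = [[float(rng.choice((0, 1, 2))) for _ in range(n_snps)] for _ in range(n_samples)]
col_means = [sum(row[j] for row in genotypes) / n_samples for j in range(n_snps)]
X = [[row[j] - col_means[j] for j in range(n_snps)] for row in genotypes]  # centered columns

raw_y = [
    sum(effect * row[j] for j, effect in causal_effects.items()) + rng.gauss(0.0, 0.5)
    for row in X
]
y_mean = sum(raw_y) / n_samples
y = [value - y_mean for value in raw_y]  # centered phenotype, so no intercept is needed

beta = lasso_coordinate_descent(X, y, alpha=0.1)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print("SNPs with non-zero weight:", selected)
```

The L1 penalty drives most weights to exactly zero, which is what makes the resulting predictor sparse; repeating the fit at several values of `n_samples` would give a toy version of a performance-versus-training-size curve.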


Subjects
Asthma , Biological Specimen Banks , Humans , Machine Learning , Forecasting , Algorithms
3.
Sci Rep ; 13(1): 376, 2023 01 07.
Article in English | MEDLINE | ID: mdl-36611071

ABSTRACT

We use UK Biobank and a unique IVF family dataset (including genotyped embryos) to investigate sibling variation in both phenotype and genotype. We compare phenotype (disease status, height, blood biomarkers) and genotype (polygenic scores, polygenic health index) distributions among siblings to those in the general population. As expected, the between-siblings standard deviation in polygenic scores is [Formula: see text] times smaller than in the general population, but variation is still significant. As previously demonstrated, this allows for substantial benefit from polygenic screening in IVF. Differences in sibling genotypes result from distinct recombination patterns in sexual reproduction. We develop a novel sibling-pair method for detection of recombination breaks via statistical discontinuities. The new method is used to construct a dataset of 1.44 million recombination events which may be useful in further study of meiosis.


Subjects
Multifactorial Inheritance , Siblings , Humans , Multifactorial Inheritance/genetics , Biological Specimen Banks , Genotype , Phenotype , Recombination, Genetic , United Kingdom/epidemiology , DNA , Fertilization in Vitro , Genome-Wide Association Study
5.
Sci Rep ; 12(1): 18173, 2022 10 28.
Article in English | MEDLINE | ID: mdl-36307513

ABSTRACT

We construct a polygenic health index as a weighted sum of polygenic risk scores for 20 major disease conditions, including, e.g., coronary artery disease, type 1 and 2 diabetes, schizophrenia, etc. Individual weights are determined by population-level estimates of impact on life expectancy. We validate this index in odds ratios and selection experiments using unrelated individuals and siblings (pairs and trios) from the UK Biobank. Individuals with higher index scores have decreased disease risk across almost all 20 diseases (no significant risk increases), and longer calculated life expectancy. When estimated Disability Adjusted Life Years (DALYs) are used as the performance metric, the gain from selection among ten individuals (highest index score vs average) is found to be roughly 4 DALYs. We find no statistical evidence for antagonistic trade-offs in risk reduction across these diseases. Correlations between genetic disease risks are found to be mostly positive and generally mild. These results have important implications for public health and also for fundamental issues such as pleiotropy and genetic architecture of human disease conditions.
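The weighted-sum construction described in this abstract can be sketched as follows. This is only an illustration: the disease names, weights, and scores are hypothetical, and the sign convention (negating the sum so that a higher index means lower overall risk) is an assumption of the sketch rather than something stated in the abstract.

```python
from typing import Mapping


def health_index(prs: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted sum of per-disease polygenic risk scores.

    Assumption: a higher PRS means higher disease risk, so the sum is negated
    to make a higher index correspond to lower overall risk.
    """
    missing = set(weights) - set(prs)
    if missing:
        raise KeyError(f"missing PRS for: {sorted(missing)}")
    return -sum(weights[disease] * prs[disease] for disease in weights)


# Hypothetical weights standing in for life-expectancy impact (not the paper's values).
weights = {"coronary_artery_disease": 2.0, "type_2_diabetes": 1.0, "schizophrenia": 0.5}

# Hypothetical standardized PRS values for two individuals.
low_risk = {"coronary_artery_disease": -1.0, "type_2_diabetes": -0.5, "schizophrenia": 0.0}
high_risk = {"coronary_artery_disease": 1.0, "type_2_diabetes": 0.5, "schizophrenia": 0.0}

candidates = {"low_risk": low_risk, "high_risk": high_risk}
best = max(candidates, key=lambda name: health_index(candidates[name], weights))
print(best, health_index(candidates[best], weights))
```

Selecting the candidate with the highest index, as in the last lines, mirrors the "highest index score vs average" comparison in spirit only; the paper's DALY estimates are not reproduced here.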


Subjects
Diabetes Mellitus, Type 1 , Diabetes Mellitus, Type 2 , Humans , Siblings , Multifactorial Inheritance , Life Expectancy , Risk Reduction Behavior , Risk Factors
6.
Methods Mol Biol ; 2467: 421-446, 2022.
Article in English | MEDLINE | ID: mdl-35451785

ABSTRACT

Decoding the genome confers the capability to predict characteristics of the organism (phenotype) from DNA (genotype). We describe the present status and future prospects of genomic prediction of complex traits in humans. Some highly heritable complex phenotypes such as height and other quantitative traits can already be predicted with reasonable accuracy from DNA alone. For many diseases, including important common conditions such as coronary artery disease, breast cancer, type I and II diabetes, individuals with outlier polygenic scores (e.g., top few percent) have been shown to have 5 or even 10 times higher risk than average. Several psychiatric conditions such as schizophrenia and autism also fall into this category. We discuss related topics such as the genetic architecture of complex traits, sibling validation of polygenic scores, and applications to adult health, in vitro fertilization (embryo selection), and genetic engineering.


Subjects
Genome-Wide Association Study , Multifactorial Inheritance , Genomics , Genotype , Humans , Models, Genetic , Phenotype , Polymorphism, Single Nucleotide
7.
Genes (Basel) ; 12(7)2021 06 29.
Article in English | MEDLINE | ID: mdl-34209487

ABSTRACT

We use UK Biobank data to train predictors for 65 blood and urine markers such as HDL, LDL, lipoprotein A, glycated haemoglobin, etc. from SNP genotype. For example, our Polygenic Score (PGS) predictor correlates ∼0.76 with lipoprotein A level, which is highly heritable and an independent risk factor for heart disease. This may be the most accurate genomic prediction of a quantitative trait that has yet been produced (specifically, for European ancestry groups). We also train predictors of common disease risk using blood and urine biomarkers alone (no DNA information); we call these predictors biomarker risk scores, BMRS. Individuals who are at high risk (e.g., odds ratio of >5× population average) can be identified for conditions such as coronary artery disease (AUC∼0.75), diabetes (AUC∼0.95), hypertension, liver and kidney problems, and cancer using biomarkers alone. Our atherosclerotic cardiovascular disease (ASCVD) predictor uses ∼10 biomarkers and performs in UKB evaluation as well as or better than the American College of Cardiology ASCVD Risk Estimator, which uses quite different inputs (age, diagnostic history, BMI, smoking status, statin usage, etc.). We compare polygenic risk scores (risk conditional on genotype: PRS) for common diseases to the risk predictors which result from the concatenation of learned functions BMRS and PGS, i.e., applying the BMRS predictors to the PGS output.
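The "concatenation of learned functions" in the last sentence amounts to function composition: a PGS maps genotype to predicted biomarker levels, and a BMRS maps biomarker levels to disease risk. The sketch below shows that composition only. The linear PGS, the logistic form of the BMRS, and every coefficient are assumptions made for illustration; the abstract does not specify the model forms.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def make_linear_pgs(weights: Sequence[Vector], intercepts: Vector) -> Callable[[Vector], List[float]]:
    """Return a function mapping a genotype vector to predicted biomarker levels."""
    def predict(genotype: Vector) -> List[float]:
        return [
            intercept + sum(w * g for w, g in zip(row, genotype))
            for row, intercept in zip(weights, intercepts)
        ]
    return predict


def make_logistic_bmrs(coefs: Vector, intercept: float) -> Callable[[Vector], float]:
    """Return a function mapping biomarker levels to a risk in (0, 1)."""
    def risk(biomarkers: Vector) -> float:
        z = intercept + sum(c * m for c, m in zip(coefs, biomarkers))
        return 1.0 / (1.0 + math.exp(-z))
    return risk


def compose(bmrs: Callable[[Vector], float], pgs: Callable[[Vector], List[float]]) -> Callable[[Vector], float]:
    """Apply the BMRS predictor to the PGS output: genotype -> biomarkers -> risk."""
    return lambda genotype: bmrs(pgs(genotype))


# Hypothetical models: two SNPs, two biomarkers.
pgs = make_linear_pgs(weights=[[0.5, 0.0], [0.0, 0.25]], intercepts=[0.0, 0.0])
bmrs = make_logistic_bmrs(coefs=[1.0, -1.0], intercept=-1.0)
genetic_risk = compose(bmrs, pgs)

print(genetic_risk([2.0, 0.0]), genetic_risk([0.0, 0.0]))
```

The composed predictor takes genotype alone as input, which is what makes it comparable to a directly trained polygenic risk score.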


Subjects
Atherosclerosis/epidemiology , Biomarkers/blood , Biomarkers/urine , Cardiovascular Diseases/epidemiology , Lipoprotein(a)/blood , Adult , Atherosclerosis/blood , Atherosclerosis/urine , Biological Specimen Banks , Calcium/blood , Calcium/urine , Cardiovascular Diseases/blood , Female , Heart Disease Risk Factors , Hemoglobins/genetics , Humans , Lipoproteins, HDL/blood , Lipoproteins, LDL/blood , Machine Learning , Male , Middle Aged , Multifactorial Inheritance/genetics , Risk Assessment , United Kingdom/epidemiology , United States/epidemiology