1.
Nat Commun ; 15(1): 989, 2024 Feb 02.
Article in English | MEDLINE | ID: mdl-38307861

ABSTRACT

Proteogenomics studies generate hypotheses on protein function and provide genetic evidence for drug target prioritization. Most previous work has been conducted using affinity-based proteomics approaches. These technologies face challenges, such as uncertainty regarding target identity, non-specific binding, and handling of variants that affect epitope affinity binding. Mass spectrometry-based proteomics can overcome some of these challenges. Here we report a pQTL study using the Proteograph™ Product Suite workflow (Seer, Inc.) where we quantify over 18,000 unique peptides from nearly 3000 proteins in more than 320 blood samples from a multi-ethnic cohort in a bottom-up, peptide-centric, mass spectrometry-based proteomics approach. We identify 184 protein-altering variants in 137 genes that are significantly associated with their corresponding variant peptides, confirming target specificity of co-associated affinity binders, identifying putatively causal cis-encoded proteins and providing experimental evidence for their presence in blood, including proteins that may be inaccessible to affinity-based proteomics.


Subjects
Proteogenomics , Proteomics , Humans , Proteomics/methods , Mass Spectrometry/methods , Proteins/analysis , Peptides/analysis , Proteogenomics/methods , Mutant Proteins
2.
Am J Hum Genet ; 108(12): 2354-2367, 2021 Dec 02.
Article in English | MEDLINE | ID: mdl-34822764

ABSTRACT

Whole-genome sequencing studies applied to large populations or biobanks with extensive phenotyping raise new analytic challenges. The need to consider many variants at a locus or group of genes simultaneously and the potential to study many correlated phenotypes with shared genetic architecture provide opportunities for discovery not addressed by the traditional one variant, one phenotype association study. Here, we introduce a Bayesian model comparison approach called MRP (multiple rare variants and phenotypes) for rare-variant association studies that considers correlation, scale, and direction of genetic effects across a group of genetic variants, phenotypes, and studies, requiring only summary statistic data. We apply our method to exome sequencing data (n = 184,698) across 2,019 traits from the UK Biobank, aggregating signals in genes. MRP demonstrates an ability to recover signals such as associations between PCSK9 and LDL cholesterol levels. We additionally find MRP effective in conducting meta-analyses in exome data. Non-biomarker findings include associations between MC1R and red hair color and skin color, IL17RA and monocyte count, and IQGAP2 and mean platelet volume. Finally, we apply MRP in a multi-phenotype setting; after clustering the 35 biomarker phenotypes based on genetic correlation estimates, we find that joint analysis of these phenotypes results in substantial power gains for gene-trait associations, such as in TNFRSF13B in one of the clusters containing diabetes- and lipid-related traits. Overall, we show that the MRP model comparison approach improves upon useful features from widely used meta-analysis approaches for rare-variant association analyses and prioritizes protective modifiers of disease risk.


Subjects
Genetic Variation , Genome-Wide Association Study , Models, Genetic , Bayes Theorem , Female , Humans , Male , Phenotype
4.
Eur J Hum Genet ; 29(7): 1071-1081, 2021 Jul.
Article in English | MEDLINE | ID: mdl-33558700

ABSTRACT

Polygenic risk models have led to significant advances in understanding complex diseases and their clinical presentation. While polygenic risk scores (PRS) can effectively predict outcomes, they do not generally account for disease subtypes or pathways which underlie within-trait diversity. Here, we introduce a latent factor model of genetic risk based on components from Decomposition of Genetic Associations (DeGAs), which we call the DeGAs polygenic risk score (dPRS). We compute DeGAs using genetic associations for 977 traits and find that dPRS performs comparably to standard PRS while offering greater interpretability. We show how to decompose an individual's genetic risk for a trait across DeGAs components, with examples for body mass index (BMI) and myocardial infarction (heart attack) in 337,151 white British individuals in the UK Biobank, with replication in a further set of 25,486 non-British white individuals. We find that BMI polygenic risk factorizes into components related to fat-free mass, fat mass, and overall health indicators like physical activity. Most individuals with high dPRS for BMI have strong contributions from both a fat-mass component and a fat-free mass component, whereas a few "outlier" individuals have strong contributions from only one of the two components. Overall, our method enables fine-scale interpretation of the drivers of genetic risk for complex traits.
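
As a minimal illustration of the standard PRS that the abstract uses as its baseline, the sketch below computes a score as the effect-size-weighted sum of allele dosages. It does not implement dPRS or DeGAs; the function name, effect sizes, and dosages are hypothetical values chosen only for illustration.

```python
def polygenic_risk_score(dosages, effect_sizes):
    """Return the weighted sum of allele dosages for one individual.

    dosages:      risk-allele counts per variant (0, 1, or 2)
    effect_sizes: per-variant effect estimates (e.g. GWAS betas)
    """
    if len(dosages) != len(effect_sizes):
        raise ValueError("dosages and effect_sizes must have the same length")
    return sum(beta * dose for beta, dose in zip(effect_sizes, dosages))


# Hypothetical example: three variants for one individual.
effect_sizes = [0.5, -0.25, 1.0]
dosages = [2, 1, 0]
score = polygenic_risk_score(dosages, effect_sizes)
print(score)
```

A latent-factor score such as the one described in the abstract would replace the single weight vector with per-component weights, so that each component contributes a separately interpretable part of the total.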


Subjects
Genetic Association Studies , Genetic Predisposition to Disease , Multifactorial Inheritance , Quantitative Trait, Heritable , Algorithms , Biological Specimen Banks , Databases, Genetic , Genetic Association Studies/methods , Genome-Wide Association Study , Humans , Models, Genetic , Phenotype , Population Surveillance , Reproducibility of Results , Risk Assessment , Risk Factors , United Kingdom/epidemiology
5.
Nat Genet ; 53(2): 185-194, 2021 Feb.
Article in English | MEDLINE | ID: mdl-33462484

ABSTRACT

Clinical laboratory tests are a critical component of the continuum of care. We evaluate the genetic basis of 35 blood and urine laboratory measurements in the UK Biobank (n = 363,228 individuals). We identify 1,857 loci associated with at least one trait, containing 3,374 fine-mapped associations and additional sets of large-effect (>0.1 s.d.) protein-altering, human leukocyte antigen (HLA) and copy number variant (CNV) associations. Through Mendelian randomization (MR) analysis, we discover 51 causal relationships, including previously known agonistic effects of urate on gout and cystatin C on stroke. Finally, we develop polygenic risk scores (PRSs) for each biomarker and build 'multi-PRS' models for diseases using 35 PRSs simultaneously, which improved chronic kidney disease, type 2 diabetes, gout and alcoholic cirrhosis genetic risk stratification in an independent dataset (FinnGen; n = 135,500) relative to single-disease PRSs. Together, our results delineate the genetic basis of biomarkers and their causal influences on diseases and improve genetic risk stratification for common diseases.


Subjects
Biomarkers/blood , Biomarkers/urine , HLA Antigens/genetics , Proteins/genetics , Biological Specimen Banks , Cardiovascular Diseases/genetics , Cardiovascular Diseases/metabolism , DNA Copy Number Variations , Diabetes Mellitus, Type 2/genetics , Diabetes Mellitus, Type 2/metabolism , Genetic Pleiotropy , Humans , Linkage Disequilibrium , Liver-Specific Organic Anion Transporter 1/genetics , Mendelian Randomization Analysis , Polymorphism, Single Nucleotide , Renal Insufficiency, Chronic , Serine Endopeptidases/genetics , United Kingdom
6.
PLoS One ; 15(6): e0234647, 2020.
Article in English | MEDLINE | ID: mdl-32569327

ABSTRACT

Unstructured clinical narratives are continuously being recorded as part of delivery of care in electronic health records, and dedicated tagging staff spend considerable effort manually assigning clinical codes for billing purposes. Despite these efforts, however, label availability and accuracy are both suboptimal. In this retrospective study, we aimed to automate the assignment of top-level International Classification of Diseases version 9 (ICD-9) codes to clinical records from human and veterinary data stores using minimal manual labor and feature curation. Automating top-level annotations could in turn enable rapid cohort identification, especially in a veterinary setting. To this end, we trained long short-term memory (LSTM) recurrent neural networks (RNNs) on 52,722 human and 89,591 veterinary records. We investigated the accuracy of both separate-domain and combined-domain models and probed model portability. We established relevant baseline classification performances by training Decision Trees (DT) and Random Forests (RF). We also investigated whether transforming the data using MetaMap Lite, a clinical natural language processing tool, affected classification performance. We showed that the LSTM-RNNs accurately classify veterinary and human text narratives into top-level categories with an average weighted macro F1 score of 0.74 and 0.68 respectively. In the "neoplasia" category, the model trained on veterinary data had a high validation accuracy in veterinary data and moderate accuracy in human data, with F1 scores of 0.91 and 0.70 respectively. Our LSTM method scored slightly higher than that of the DT and RF models. The use of LSTM-RNN models represents a scalable structure that could prove useful in cohort identification for comparative oncology studies. Digitization of human and veterinary health information will continue to be a reality, particularly in the form of unstructured narratives. Our approach is a step forward for these two domains to learn from and inform one another.
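
The abstract reports an "average weighted macro F1 score" without defining it in this record. The sketch below shows one common reading, assumed here for illustration only: compute F1 per class, then average the per-class scores weighted by each class's number of true examples. The function names and the toy labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return a dict mapping each label to its F1 score."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def weighted_f1(y_true, y_pred):
    """Average per-class F1, weighted by the true count of each class."""
    support = Counter(y_true)
    scores = per_class_f1(y_true, y_pred)
    return sum(scores[label] * n for label, n in support.items()) / len(y_true)


# Hypothetical toy labels: four records, two top-level categories.
y_true = ["neoplasia", "neoplasia", "other", "other"]
y_pred = ["neoplasia", "other", "other", "other"]
print(per_class_f1(y_true, y_pred))
print(weighted_f1(y_true, y_pred))
```

Weighting by support keeps rare categories from dominating the average, whereas an unweighted macro average would give every category equal influence.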


Subjects
Data Mining , Narrative Medicine , Software , Animals , Automation , Databases as Topic , Humans , Reproducibility of Results , Species Specificity
7.
Pac Symp Biocomput ; 22: 521-532, 2017.
Article in English | MEDLINE | ID: mdl-27897003

ABSTRACT

Autism has been shown to have a major genetic risk component; the architecture of documented autism in families has been over and again shown to be passed down for generations. While inherited risk plays an important role in the autistic nature of children, de novo (germline) mutations have also been implicated in autism risk. Here we find that autism de novo variants verified and published in the literature are Bonferroni-significantly enriched in a gene set implicated in synaptic elimination. Additionally, several of the genes in this synaptic elimination set that were enriched in protein-protein interactions (CACNA1C, SHANK2, SYNGAP1, NLGN3, NRXN1, and PTEN) have been previously confirmed as genes that confer risk for the disorder. The results demonstrate that autism-associated de novos are linked to proper synaptic pruning and density, hinting at the etiology of autism and suggesting pathophysiology for downstream correction and treatment.
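
This record does not say which statistical test produced the Bonferroni-significant enrichment. As a hedged sketch of how such a gene-set enrichment is commonly assessed, the code below assumes a one-sided hypergeometric test followed by a Bonferroni adjustment for the number of gene sets tested. All function names and counts are hypothetical and are not values from the study.

```python
from math import comb


def hypergeom_upper_tail(k, population, set_size, draws):
    """P(X >= k) for X = number of drawn genes that fall in the gene set.

    population: total number of background genes
    set_size:   number of background genes in the gene set
    draws:      number of genes carrying de novo variants
    """
    upper = min(set_size, draws)
    favourable = sum(
        comb(set_size, i) * comb(population - set_size, draws - i)
        for i in range(k, upper + 1)
    )
    return favourable / comb(population, draws)


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Hypothetical counts: 10 background genes, 4 in the set, 3 hit genes, 2 overlap.
p = hypergeom_upper_tail(2, population=10, set_size=4, draws=3)
print(p, bonferroni(p, n_tests=5))
```

With realistic inputs the background would be on the order of all protein-coding genes, and `n_tests` would be the number of gene sets examined.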


Subjects
Autistic Disorder/genetics , Germ-Line Mutation , Autistic Disorder/pathology , Computational Biology , Databases, Genetic , Electrical Synapses/genetics , Electrical Synapses/pathology , Female , Gene Regulatory Networks , Genetic Predisposition to Disease , Humans , Male , Models, Genetic , Models, Neurological