3.
Nat Med ; 30(4): 1065-1074, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38443691

ABSTRACT

Type 2 diabetes (T2D) is a multifactorial disease with substantial genetic risk, for which the underlying biological mechanisms are not fully understood. In this study, we identified multi-ancestry T2D genetic clusters by analyzing genetic data from diverse populations in 37 published T2D genome-wide association studies representing more than 1.4 million individuals. We implemented soft clustering with 650 T2D-associated genetic variants and 110 T2D-related traits, capturing known and novel T2D clusters with distinct cardiometabolic trait associations across two independent biobanks representing diverse genetic ancestral populations (African, n = 21,906; Admixed American, n = 14,410; East Asian, n = 2,422; European, n = 90,093; and South Asian, n = 1,262). The 12 genetic clusters were enriched for specific single-cell regulatory regions. Several of the polygenic scores derived from the clusters differed in distribution among ancestry groups, including a significantly higher proportion of lipodystrophy-related polygenic risk in East Asian ancestry. T2D risk was equivalent at a body mass index (BMI) of 30 kg m⁻² in the European subpopulation and 24.2 (22.9-25.5) kg m⁻² in the East Asian subpopulation; after adjusting for cluster-specific genetic risk, the equivalent BMI threshold increased to 28.5 (27.1-30.0) kg m⁻² in the East Asian group. Thus, these multi-ancestry T2D genetic clusters encompass a broader range of biological mechanisms and provide preliminary insights to explain ancestry-associated differences in T2D risk profiles.


Subject(s)
Diabetes Mellitus, Type 2 , Humans , Diabetes Mellitus, Type 2/genetics , Genome-Wide Association Study , Risk Factors , Phenotype , Multifactorial Inheritance/genetics , Genetic Predisposition to Disease/genetics
4.
Res Sq ; 2023 Oct 09.
Article in English | MEDLINE | ID: mdl-37886436

ABSTRACT

We identified genetic subtypes of type 2 diabetes (T2D) by analyzing genetic data from diverse groups, including non-European populations. We implemented soft clustering with 650 T2D-associated genetic variants, capturing known and novel T2D subtypes with distinct cardiometabolic trait associations. The twelve genetic clusters were distinctively enriched for single-cell regulatory regions. Polygenic scores derived from the clusters differed in distribution between ancestry groups, including a significantly higher proportion of lipodystrophy-related polygenic risk in East Asian ancestry. T2D risk was equivalent at a BMI of 30 kg/m² in the European subpopulation and 24.2 (22.9-25.5) kg/m² in the East Asian subpopulation; after adjusting for cluster-specific genetic risk, the equivalent BMI threshold increased to 28.5 (27.1-30.0) kg/m² in the East Asian group, explaining about 75% of the difference in BMI thresholds. Thus, these multi-ancestry T2D genetic subtypes encompass a broader range of biological mechanisms and help explain ancestry-associated differences in T2D risk profiles.
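
The "about 75%" figure can be checked against the thresholds quoted in the abstract. The sketch below assumes the proportion explained is simply the shift in the East Asian threshold after adjustment, divided by the original gap to the European reference of 30; the function name is illustrative and does not come from the paper.

```python
def fraction_of_gap_explained(reference, unadjusted, adjusted):
    """Share of the gap between `reference` and `unadjusted` that `adjusted` closes."""
    return (adjusted - unadjusted) / (reference - unadjusted)


# BMI thresholds quoted in the abstract: European reference,
# East Asian before adjustment, East Asian after adjustment.
share = fraction_of_gap_explained(reference=30.0, unadjusted=24.2, adjusted=28.5)
print(f"{share:.0%}")  # prints 74%
```

Under this reading, 4.3 of the 5.8 BMI units separating the two thresholds are accounted for, which is consistent with the rounded figure in the abstract.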

5.
medRxiv ; 2023 Sep 29.
Article in English | MEDLINE | ID: mdl-37808749

ABSTRACT

We identified genetic subtypes of type 2 diabetes (T2D) by analyzing genetic data from diverse groups, including non-European populations. We implemented soft clustering with 650 T2D-associated genetic variants, capturing known and novel T2D subtypes with distinct cardiometabolic trait associations. The twelve genetic clusters were distinctively enriched for single-cell regulatory regions. Polygenic scores derived from the clusters differed in distribution between ancestry groups, including a significantly higher proportion of lipodystrophy-related polygenic risk in East Asian ancestry. T2D risk was equivalent at a BMI of 30 kg/m² in the European subpopulation and 24.2 (22.9-25.5) kg/m² in the East Asian subpopulation; after adjusting for cluster-specific genetic risk, the equivalent BMI threshold increased to 28.5 (27.1-30.0) kg/m² in the East Asian group, explaining about 75% of the difference in BMI thresholds. Thus, these multi-ancestry T2D genetic subtypes encompass a broader range of biological mechanisms and help explain ancestry-associated differences in T2D risk profiles.

6.
Diabetes Care ; 46(4): 794-800, 2023 04 01.
Article in English | MEDLINE | ID: mdl-36745605

ABSTRACT

OBJECTIVE: Automated algorithms to identify individuals with type 1 diabetes using electronic health records are increasingly used in biomedical research. It is not known whether the accuracy of these algorithms differs by self-reported race. We investigated whether polygenic scores improve identification of individuals with type 1 diabetes. RESEARCH DESIGN AND METHODS: We investigated two large hospital-based biobanks (Mass General Brigham [MGB] and BioMe) and identified individuals with type 1 diabetes using an established automated algorithm. We performed medical record reviews to validate the diagnosis of type 1 diabetes. We implemented two published polygenic scores for type 1 diabetes (developed in individuals of European or African ancestry). We assessed the classification algorithm before and after incorporating polygenic scores. RESULTS: The automated algorithm was more likely to incorrectly assign a diagnosis of type 1 diabetes in self-reported non-White individuals than in self-reported White individuals (odds ratio 3.45; 95% CI 1.54-7.69; P = 0.0026). After incorporating polygenic scores into the MGB Biobank, the positive predictive value of the type 1 diabetes algorithm increased from 70 to 97% for self-reported White individuals (meaning that 97% of those predicted to have type 1 diabetes indeed had type 1 diabetes) and from 53 to 100% for self-reported non-White individuals. Similar results were found in BioMe. CONCLUSIONS: Automated phenotyping algorithms may exacerbate health disparities because of an increased risk of misclassification of individuals from underrepresented populations. Polygenic scores may be used to improve the performance of phenotyping algorithms and potentially reduce this disparity.
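
The positive predictive value reported above is the share of algorithm-flagged individuals whose diagnosis was confirmed on chart review. A minimal sketch of that calculation follows; the counts are hypothetical, chosen only for illustration, and are not data from the study.

```python
def positive_predictive_value(true_positives, false_positives):
    """PPV = confirmed cases / all individuals flagged by the algorithm."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("no individuals were flagged by the algorithm")
    return true_positives / flagged


# Hypothetical chart-review counts (NOT taken from the paper).
ppv_algorithm_only = positive_predictive_value(true_positives=35, false_positives=15)
ppv_with_score = positive_predictive_value(true_positives=33, false_positives=1)

print(f"algorithm only: {ppv_algorithm_only:.0%}")
print(f"with polygenic score: {ppv_with_score:.0%}")
```

In this toy example the added filter removes most false positives at the cost of a few true cases, which is the usual trade-off when a second criterion is layered on a phenotyping algorithm.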


Subject(s)
Algorithms , Diabetes Mellitus, Type 1 , Multifactorial Inheritance , Humans , Diabetes Mellitus, Type 1/diagnosis , Diabetes Mellitus, Type 1/ethnology , Diabetes Mellitus, Type 1/genetics , Electronic Health Records , Predictive Value of Tests
7.
Diabetes Care ; 46(5): 944-952, 2023 05 01.
Article in English | MEDLINE | ID: mdl-36787958

ABSTRACT

OBJECTIVE: Quantify the impact of genetic and socioeconomic factors on risk of type 2 diabetes (T2D) and obesity. RESEARCH DESIGN AND METHODS: Among participants in the Mass General Brigham Biobank (MGBB) and UK Biobank (UKB), we used logistic regression models to calculate cross-sectional odds of T2D and obesity using 1) polygenic risk scores for T2D and BMI and 2) area-level socioeconomic risk (educational attainment) measures. The primary analysis included 26,737 participants of European genetic ancestry in MGBB with replication in UKB (N = 223,843), as well as in participants of non-European ancestry (MGBB N = 3,468; UKB N = 7,459). RESULTS: The area-level socioeconomic measure most strongly associated with both T2D and obesity was percent without a college degree, and associations with disease prevalence were independent of genetic risk (P < 0.001 for each). Moving from lowest to highest quintiles of combined genetic and socioeconomic burden more than tripled T2D (3.1% to 22.2%) and obesity (20.9% to 69.0%) prevalence. Favorable socioeconomic risk was associated with lower disease prevalence, even in those with highest genetic risk (T2D 13.0% vs. 22.2%, obesity 53.6% vs. 69.0% in lowest vs. highest socioeconomic risk quintiles). Additive effects of genetic and socioeconomic factors accounted for 13.2% and 16.7% of T2D and obesity prevalence, respectively, explained by these models. Findings were replicated in independent European and non-European ancestral populations. CONCLUSIONS: Genetic and socioeconomic factors significantly interact to increase risk of T2D and obesity. Favorable area-level socioeconomic status was associated with an almost 50% lower T2D prevalence in those with high genetic risk.


Subject(s)
Diabetes Mellitus, Type 2 , Humans , Diabetes Mellitus, Type 2/epidemiology , Diabetes Mellitus, Type 2/genetics , Prevalence , Cross-Sectional Studies , Genetic Predisposition to Disease , Obesity/epidemiology , Obesity/genetics , Obesity/complications , Risk Factors , Socioeconomic Factors
8.
Diabetologia ; 66(3): 495-507, 2023 03.
Article in English | MEDLINE | ID: mdl-36538063

ABSTRACT

AIMS/HYPOTHESIS: Type 2 diabetes is highly polygenic and influenced by multiple biological pathways. Rapid expansion in the number of type 2 diabetes loci can be leveraged to identify such pathways. METHODS: We developed a high-throughput pipeline to enable clustering of type 2 diabetes loci based on variant-trait associations. Our pipeline extracted summary statistics from genome-wide association studies (GWAS) for type 2 diabetes and related traits to generate a matrix of 323 variants × 64 trait associations and applied Bayesian non-negative matrix factorisation (bNMF) to identify genetic components of type 2 diabetes. Epigenomic enrichment analysis was performed in 28 cell types and single pancreatic cells. We generated cluster-specific polygenic scores and performed regression analysis in an independent cohort (N=25,419) to assess for clinical relevance. RESULTS: We identified ten clusters of genetic loci, recapturing the five from our prior analysis as well as novel clusters related to beta cell dysfunction, pronounced insulin secretion, and levels of alkaline phosphatase, lipoprotein A and sex hormone-binding globulin. Four clusters related to mechanisms of insulin deficiency, five to insulin resistance and one had an unclear mechanism. The clusters displayed tissue-specific epigenomic enrichment, notably with the two beta cell clusters differentially enriched in functional and stressed pancreatic beta cell states. Additionally, cluster-specific polygenic scores were differentially associated with patient clinical characteristics and outcomes. The pipeline was applied to coronary artery disease and chronic kidney disease, identifying multiple overlapping clusters with type 2 diabetes. CONCLUSIONS/INTERPRETATION: Our approach stratifies type 2 diabetes loci into physiologically interpretable genetic clusters associated with distinct tissues and clinical outcomes. 
The pipeline allows for efficient updating as additional GWAS become available and can be readily applied to other conditions, facilitating clinical translation of GWAS findings. Software to perform this clustering pipeline is freely available.
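
The clustering step described above can be illustrated with ordinary non-negative matrix factorisation. The sketch below is not the authors' software: it swaps in plain NMF with multiplicative updates for the Bayesian variant (bNMF), fixes the number of clusters by hand, and assumes signed associations are made non-negative by giving each trait a "positive" and a "negative" column. The toy matrix and all function names are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def split_signed(z):
    """Make a signed variants x traits matrix non-negative by doubling the traits."""
    return [[max(v, 0.0) for v in row] + [max(-v, 0.0) for v in row] for row in z]


def frobenius_error(v, w, h):
    """Squared reconstruction error ||V - WH||^2."""
    wh = matmul(w, h)
    return sum((a - b) ** 2 for rv, rw in zip(v, wh) for a, b in zip(rv, rw))


def nmf(v, k, iterations=200, seed=0, eps=1e-9):
    """Factorise V (n x m) into W (n x k) and H (k x m) with multiplicative updates."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() for _ in range(k)] for _ in range(n)]
    h = [[rng.random() for _ in range(m)] for _ in range(k)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)] for i in range(k)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)] for i in range(n)]
    return w, h


# Hypothetical z-scores: 4 variants (rows) x 3 traits (columns).
toy_z = [
    [3.0, -2.0, 0.1],
    [2.5, -1.8, 0.0],
    [-0.2, 0.1, 4.0],
    [0.0, 0.3, 3.5],
]
v = split_signed(toy_z)
w, h = nmf(v, k=2)

# "Soft" clustering: each variant keeps a weight in every cluster (a row of W).
for variant_index, weights in enumerate(w):
    print(variant_index, [round(x, 2) for x in weights])
```

Each row of `w` gives a variant's weight in every cluster, which is what makes the clustering soft; each row of `h` gives the trait profile that characterises a cluster.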


Subject(s)
Diabetes Mellitus, Type 2 , Humans , Diabetes Mellitus, Type 2/genetics , Genome-Wide Association Study , Genetic Predisposition to Disease/genetics , Bayes Theorem , Cluster Analysis , Polymorphism, Single Nucleotide
9.
Front Genet ; 13: 954713, 2022.
Article in English | MEDLINE | ID: mdl-36544485

ABSTRACT

Though both genetic and lifestyle factors are known to influence cardiometabolic outcomes, less attention has been given to whether lifestyle exposures can alter the association between a genetic variant and these outcomes. The Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium's Gene-Lifestyle Interactions Working Group has recently published investigations of genome-wide gene-environment interactions in large multi-ancestry meta-analyses with a focus on cigarette smoking and alcohol consumption as lifestyle factors and blood pressure and serum lipids as outcomes. Further description of the biological mechanisms underlying these statistical interactions would represent a significant advance in our understanding of gene-environment interactions, yet accessing and harmonizing individual-level genetic and 'omics data is challenging. Here, we demonstrate the coordinated use of summary-level data for gene-lifestyle interaction associations on up to 600,000 individuals, differential methylation data, and gene expression data for the characterization and prioritization of loci for future follow-up analyses. Using this approach, we identify 48 genes for which there are multiple sources of functional support for the identified gene-lifestyle interaction. We also identified five genes for which differential expression was observed by the same lifestyle factor for which a gene-lifestyle interaction was found. For instance, in gene-lifestyle interaction analysis, the T allele of rs6490056 (ALDH2) was associated with higher systolic blood pressure, and a larger effect was observed in smokers compared to non-smokers. In gene expression studies, this allele is associated with decreased expression of ALDH2, which is part of a major oxidative pathway. Other results show increased expression of ALDH2 among smokers. Oxidative stress is known to contribute to worsening blood pressure. 
Together these data support the hypothesis that rs6490056 reduces expression of ALDH2, which raises oxidative stress, leading to an increase in blood pressure, with a stronger effect among smokers, in whom the burden of oxidative stress is greater. Other genes for which the aggregation of data types suggest a potential mechanism include: GCNT4×current smoking (HDL), PTPRZ1×ever-smoking (HDL), SYN2×current smoking (pulse pressure), and TMEM116×ever-smoking (mean arterial pressure). This work demonstrates the utility of careful curation of summary-level data from a variety of sources to prioritize gene-lifestyle interaction loci for follow-up analyses.

10.
Commun Biol ; 5(1): 756, 2022 07 28.
Article in English | MEDLINE | ID: mdl-35902682

ABSTRACT

The genetic determinants of fasting glucose (FG) and fasting insulin (FI) have been studied mostly through genome arrays, resulting in over 100 associated variants. We extended this work with high-coverage whole genome sequencing analyses from fifteen cohorts in NHLBI's Trans-Omics for Precision Medicine (TOPMed) program. Over 23,000 non-diabetic individuals from five race-ethnicities/populations (African, Asian, European, Hispanic and Samoan) were included. Eight variants were significantly associated with FG or FI across previously identified regions MTNR1B, G6PC2, GCK, GCKR and FOXA2. We additionally characterize suggestive associations with FG or FI near previously identified SLC30A8, TCF7L2, and ADCY5 regions as well as APOB, PTPRT, and ROBO1. Functional annotation resources including the Diabetes Epigenome Atlas were compiled for each signal (chromatin states, annotation principal components, and others) to elucidate variant-to-function hypotheses. We provide a catalog of nucleotide-resolution genomic variation spanning intergenic and intronic regions creating a foundation for future sequencing-based investigations of glycemic traits.


Subject(s)
Diabetes Mellitus, Type 2 , Fasting , Diabetes Mellitus, Type 2/genetics , Glucose , Humans , Insulin/genetics , National Heart, Lung, and Blood Institute (U.S.) , Nerve Tissue Proteins/genetics , Polymorphism, Single Nucleotide , Precision Medicine , Receptors, Immunologic/genetics , United States
11.
Nat Commun ; 13(1): 3993, 2022 07 09.
Article in English | MEDLINE | ID: mdl-35810165

ABSTRACT

Gene-environment interactions represent the modification of genetic effects by environmental exposures and are critical for understanding disease and informing personalized medicine. These often induce differential phenotypic variance across genotypes; these variance-quantitative trait loci can be prioritized in a two-stage interaction detection strategy to greatly reduce the computational and statistical burden and enable testing of a broader range of exposures. We perform genome-wide variance-quantitative trait locus analysis for 20 serum cardiometabolic biomarkers by multi-ancestry meta-analysis of 350,016 unrelated participants in the UK Biobank, identifying 182 independent locus-biomarker pairs (p < 4.5×10⁻⁹). Most are concentrated in a small subset (4%) of loci with genome-wide significant main effects, and 44% replicate (p < 0.05) in the Women's Genome Health Study (N = 23,294). Next, we test each locus-biomarker pair for interaction across 2380 exposures, identifying 847 significant interactions (p < 2.4×10⁻⁷), of which 132 are independent (p < 0.05) after accounting for correlation between exposures. Specific examples demonstrate interaction of triglyceride-associated variants with distinct body mass- versus body fat-related exposures as well as genotype-specific associations between alcohol consumption and liver stress at the ADH1B gene. Our catalog of variance-quantitative trait loci and gene-environment interactions is publicly available in an online portal.


Subject(s)
Cardiovascular Diseases , Quantitative Trait Loci , Biomarkers , Cardiovascular Diseases/genetics , Female , Gene-Environment Interaction , Genome-Wide Association Study , Genotype , Humans , Phenotype , Polymorphism, Single Nucleotide , Quantitative Trait Loci/genetics
12.
Eur J Hum Genet ; 30(6): 730-739, 2022 06.
Article in English | MEDLINE | ID: mdl-35314805

ABSTRACT

The role and biological significance of gene-environment interactions in human traits and diseases remain poorly understood. To address these questions, the CHARGE Gene-Lifestyle Interactions Working Group conducted a series of genome-wide interaction studies (GWIS) involving up to 610,475 individuals across four ancestries for three lipids and four blood pressure traits, while accounting for interaction effects with drinking and smoking exposures. Here we used GWIS summary statistics from these studies to decipher potential differences in genetic associations and G×E interactions across phenotype-exposure-ancestry combinations, and to derive insights on the potential mechanisms underlying G×E through in-silico functional analyses. Our analyses show first that interaction effects likely contribute to the commonly reported ancestry-specific genetic effect in complex traits, and second, that some phenotype-exposure pairs are more likely to benefit from a greater detection power when accounting for interactions. They also highlight a modest correlation between marginal and interaction effects, providing material for future methodological development and biological discussions. We also estimated contributions to phenotypic variance, including in particular the genetic heritability conditional on the exposure, and heritability partitioned across a range of functional annotations and cell types. In these analyses, we found multiple instances of potential heterogeneity of functional partitions between exposed and unexposed individuals, providing new evidence for likely exposure-specific genetic pathways. Finally, in the course of this work, we identified potential biases in methods used to jointly meta-analyze genetic and interaction effects. We performed simulations to characterize these limitations and to provide the community with guidelines for future G×E studies.


Subject(s)
Gene-Environment Interaction , Multifactorial Inheritance , Epistasis, Genetic , Genome-Wide Association Study , Genomics , Humans , Life Style , Phenotype
13.
BMC Bioinformatics ; 21(1): 251, 2020 Jun 18.
Article in English | MEDLINE | ID: mdl-32552674

ABSTRACT

BACKGROUND: Models including an interaction term and performing a joint test of SNP and/or interaction effect are often used to discover Gene-Environment (GxE) interactions. When the environmental exposure is a binary variable, analyses from exposure-stratified models, which consist of estimating the genetic effect in unexposed and exposed individuals separately, can be of interest. In large-scale consortia focusing on GxE interactions in which only the joint test has been performed, it may be challenging to get summary statistics from both exposure-stratified and marginal (i.e., not accounting for interaction) models. RESULTS: In this work, we developed a simple framework to estimate summary statistics in each stratum of a binary exposure and in the marginal model using summary statistics from the "joint" model. We performed simulation studies to assess our estimators' accuracy and examined potential sources of bias, such as correlation between genotype and exposure and differing phenotypic variances within exposure strata. Results from these simulations highlight the high theoretical accuracy of our estimators and yield insights into the impact of potential sources of bias. We then applied our methods to real data and demonstrate our estimators' retained accuracy after filtering SNPs by sample size to mitigate potential bias. CONCLUSIONS: These analyses demonstrated the accuracy of our method in estimating both stratified and marginal summary statistics from a joint model of gene-environment interaction. In addition to facilitating the interpretation of GxE screenings, this work could be used to guide further functional analyses. We provide a user-friendly Python script to apply this strategy to real datasets. The Python script and documentation are available at https://gitlab.pasteur.fr/statistical-genetics/j2s.
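
The relationship between joint-model and stratified estimates can be sketched for the simplest case. The code below is not the j2s script: it assumes a linear joint model Y = b0 + bG·G + bE·E + bGxE·G·E with a binary exposure E, and it assumes the covariance between the two genetic coefficients is available, which consortium summary statistics often do not report. All names and numbers are hypothetical.

```python
import math


def stratified_effects(beta_g, beta_gxe, se_g, se_gxe, cov_g_gxe):
    """Genetic effect in unexposed (E=0) and exposed (E=1) strata from a joint model."""
    # With E = 0 the interaction term vanishes, so the genetic effect is beta_g.
    beta_unexposed, se_unexposed = beta_g, se_g
    # With E = 1 the genetic effect is the sum of both coefficients.
    beta_exposed = beta_g + beta_gxe
    var_exposed = se_g ** 2 + se_gxe ** 2 + 2.0 * cov_g_gxe
    if var_exposed < 0:
        raise ValueError("inputs imply a negative variance in the exposed stratum")
    return {
        "unexposed": (beta_unexposed, se_unexposed),
        "exposed": (beta_exposed, math.sqrt(var_exposed)),
    }


# Hypothetical joint-model summary statistics for one SNP.
result = stratified_effects(
    beta_g=0.10, beta_gxe=0.05, se_g=0.02, se_gxe=0.03, cov_g_gxe=-0.0004
)
for stratum, (beta, se) in result.items():
    print(f"{stratum}: beta={beta:.3f}, se={se:.4f}")
```

The covariance term matters: the main and interaction coefficients are typically negatively correlated, so dropping it would overstate the standard error in the exposed stratum.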


Subject(s)
Gene-Environment Interaction , Joints/physiology , Humans , Models, Genetic
14.
Methods Mol Biol ; 1945: 251-264, 2019.
Article in English | MEDLINE | ID: mdl-30945250

ABSTRACT

This chapter describes the procedures necessary to create generative models of the spatial organization of cells directly from microscope images and use them to automatically provide geometries for spatial simulations of cell processes and behaviors. Such models capture the statistical variation in the overall cell architecture as well as the number, shape, size, and spatial distribution of organelles and other structures. The different steps described include preparing images, learning models, evaluating model quality, creating sampled cell geometries by various methods, and combining those geometries with biochemical model specifications to enable simulations.


Subject(s)
Cells/ultrastructure , Image Processing, Computer-Assisted/methods , Microscopy, Fluorescence/methods , Computer Simulation , Humans , Models, Biological , Organelles/ultrastructure
15.
PLoS Comput Biol ; 15(1): e1006199, 2019 01.
Article in English | MEDLINE | ID: mdl-30689627

ABSTRACT

Within influenza virus-infected cells, viral genomic RNAs are selectively packaged into progeny virions, which predominantly contain a single copy of each of the 8 viral RNA segments. Intersegmental RNA-RNA interactions are thought to mediate selective packaging of each viral ribonucleoprotein complex (vRNP). Clear evidence of a specific interaction network culminating in the full genomic set has yet to be identified. Using multi-color fluorescence in situ hybridization to visualize four vRNP segments within a single cell, we developed image-based models of vRNP-vRNP spatial dependence. These models were used to construct likely sequences of vRNP associations resulting in the full genomic set. Our results support the notion that selective packaging occurs during cytoplasmic transport and identify the formation of multiple distinct vRNP sub-complexes that likely form as intermediate steps toward full genomic inclusion into a progeny virion. The methods employed demonstrate a statistically driven, model-based approach applicable to other interaction and assembly problems.


Subject(s)
Genome, Viral/genetics , Influenza A virus/genetics , Virus Replication/genetics , Animals , Computational Biology , Dogs , In Situ Hybridization, Fluorescence , Influenza A virus/pathogenicity , Influenza A virus/physiology , Madin Darby Canine Kidney Cells , Models, Genetic , RNA, Viral/genetics , Virion/genetics
16.
Cytometry A ; 91(4): 326-335, 2017 04.
Article in English | MEDLINE | ID: mdl-28245335

ABSTRACT

Quantitative image analysis procedures are necessary for the automated discovery of effects of drug treatment in large collections of fluorescent micrographs. When compared to their mammalian counterparts, the effects of drug conditions on protein localization in plant species are poorly understood and underexplored. To investigate this relationship, we generated a large collection of images of single plant cells after various drug treatments. For this, protoplasts were isolated from six transgenic lines of A. thaliana expressing fluorescently tagged proteins. Eight drugs at three concentrations were applied to protoplast cultures followed by automated image acquisition. For image analysis, we developed a cell segmentation protocol for detecting drug effects using a Hough transform-based region of interest detector and a novel cross-channel texture feature descriptor. In order to determine treatment effects, we summarized differences between treated and untreated experiments with an L1 Cramér-von Mises statistic. The distribution of these statistics across all pairs of treated and untreated replicates was compared to the variation within control replicates to determine the statistical significance of observed effects. Using this pipeline, we report the dose dependent drug effects in the first high-content Arabidopsis thaliana drug screen of its kind. These results can function as a baseline for comparison to other protein organization modeling approaches in plant cells. © 2017 International Society for Advancement of Cytometry.


Subject(s)
Arabidopsis , Image Processing, Computer-Assisted/methods , Protoplasts , Arabidopsis/drug effects , Phenotype , Plants, Genetically Modified , Protoplasts/drug effects
17.
Cytometry A ; 89(7): 633-43, 2016 07.
Article in English | MEDLINE | ID: mdl-27327612

ABSTRACT

Accurate representations of cellular organization for multiple eukaryotic cell types are required for creating predictive models of dynamic cellular function. To this end, we have previously developed the CellOrganizer platform, an open source system for generative modeling of cellular components from microscopy images. CellOrganizer models capture the inherent heterogeneity in the spatial distribution, size, and quantity of different components among a cell population. Furthermore, CellOrganizer can generate quantitatively realistic synthetic images that reflect the underlying cell population. A current focus of the project is to model the complex, interdependent nature of organelle localization. We built upon previous work on developing multiple non-parametric models of organelles or structures that show punctate patterns. The previous models described the relationships between the subcellular localization of puncta and the positions of cell and nuclear membranes and microtubules. We extend these models to consider the relationship to the endoplasmic reticulum (ER), and to consider the relationship between the positions of different puncta of the same type. Our results do not suggest that the punctate patterns we examined are dependent on ER position or inter- and intra-class proximity. With these results, we built classifiers to update previous assignments of proteins to one of 11 patterns in three distinct cell lines. Our generative models demonstrate the ability to construct statistically accurate representations of puncta localization from simple cellular markers in distinct cell types, capturing the complex phenomena of cellular structure interaction with little human input. This protocol represents a novel approach to vesicular protein annotation, a field that is often neglected in high-throughput microscopy. These results suggest that spatial point process models provide useful insight with respect to the spatial dependence between cellular structures. 
© 2016 International Society for Advancement of Cytometry.


Subject(s)
Cells/ultrastructure , Image Processing, Computer-Assisted/methods , Models, Theoretical , Animals , Humans , Models, Biological