Search | VHL Regional Portal

Characterizing substructure via mixture modeling in large-scale genetic summary statistics.

Stoneman, Hayley R; Price, Adelle; Trout, Nikole Scribner; Lamont, Riley; Tifour, Souha; Pozdeyev, Nikita; Crooks, Kristy; Lin, Meng; Rafaels, Nicholas; Gignoux, Christopher R; Marker, Katie M; Hendricks, Audrey E.

bioRxiv ; 2024 May 13.

Article in English | MEDLINE | ID: mdl-38766180

ABSTRACT

Genetic summary data are broadly accessible and highly useful including for risk prediction, causal inference, fine mapping, and incorporation of external controls. However, collapsing individual-level data into groups masks intra- and inter-sample heterogeneity, leading to confounding, reduced power, and bias. Ultimately, unaccounted substructure limits summary data usability, especially for understudied or admixed populations. Here, we present Summix2, a comprehensive set of methods and software based on a computationally efficient mixture model to estimate and adjust for substructure in genetic summary data. In extensive simulations and application to public data, Summix2 characterizes finer-scale population structure, identifies ascertainment bias, and identifies potential regions of selection due to local substructure deviation. Summix2 increases the robust use of diverse publicly available summary data resulting in improved and more equitable research.

The importance of studying genetic ancestry in eosinophilic esophagitis.

Marker, Katie M; Mathias, Rasika A; Gignoux, Christopher R.

J Allergy Clin Immunol ; 151(5): 1244-1245, 2023 05.

Article in English | MEDLINE | ID: mdl-36963618

Subject(s)

Enteritis , Eosinophilic Esophagitis , Gastritis , Humans , Eosinophilic Esophagitis/genetics

COVID-19 Mortality in the Colorado Center for Personalized Medicine Biobank.

Brice, Amanda N; Vanderlinden, Lauren A; Marker, Katie M; Mayer, David; Lin, Meng; Rafaels, Nicholas; Shortt, Jonathan A; Romero, Alex; Lowery, Jan T; Gignoux, Christopher R; Johnson, Randi K.

Int J Environ Res Public Health ; 20(3), 2023 01 29.

Article in English | MEDLINE | ID: mdl-36767733

ABSTRACT

Over 6.37 million people have died from COVID-19 worldwide, but factors influencing COVID-19-related mortality remain understudied. We aimed to describe and identify risk factors for COVID-19 mortality in the Colorado Center for Personalized Medicine (CCPM) Biobank using integrated data sources, including Electronic Health Records (EHRs). We calculated cause-specific mortality and case-fatality rates for COVID-19 and common pre-existing health conditions defined by diagnostic phecodes and encounters in EHRs. We performed multivariable logistic regression analyses of the association between each pre-existing condition and COVID-19 mortality. Of the 155,859 Biobank participants enrolled as of July 2022, 20,797 had been diagnosed with COVID-19. Of 5334 Biobank participants who had died, 190 were attributed to COVID-19. The case-fatality rate was 0.91% and the COVID-19 mortality rate was 122 per 100,000 persons. The odds of dying from COVID-19 were significantly increased among older men, and those with 14 of the 61 pre-existing conditions tested, including hypertensive chronic kidney disease (OR: 10.14, 95% CI: 5.48, 19.16) and type 2 diabetes with renal manifestations (OR: 5.59, 95% CI: 3.42, 8.97). Male patients who are older and have pre-existing kidney diseases may be at higher risk for death from COVID-19 and may require special care.
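The two headline rates in this abstract follow directly from the reported counts. The sketch below only reproduces that arithmetic; the function and variable names are illustrative and do not come from the paper.

```python
def case_fatality_rate(deaths: int, cases: int) -> float:
    """Deaths among diagnosed cases, expressed as a percentage."""
    return 100.0 * deaths / cases


def mortality_rate_per_100k(deaths: int, population: int) -> float:
    """Deaths per 100,000 persons in the whole cohort."""
    return 100_000.0 * deaths / population


# Counts as reported in the abstract.
COVID_DEATHS = 190        # deaths attributed to COVID-19
COVID_CASES = 20_797      # participants diagnosed with COVID-19
PARTICIPANTS = 155_859    # Biobank participants enrolled as of July 2022

cfr = case_fatality_rate(COVID_DEATHS, COVID_CASES)
rate = mortality_rate_per_100k(COVID_DEATHS, PARTICIPANTS)

print(f"case-fatality rate: {cfr:.2f}%")
print(f"mortality rate: {rate:.0f} per 100,000")
```

The case-fatality rate uses diagnosed cases as its denominator, while the mortality rate uses the full cohort, which is why the two figures differ by roughly an order of magnitude.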

Subject(s)

COVID-19 , Diabetes Mellitus, Type 2 , Humans , Male , Aged , Diabetes Mellitus, Type 2/epidemiology , SARS-CoV-2 , Colorado/epidemiology , Biological Specimen Banks , Precision Medicine , Risk Factors

COVID-19 Surveillance in the Biobank at the Colorado Center for Personalized Medicine: Observational Study.

Johnson, Randi K; Marker, Katie M; Mayer, David; Shortt, Jonathan; Kao, David; Barnes, Kathleen C; Lowery, Jan T; Gignoux, Christopher R.

JMIR Public Health Surveill ; 8(6): e37327, 2022 06 13.

Article in English | MEDLINE | ID: mdl-35486493

ABSTRACT

BACKGROUND: Characterizing the experience and impact of the COVID-19 pandemic among various populations remains challenging due to the limitations inherent in common data sources, such as electronic health records (EHRs) or cross-sectional surveys. OBJECTIVE: This study aims to describe testing behaviors, symptoms, impact, vaccination status, and case ascertainment during the COVID-19 pandemic using integrated data sources. METHODS: In summer 2020 and 2021, we surveyed participants enrolled in the Biobank at the Colorado Center for Personalized Medicine (CCPM; N=180,599) about their experience with COVID-19. The prevalence of testing, symptoms, and impacts of COVID-19 on employment, family life, and physical and mental health were calculated overall and by demographic categories. Survey respondents who reported receiving a positive COVID-19 test result were considered a "confirmed case" of COVID-19. Using EHRs, we compared COVID-19 case ascertainment and characteristics in EHRs versus the survey. Positive cases were identified in EHRs using the International Statistical Classification of Diseases, 10th revision (ICD-10) diagnosis codes, health care encounter types, and encounter primary diagnoses. RESULTS: Of the 25,063 (13.9%) survey respondents, 10,661 (42.5%) had been tested for COVID-19, and of those, 1366 (12.8%) tested positive. Nearly half of those tested had symptoms or had been exposed to someone who was infected. Young adults (18-29 years) and Hispanics were more likely to have positive tests compared to older adults and persons of other racial/ethnic groups. Mental health (n=13,688, 54.6%) and family life (n=12,233, 48.8%) were most negatively affected by the pandemic and more so among younger groups and women; negative impacts on employment were more commonly reported among Black respondents. Of the 10,249 individuals who responded to vaccination questions from version 2 of the survey (summer 2021), 9770 (95.3%) had received the vaccine. 
After integration with EHR data up to the time of the survey completion, 1006 (4%) of the survey respondents had a discordant COVID-19 case status between EHRs and the survey. Using all longitudinal EHR and survey data, we identified 11,472 (6.4%) COVID-19-positive cases among Biobank participants. In comparison to COVID-19 cases identified through the survey, EHR-identified cases were younger and more likely to be Hispanic. CONCLUSIONS: We found that the COVID-19 pandemic has had far-reaching and varying effects among our Biobank participants. Integrated data assets, such as the Biobank at the CCPM, are key resources for population health monitoring in response to public health emergencies, such as the COVID-19 pandemic.

Subject(s)

COVID-19 , Aged , Biological Specimen Banks , COVID-19/epidemiology , Colorado/epidemiology , Cross-Sectional Studies , Female , Humans , Pandemics , Precision Medicine , Young Adult

Summix: A method for detecting and adjusting for population structure in genetic summary data.

Arriaga-MacKenzie, Ian S; Matesi, Gregory; Chen, Samuel; Ronco, Alexandria; Marker, Katie M; Hall, Jordan R; Scherenberg, Ryan; Khajeh-Sharafabadi, Mobin; Wu, Yinfei; Gignoux, Christopher R; Null, Megan; Hendricks, Audrey E.

Am J Hum Genet ; 108(7): 1270-1282, 2021 07 01.

Article in English | MEDLINE | ID: mdl-34157305

ABSTRACT

Publicly available genetic summary data have high utility in research and the clinic, including prioritizing putative causal variants, polygenic scoring, and leveraging common controls. However, summarizing individual-level data can mask population structure, resulting in confounding, reduced power, and incorrect prioritization of putative causal variants. This limits the utility of publicly available data, especially for understudied or admixed populations where additional research and resources are most needed. Although several methods exist to estimate ancestry in individual-level data, methods to estimate ancestry proportions in summary data are lacking. Here, we present Summix, a method to efficiently deconvolute ancestry and provide ancestry-adjusted allele frequencies (AFs) from summary data. Using continental reference ancestry, African (AFR), non-Finnish European (EUR), East Asian (EAS), Indigenous American (IAM), South Asian (SAS), we obtain accurate and precise estimates (within 0.1%) for all simulation scenarios. We apply Summix to gnomAD v.2.1 exome and genome groups and subgroups, finding heterogeneous continental ancestry for several groups, including African/African American (~84% AFR, ~14% EUR) and American/Latinx (~4% AFR, ~5% EAS, ~43% EUR, ~46% IAM). Compared to the unadjusted gnomAD AFs, Summix's ancestry-adjusted AFs more closely match respective African and Latinx reference samples. Even on modern, dense panels of summary statistics, Summix yields results in seconds, allowing for estimation of confidence intervals via block bootstrap. Given an accompanying R package, Summix increases the utility and equity of public genetic resources, empowering novel research opportunities.
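The idea of estimating ancestry proportions from summary allele frequencies can be illustrated with a small mixture fit. The sketch below is not the Summix R package's implementation. It assumes a least-squares objective (observed AFs modeled as a weighted sum of reference-group AFs, with non-negative weights summing to 1) and uses a simple projected-gradient solver chosen for illustration. The reference frequencies and proportions are made-up toy values, and all function names are hypothetical.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {p : p_k >= 0, sum(p) == 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for j, u_j in enumerate(u, start=1):
        cumulative += u_j
        t = (cumulative - 1.0) / j
        if u_j - t > 0:
            theta = t  # keep the threshold from the last index that passes
    return [max(x - theta, 0.0) for x in v]


def estimate_proportions(observed, reference, steps=2000):
    """Fit mixture proportions by projected gradient descent.

    observed:  list of N allele frequencies from the summary data
    reference: N rows, each holding K reference-group allele frequencies
    """
    n_groups = len(reference[0])
    pi = [1.0 / n_groups] * n_groups
    # 2 * squared Frobenius norm bounds the gradient's Lipschitz constant,
    # so 1 / lipschitz is a safe fixed step size.
    lipschitz = 2.0 * sum(r * r for row in reference for r in row)
    for _ in range(steps):
        grad = [0.0] * n_groups
        for obs, row in zip(observed, reference):
            resid = sum(p * r for p, r in zip(pi, row)) - obs
            for k, r in enumerate(row):
                grad[k] += 2.0 * resid * r
        pi = project_to_simplex([p - g / lipschitz for p, g in zip(pi, grad)])
    return pi


# Toy, made-up reference allele frequencies: 6 variants x 3 reference groups.
reference = [
    [0.10, 0.80, 0.30],
    [0.90, 0.20, 0.50],
    [0.50, 0.50, 0.05],
    [0.30, 0.70, 0.90],
    [0.75, 0.10, 0.40],
    [0.05, 0.35, 0.60],
]
true_pi = [0.6, 0.3, 0.1]  # toy proportions used to simulate the observed AFs
observed = [sum(p * r for p, r in zip(true_pi, row)) for row in reference]

estimate = estimate_proportions(observed, reference)
print([round(p, 4) for p in estimate])
```

Because the toy observed frequencies are an exact mixture of the reference columns, the fit should recover the simulated proportions. Real summary data add sampling noise and reference mismatch, which is where the block-bootstrap confidence intervals mentioned in the abstract become relevant.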

Subject(s)

Data Interpretation, Statistical , Metagenomics/methods , Pedigree , Racial Groups/genetics , Alleles , Computer Simulation , Gene Frequency , Humans , Inheritance Patterns , Software

Human Epidermal Growth Factor Receptor 2-Positive Breast Cancer Is Associated with Indigenous American Ancestry in Latin American Women.

Marker, Katie M; Zavala, Valentina A; Vidaurre, Tatiana; Lott, Paul C; Vásquez, Jeannie Navarro; Casavilca-Zambrano, Sandro; Calderón, Mónica; Abugattas, Julio E; Gómez, Henry L; Fuentes, Hugo A; Picoaga, Ruddy Liendo; Cotrina, Jose M; Neciosup, Silvia P; Castañeda, Carlos A; Morante, Zaida; Valencia, Fernando; Torres, Javier; Echeverry, Magdalena; Bohórquez, Mabel E; Polanco-Echeverry, Guadalupe; Estrada-Florez, Ana P; Serrano-Gómez, Silvia J; Carmona-Valencia, Jenny A; Alvarado-Cabrero, Isabel; Sanabria-Salas, María Carolina; Velez, Alejandro; Donado, Jorge; Song, Sikai; Cherry, Daniel; Tamayo, Lizeth I; Huntsman, Scott; Hu, Donglei; Ruiz-Cordero, Roberto; Balassanian, Ronald; Ziv, Elad; Zabaleta, Jovanny; Carvajal-Carmona, Luis; Fejerman, Laura.

Cancer Res ; 80(9): 1893-1901, 2020 05 01.

Article in English | MEDLINE | ID: mdl-32245796

ABSTRACT

Women of Latin American origin in the United States are more likely to be diagnosed with advanced breast cancer and have a higher risk of mortality than non-Hispanic White women. Studies in U.S. Latinas and Latin American women have reported a high incidence of HER2 positive (+) tumors; however, the factors contributing to this observation are unknown. Genome-wide genotype data for 1,312 patients from the Peruvian Genetics and Genomics of Breast Cancer Study (PEGEN-BC) were used to estimate genetic ancestry. We tested the association between HER2 status and genetic ancestry using logistic and multinomial logistic regression models. Findings were replicated in 616 samples from Mexico and Colombia. Average Indigenous American (IA) ancestry differed by subtype. In multivariate models, the odds of having an HER2+ tumor increased by a factor of 1.20 with every 10% increase in IA ancestry proportion (95% CI, 1.07-1.35; P = 0.001). The association between HER2 status and IA ancestry was independently replicated in samples from Mexico and Colombia. Results suggest that the high prevalence of HER2+ tumors in Latinas could be due in part to the presence of population-specific genetic variant(s) affecting HER2 expression in breast cancer. SIGNIFICANCE: The positive association between Indigenous American genetic ancestry and HER2+ breast cancer suggests that the high incidence of HER2+ subtypes in Latinas might be due to population and subtype-specific genetic risk variants.
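In a logistic model the log-odds are linear in the predictor, so an odds ratio reported per fixed increment compounds multiplicatively over larger increments. The sketch below rescales the reported factor of 1.20 per 10% increase in IA ancestry to a 30% increase. It assumes that linearity holds across the range considered, and the function name is illustrative rather than taken from the paper.

```python
import math


def rescale_odds_ratio(or_per_step: float, step: float, delta: float) -> float:
    """Odds ratio for a change of `delta`, given the odds ratio per `step`.

    Both `step` and `delta` are in the same units (here, percentage points
    of ancestry proportion).
    """
    return math.exp(math.log(or_per_step) * delta / step)


# Reported: odds of an HER2+ tumor multiply by 1.20 per 10-point increase.
or_30 = rescale_odds_ratio(1.20, step=10, delta=30)
print(f"odds ratio for a 30-point increase: {or_30:.3f}")
```

The same rescaling applies to each confidence bound separately, since the interval is constructed on the log-odds scale.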

Subject(s)

Breast Neoplasms/chemistry , Breast Neoplasms/ethnology , Hispanic or Latino/genetics , Receptor, ErbB-2/analysis , Adult , Aged , Asian People/ethnology , Asian People/statistics & numerical data , Black People/ethnology , Black People/statistics & numerical data , Breast Neoplasms/genetics , Colombia/ethnology , Female , Humans , Indians, North American , Indians, South American , Latin America/ethnology , Linear Models , Logistic Models , Mexico/ethnology , Middle Aged , Peru/ethnology , Receptor, ErbB-2/genetics , Receptors, Estrogen/blood , Receptors, Progesterone/blood , United States , White People/ethnology , White People/statistics & numerical data , Young Adult
