Search | VHL Regional Portal

1.

Proteomic networks and related genetic variants associated with smoking and chronic obstructive pulmonary disease.

Konigsberg, Iain R; Vu, Thao; Liu, Weixuan; Litkowski, Elizabeth M; Pratte, Katherine A; Vargas, Luciana B; Gilmore, Niles; Abdel-Hafiz, Mohamed; Manichaikul, Ani; Cho, Michael H; Hersh, Craig P; DeMeo, Dawn L; Banaei-Kashani, Farnoush; Bowler, Russell P; Lange, Leslie A; Kechris, Katerina J.

BMC Genomics ; 25(1): 825, 2024 Sep 02.

Article in English | MEDLINE | ID: mdl-39223457

ABSTRACT

BACKGROUND: Studies have identified individual blood biomarkers associated with chronic obstructive pulmonary disease (COPD) and related phenotypes. However, complex diseases such as COPD typically involve changes in multiple molecules with interconnections that may not be captured when considering single molecular features. METHODS: Leveraging proteomic data from 3,173 COPDGene Non-Hispanic White (NHW) and African American (AA) participants, we applied sparse multiple canonical correlation network analysis (SmCCNet) to 4,776 proteins assayed on the SomaScan v4.0 platform to derive sparse networks of proteins associated with current vs. former smoking status, airflow obstruction, and emphysema quantitated from high-resolution computed tomography scans. We then used NetSHy, a dimension reduction technique leveraging network topology, to produce summary scores of each proteomic network, referred to as NetSHy scores. We next performed a genome-wide association study (GWAS) to identify variants associated with the NetSHy scores, or network quantitative trait loci (nQTLs). Finally, we evaluated the replicability of the networks in an independent cohort, SPIROMICS. RESULTS: We identified networks of 13 to 104 proteins for each phenotype and exposure in NHW and AA, and the derived NetSHy scores significantly associated with the variables of interest. Networks included known (sRAGE, ALPP, MIP1) and novel molecules (CA10, CPB1, HIS3, PXDN) and interactions involved in COPD pathogenesis. We observed 7 nQTL loci associated with NetSHy scores, 4 of which remained after conditional analysis. Networks for smoking status and emphysema, but not airflow obstruction, demonstrated a high degree of replicability across race groups and cohorts. CONCLUSIONS: In this work, we apply state-of-the-art molecular network generation and summarization approaches to proteomic data from COPDGene participants to uncover protein networks associated with COPD phenotypes. We further identify genetic associations with networks. This work discovers protein networks containing known and novel proteins and protein interactions associated with clinically relevant COPD phenotypes across race groups and cohorts.

Subject(s)

Genome-Wide Association Study , Proteomics , Pulmonary Disease, Chronic Obstructive , Smoking , Humans , Pulmonary Disease, Chronic Obstructive/genetics , Smoking/genetics , Male , Female , Middle Aged , Aged , Quantitative Trait Loci , Phenotype , Polymorphism, Single Nucleotide , Genetic Variation

2.

Differentially methylated regions interrogated for metastable epialleles associate with offspring adiposity.

Waldrop, Stephanie W; Sauder, Katherine A; Niemiec, Sierra S; Kechris, Katerina J; Yang, Ivana V; Starling, Anne P; Perng, Wei; Dabelea, Dana; Borengasser, Sarah J.

Epigenomics ; : 1-16, 2024 Sep 12.

Article in English | MEDLINE | ID: mdl-39263873

ABSTRACT

Aim: Assess if cord blood differentially methylated regions (DMRs) representing human metastable epialleles (MEs) associate with offspring adiposity in 588 maternal-infant dyads from the Colorado Healthy Start Study. Materials & methods: DNA methylation was assessed via the Illumina 450K array (~439,500 CpG sites). Offspring adiposity was obtained via air displacement plethysmography. Linear regression modeled the association of DMRs potentially representing MEs with adiposity. Results & conclusion: We identified two potential MEs: ZFP57, which associated with infant adiposity change, and B4GALNT4, which associated with infancy and childhood adiposity change. Nine DMRs annotating to genes that annotated to MEs associated with change in offspring adiposity (false discovery rate <0.05). Methylation of approximately 80% of DMRs identified associated with decreased change in adiposity.


3.

Dynamic and prognostic proteomic associations with FEV₁ decline in chronic obstructive pulmonary disease.

Ruvuna, Lisa; Hijazi, Kahkeshan; Guzman, Daniel E; Guo, Claire; Loureiro, Joseph; Khokhlovich, Edward; Morris, Melody; Obeidat, Ma'en; Pratte, Katherine A; DiLillo, Katarina M; Sharma, Sunita; Kechris, Katerina; Anzueto, Antonio; Barjaktarevic, Igor; Bleecker, Eugene R; Casaburi, Richard; Comellas, Alejandro; Cooper, Christopher B; DeMeo, Dawn L; Foreman, Marilyn; Flenaugh, Eric L; Han, MeiLan K; Hanania, Nicola A; Hersh, Craig P; Krishnan, Jerry A; Labaki, Wassim W; Martinez, Fernando J; O'Neal, Wanda K; Paine, Robert; Peters, Stephen P; Woodruff, Prescott G; Wells, J Michael; Wendt, Christine H; Arnold, Kelly B; Barr, R Graham; Curtis, Jeffrey L; Ngo, Debby; Bowler, Russell P.

medRxiv ; 2024 Aug 08.

Article in English | MEDLINE | ID: mdl-39148837

ABSTRACT

Rationale: Identification and validation of circulating biomarkers for lung function decline in COPD remains an unmet need. Objective: Identify prognostic and dynamic plasma protein biomarkers of COPD progression. Methods: We measured plasma proteins using SomaScan from two COPD-enriched cohorts, the Subpopulations and Intermediate Outcomes Measures in COPD Study (SPIROMICS) and Genetic Epidemiology of COPD (COPDGene), and one population-based cohort, Multi-Ethnic Study of Atherosclerosis (MESA) Lung. Using SPIROMICS as a discovery cohort, linear mixed models identified baseline proteins that predicted future change in FEV1 (prognostic model) and proteins whose expression changed with change in lung function (dynamic model). Findings were replicated in COPDGene and MESA-Lung. Using the COPD-enriched cohorts, Gene Set Enrichment Analysis (GSEA) identified proteins shared between COPDGene and SPIROMICS. Metascape identified significant associated pathways. Measurements and Main Results: The prognostic model found 7 significant proteins in common (p < 0.05) among all 3 cohorts. After applying false discovery rate (adjusted p < 0.2), leptin remained significant in all three cohorts and growth hormone receptor remained significant in the two COPD cohorts. Elevated baseline levels of leptin and growth hormone receptor were associated with slower rate of decline in FEV1. Twelve proteins were nominally but not FDR significant in the dynamic model and all were distinct from the prognostic model. Metascape identified several immune related pathways unique to prognostic and dynamic proteins. Conclusion: We identified leptin as the most reproducible COPD progression biomarker. The difference between prognostic and dynamic proteins suggests disease activity signatures may be different from prognosis signatures.
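The false discovery rate step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the Benjamini-Hochberg procedure, which the abstract does not name, and the p-values are hypothetical placeholders rather than results from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical nominal p-values for five proteins (illustrative only).
nominal = [0.001, 0.04, 0.03, 0.2, 0.5]
adjusted = benjamini_hochberg(nominal)

# Threshold taken from the abstract (adjusted p < 0.2).
significant = [q < 0.2 for q in adjusted]
```

Under this assumption, a protein can be nominally significant (p < 0.05) yet fail the adjusted threshold once the number of tests grows, which is the distinction the abstract draws between nominal and FDR significance.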

4.

SmCCNet 2.0: a comprehensive tool for multi-omics network inference with shiny visualization.

Liu, Weixuan; Vu, Thao; Konigsberg, Iain R; Pratte, Katherine A; Zhuang, Yonghua; Kechris, Katerina J.

BMC Bioinformatics ; 25(1): 276, 2024 Aug 24.

Article in English | MEDLINE | ID: mdl-39179997

ABSTRACT

Sparse multiple canonical correlation network analysis (SmCCNet) is a machine learning technique for integrating omics data along with a variable of interest (e.g., phenotype of complex disease), and reconstructing multi-omics networks that are specific to this variable. We present the second-generation SmCCNet (SmCCNet 2.0) that adeptly integrates single or multiple omics data types along with a quantitative or binary phenotype of interest. In addition, this new package offers a streamlined setup process that can be configured manually or automatically, ensuring a flexible and user-friendly experience. AVAILABILITY: This package is available in both CRAN: https://cran.r-project.org/web/packages/SmCCNet/index.html and Github: https://github.com/KechrisLab/SmCCNet under the MIT license. The network visualization tool is available at https://smccnet.shinyapps.io/smccnetnetwork/.

Subject(s)

Machine Learning , Software , Genomics/methods , Gene Regulatory Networks , Computational Biology/methods , Humans , Multiomics

5.

Limitations of Clustering with PCA and Correlated Noise.

Lippitt, William; Carlson, Nichole E; Arbet, Jaron; Fingerlin, Tasha E; Maier, Lisa A; Kechris, Katerina.

J Stat Comput Simul ; 94(10): 2291-2319, 2024.

Article in English | MEDLINE | ID: mdl-39176071

ABSTRACT

It is now common to have a modest to large number of features on individuals with complex diseases. Unsupervised analyses, such as clustering with and without preprocessing by Principal Component Analysis (PCA), are widely used in practice to uncover subgroups in a sample. However, in many modern studies features are often highly correlated and noisy (e.g. SNPs, -omics, quantitative imaging markers, and electronic health record data). The practical performance of clustering approaches in these settings remains unclear. Through extensive simulations and empirical examples applying Gaussian Mixture Models and related clustering methods, we show these approaches (including variants of kmeans, VarSelLCM, HDClassifier, and Fisher-EM) can have very poor performance in many settings. We also show the poor performance is often driven by either an explicit or implicit assumption by the clustering algorithm that high variance features are relevant while lower variance features are irrelevant, called the variance as relevance assumption. We develop practical pre-processing approaches that improve analysis performance in some cases. This work offers practical guidance on the strengths and limitations of unsupervised clustering approaches in modern data analysis applications.
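The variance as relevance assumption described in the abstract above can be seen in a toy two-feature sketch. This is not the authors' simulation design: the feature scales, sample size, and the crude split-at-zero stand-in for a two-cluster algorithm are all assumptions chosen for illustration.

```python
import math
import random

rng = random.Random(0)
n_per_cluster = 200

# Feature 0 separates the two clusters but has low variance;
# feature 1 is high-variance noise unrelated to the clusters.
labels, data = [], []
for label, centre in ((0, -1.0), (1, 1.0)):
    for _ in range(n_per_cluster):
        data.append((rng.gauss(centre, 0.2), rng.gauss(0.0, 5.0)))
        labels.append(label)

n = len(data)
means = [sum(row[j] for row in data) / n for j in range(2)]
centred = [(x - means[0], y - means[1]) for x, y in data]

# Sample covariance matrix [[a, b], [b, d]].
a = sum(x * x for x, _ in centred) / (n - 1)
d = sum(y * y for _, y in centred) / (n - 1)
b = sum(x * y for x, y in centred) / (n - 1)

# Closed-form leading eigenpair of a symmetric 2x2 matrix gives PC1.
lam = (a + d) / 2 + math.sqrt(((a - d) / 2) ** 2 + b * b)
vec = (b, lam - a)
norm = math.hypot(*vec)
pc1 = (vec[0] / norm, vec[1] / norm)


def split_accuracy(scores):
    """Accuracy of splitting 1-D scores at zero, up to label switching."""
    hits = sum((s > 0) == bool(lab) for s, lab in zip(scores, labels))
    return max(hits, n - hits) / n


# Splitting on PC1 versus splitting on the low-variance signal feature.
acc_pc1 = split_accuracy([x * pc1[0] + y * pc1[1] for x, y in centred])
acc_signal = split_accuracy([x for x, _ in centred])
```

In this toy setting PC1 loads almost entirely on the noisy feature, so a split along PC1 recovers the clusters at roughly chance level, while the low-variance feature separates them almost perfectly.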

6.

Maternal Serum Metabolomics in Mid-Pregnancy Identifies Lipid Pathways as a Key Link to Offspring Obesity in Early Childhood.

Francis, Ellen C; Kechris, Katerina; Johnson, Randi K; Rawal, Shristi; Pathmasiri, Wimal; Rushing, Blake R; Du, Xiuxia; Jansson, Thomas; Dabelea, Dana; Sumner, Susan J; Perng, Wei.

Int J Mol Sci ; 25(14), 2024 Jul 11.

Article in English | MEDLINE | ID: mdl-39062861

ABSTRACT

Maternal metabolism during pregnancy shapes offspring health via in utero programming. In the Healthy Start study, we identified five subgroups of pregnant women based on conventional metabolic biomarkers: Reference (n = 360); High HDL-C (n = 289); Dyslipidemic-High TG (n = 149); Dyslipidemic-High FFA (n = 180); Insulin Resistant (IR)-Hyperglycemic (n = 87). These subgroups not only captured metabolic heterogeneity among pregnant participants but were also associated with offspring obesity in early childhood, even among women without obesity or diabetes. Here, we utilize metabolomics data to enrich characterization of the metabolic subgroups and identify key compounds driving between-group differences. We analyzed fasting blood samples from 1065 pregnant women at 18 gestational weeks using untargeted metabolomics. We used weighted gene correlation network analysis (WGCNA) to derive a global network based on the Reference subgroup and characterized distinct metabolite modules representative of the different metabolomic profiles. We used the mummichog algorithm for pathway enrichment and identified key compounds that differed across the subgroups. Eight metabolite modules representing pathways such as the carnitine-acylcarnitine translocase system, fatty acid biosynthesis and activation, and glycerophospholipid metabolism were identified. A module that included 189 compounds related to DHA peroxidation, oxidative stress, and sex hormone biosynthesis was elevated in the Insulin Resistant-Hyperglycemic vs. the Reference subgroup. This module was positively correlated with total cholesterol (R:0.10; p-value < 0.0001) and free fatty acids (R:0.07; p-value < 0.05). Oxidative stress and inflammatory pathways may underlie insulin resistance during pregnancy, even below clinical diabetes thresholds. These findings highlight potential therapeutic targets and strategies for pregnancy risk stratification and reveal mechanisms underlying the developmental origins of metabolic disease risk.

Subject(s)

Lipid Metabolism , Metabolomics , Humans , Female , Pregnancy , Metabolomics/methods , Adult , Pediatric Obesity/blood , Pediatric Obesity/metabolism , Biomarkers/blood , Insulin Resistance , Child , Prenatal Exposure Delayed Effects/blood , Prenatal Exposure Delayed Effects/metabolism , Child, Preschool , Metabolome

7.

Longitudinal changes in DNA methylation during the onset of islet autoimmunity differentiate between reversion versus progression of islet autoimmunity.

Carry, Patrick M; Vanderlinden, Lauren A; Johnson, Randi K; Buckner, Teresa; Steck, Andrea K; Kechris, Katerina; Yang, Ivana V; Fingerlin, Tasha E; Fiehn, Oliver; Rewers, Marian; Norris, Jill M.

Front Immunol ; 15: 1345494, 2024.

Article in English | MEDLINE | ID: mdl-38915393

ABSTRACT

Background: Type 1 diabetes (T1D) is preceded by a heterogenous pre-clinical phase, islet autoimmunity (IA). We aimed to identify pre vs. post-IA seroconversion (SV) changes in DNAm that differed across three IA progression phenotypes, those who lose autoantibodies (reverters), progress to clinical T1D (progressors), or maintain autoantibody levels (maintainers). Methods: This epigenome-wide association study (EWAS) included longitudinal DNAm measurements in blood (Illumina 450K and EPIC) from participants in Diabetes Autoimmunity Study in the Young (DAISY) who developed IA, one or more islet autoantibodies on at least two consecutive visits. We compared reverters - individuals who sero-reverted, negative for all autoantibodies on at least two consecutive visits and did not develop T1D (n=41); maintainers - continued to test positive for autoantibodies but did not develop T1D (n=60); progressors - developed clinical T1D (n=42). DNAm data were measured before (pre-SV visit) and after IA (post-SV visit). Linear mixed models were used to test for differences in pre- vs post-SV changes in DNAm across the three groups. Linear mixed models were also used to test for group differences in average DNAm. Cell proportions, age, and sex were adjusted for in all models. Median follow-up across all participants was 15.5 yrs. (interquartile range (IQR): 10.8-18.7). Results: The median age at the pre-SV visit was 2.2 yrs. (IQR: 0.8-5.3) in progressors, compared to 6.0 yrs. (IQR: 1.3-8.4) in reverters, and 5.7 yrs. (IQR: 1.4-9.7) in maintainers. Median time between the visits was similar in reverters 1.4 yrs. (IQR: 1-1.9), maintainers 1.3 yrs. (IQR: 1.0-2.0), and progressors 1.8 yrs. (IQR: 1.0-2.0). Changes in DNAm, pre- vs post-SV, differed across the groups at one site (cg16066195) and 11 regions. Average DNAm (mean of pre- and post-SV) differed across 22 regions. 
Conclusion: Differentially changing DNAm regions were located in genomic areas related to beta cell function, immune cell differentiation, and immune cell function.

Subject(s)

Autoantibodies , Autoimmunity , DNA Methylation , Diabetes Mellitus, Type 1 , Disease Progression , Islets of Langerhans , Humans , Diabetes Mellitus, Type 1/immunology , Diabetes Mellitus, Type 1/genetics , Female , Male , Autoimmunity/genetics , Islets of Langerhans/immunology , Autoantibodies/blood , Autoantibodies/immunology , Child , Adolescent , Longitudinal Studies , Child, Preschool , Genome-Wide Association Study , Epigenesis, Genetic

8.

Proteomic Networks and Related Genetic Variants Associated with Smoking and Chronic Obstructive Pulmonary Disease.

Konigsberg, Iain R; Vu, Thao; Liu, Weixuan; Litkowski, Elizabeth M; Pratte, Katherine A; Vargas, Luciana B; Gilmore, Niles; Abdel-Hafiz, Mohamed; Manichaikul, Ani W; Cho, Michael H; Hersh, Craig P; DeMeo, Dawn L; Banaei-Kashani, Farnoush; Bowler, Russell P; Lange, Leslie A; Kechris, Katerina J.

medRxiv ; 2024 Feb 28.

Article in English | MEDLINE | ID: mdl-38464285

ABSTRACT

Background: Studies have identified individual blood biomarkers associated with chronic obstructive pulmonary disease (COPD) and related phenotypes. However, complex diseases such as COPD typically involve changes in multiple molecules with interconnections that may not be captured when considering single molecular features. Methods: Leveraging proteomic data from 3,173 COPDGene Non-Hispanic White (NHW) and African American (AA) participants, we applied sparse multiple canonical correlation network analysis (SmCCNet) to 4,776 proteins assayed on the SomaScan v4.0 platform to derive sparse networks of proteins associated with current vs. former smoking status, airflow obstruction, and emphysema quantitated from high-resolution computed tomography scans. We then used NetSHy, a dimension reduction technique leveraging network topology, to produce summary scores of each proteomic network, referred to as NetSHy scores. We next performed a genome-wide association study (GWAS) to identify variants associated with the NetSHy scores, or network quantitative trait loci (nQTLs). Finally, we evaluated the replicability of the networks in an independent cohort, SPIROMICS. Results: We identified networks of 13 to 104 proteins for each phenotype and exposure in NHW and AA, and the derived NetSHy scores significantly associated with the variables of interest. Networks included known (sRAGE, ALPP, MIP1) and novel molecules (CA10, CPB1, HIS3, PXDN) and interactions involved in COPD pathogenesis. We observed 7 nQTL loci associated with NetSHy scores, 4 of which remained after conditional analysis. Networks for smoking status and emphysema, but not airflow obstruction, demonstrated a high degree of replicability across race groups and cohorts. Conclusions: In this work, we apply state-of-the-art molecular network generation and summarization approaches to proteomic data from COPDGene participants to uncover protein networks associated with COPD phenotypes. We further identify genetic associations with networks. This work discovers protein networks containing known and novel proteins and protein interactions associated with clinically relevant COPD phenotypes across race groups and cohorts.

9.

PathIntegrate: Multivariate modelling approaches for pathway-based multi-omics data integration.

Wieder, Cecilia; Cooke, Juliette; Frainay, Clement; Poupin, Nathalie; Bowler, Russell; Jourdan, Fabien; Kechris, Katerina J; Lai, Rachel PJ; Ebbels, Timothy.

PLoS Comput Biol ; 20(3): e1011814, 2024 Mar.

Article in English | MEDLINE | ID: mdl-38527092

ABSTRACT

As terabytes of multi-omics data are being generated, there is an ever-increasing need for methods facilitating the integration and interpretation of such data. Current multi-omics integration methods typically output lists, clusters, or subnetworks of molecules related to an outcome. Even with expert domain knowledge, discerning the biological processes involved is a time-consuming activity. Here we propose PathIntegrate, a method for integrating multi-omics datasets based on pathways, designed to exploit knowledge of biological systems and thus provide interpretable models for such studies. PathIntegrate employs single-sample pathway analysis to transform multi-omics datasets from the molecular to the pathway-level, and applies a predictive single-view or multi-view model to integrate the data. Model outputs include multi-omics pathways ranked by their contribution to the outcome prediction, the contribution of each omics layer, and the importance of each molecule in a pathway. Using semi-synthetic data we demonstrate the benefit of grouping molecules into pathways to detect signals in low signal-to-noise scenarios, as well as the ability of PathIntegrate to precisely identify important pathways at low effect sizes. Finally, using COPD and COVID-19 data we showcase how PathIntegrate enables convenient integration and interpretation of complex high-dimensional multi-omics datasets. PathIntegrate is available as an open-source Python package.

Subject(s)

Genomics , Multiomics , Genomics/methods

10.

A Generalized Higher-order Correlation Analysis Framework for Multi-Omics Network Inference.

Liu, Weixuan; Pratte, Katherine A; Castaldi, Peter J; Hersh, Craig; Bowler, Russell P; Banaei-Kashani, Farnoush; Kechris, Katerina J.

bioRxiv ; 2024 Jan 25.

Article in English | MEDLINE | ID: mdl-38328226

ABSTRACT

Multiple -omics (genomics, proteomics, etc.) profiles are commonly generated to gain insight into a disease or physiological system. Constructing multi-omics networks with respect to the trait(s) of interest provides an opportunity to understand relationships between molecular features but integration is challenging due to multiple data sets with high dimensionality. One approach is to use canonical correlation to integrate one or two omics types and a single trait of interest. However, these types of methods may be limited due to (1) not accounting for higher-order correlations existing among features, (2) computational inefficiency when extending to more than two omics data when using a penalty term-based sparsity method, and (3) lack of flexibility for focusing on specific correlations (e.g., omics-to-phenotype correlation versus omics-to-omics correlations). In this work, we have developed a novel multi-omics network analysis pipeline called Sparse Generalized Tensor Canonical Correlation Analysis Network Inference (SGTCCA-Net) that can effectively overcome these limitations. We also introduce an implementation to improve the summarization of networks for downstream analyses. Simulation and real-data experiments demonstrate the effectiveness of our novel method for inferring omics networks and features of interest.

11.

PathIntegrate: Multivariate modelling approaches for pathway-based multi-omics data integration.

Wieder, Cecilia; Cooke, Juliette; Frainay, Clement; Poupin, Nathalie; Bowler, Russell; Jourdan, Fabien; Kechris, Katerina J; Lai, Rachel PJ; Ebbels, Timothy.

bioRxiv ; 2024 Jan 09.

Article in English | MEDLINE | ID: mdl-38260498

ABSTRACT

As terabytes of multi-omics data are being generated, there is an ever-increasing need for methods facilitating the integration and interpretation of such data. Current multi-omics integration methods typically output lists, clusters, or subnetworks of molecules related to an outcome. Even with expert domain knowledge, discerning the biological processes involved is a time-consuming activity. Here we propose PathIntegrate, a method for integrating multi-omics datasets based on pathways, designed to exploit knowledge of biological systems and thus provide interpretable models for such studies. PathIntegrate employs single-sample pathway analysis to transform multi-omics datasets from the molecular to the pathway-level, and applies a predictive single-view or multi-view model to integrate the data. Model outputs include multi-omics pathways ranked by their contribution to the outcome prediction, the contribution of each omics layer, and the importance of each molecule in a pathway. Using semi-synthetic data we demonstrate the benefit of grouping molecules into pathways to detect signals in low signal-to-noise scenarios, as well as the ability of PathIntegrate to precisely identify important pathways at low effect sizes. Finally, using COPD and COVID-19 data we showcase how PathIntegrate enables convenient integration and interpretation of complex high-dimensional multi-omics datasets. The PathIntegrate Python package is available at https://github.com/cwieder/PathIntegrate.

12.

Cord blood DNA methylation of immune and lipid metabolism genes is associated with maternal triglycerides and child adiposity.

Waldrop, Stephanie W; Niemiec, Sierra; Wood, Cheyret; Gyllenhammer, Lauren E; Jansson, Thomas; Friedman, Jacob E; Tryggestad, Jeanie B; Borengasser, Sarah J; Davidson, Elizabeth J; Yang, Ivana V; Kechris, Katerina; Dabelea, Dana; Boyle, Kristen E.

Obesity (Silver Spring) ; 32(1): 187-199, 2024 Jan.

Article in English | MEDLINE | ID: mdl-37869908

ABSTRACT

OBJECTIVE: Fetal exposures may impact offspring epigenetic signatures and adiposity. The authors hypothesized that maternal metabolic traits associate with cord blood DNA methylation, which, in turn, associates with child adiposity. METHODS: Fasting serum was obtained in 588 pregnant women (27-34 weeks' gestation), and insulin, glucose, high-density lipoprotein cholesterol, triglycerides, and free fatty acids were measured. Cord blood DNA methylation and child adiposity were measured at birth, 4-6 months, and 4-6 years. The association of maternal metabolic traits with DNA methylation (429,246 CpGs) for differentially methylated probes (DMPs) and regions (DMRs) was tested. The association of the first principal component of each DMR with child adiposity was tested, and mediation analysis was performed. RESULTS: Maternal triglycerides were associated with the most DMPs and DMRs of all traits tested (261 and 198, respectively, false discovery rate < 0.05). DMRs were near genes involved in immune function and lipid metabolism. Triglyceride-associated CpGs were associated with child adiposity at 4-6 months (32 CpGs) and 4-6 years (2 CpGs). One, near CD226, was observed at both timepoints, mediating 10% and 22% of the relationship between maternal triglycerides and child adiposity at 4-6 months and 4-6 years, respectively. CONCLUSIONS: DNA methylation may play a role in the association of maternal triglycerides and child adiposity.

Subject(s)

Adiposity , DNA Methylation , Infant, Newborn , Child , Humans , Female , Pregnancy , Triglycerides , Adiposity/genetics , Lipid Metabolism/genetics , Fetal Blood/metabolism , Obesity/metabolism

13.

SmCCNet 2.0: A Comprehensive Tool for Multi-omics Network Inference with Shiny Visualization.

Liu, Weixuan; Vu, Thao; Konigsberg, Iain; Pratte, Katherine; Zhuang, Yonghua; Kechris, Katerina.

bioRxiv ; 2024 Apr 07.

Article in English | MEDLINE | ID: mdl-38045372

ABSTRACT

Summary: Sparse multiple canonical correlation network analysis (SmCCNet) is a machine learning technique for integrating omics data along with a variable of interest (e.g., phenotype of complex disease), and reconstructing multi-omics networks that are specific to this variable. We present the second-generation SmCCNet (SmCCNet 2.0) that adeptly integrates single or multiple omics data types along with a quantitative or binary phenotype of interest. In addition, this new package offers a streamlined setup process that can be configured manually or automatically, ensuring a flexible and user-friendly experience. Availability: This package is available in both CRAN: https://cran.r-project.org/web/packages/SmCCNet/index.html and Github: https://github.com/KechrisLab/SmCCNet under the MIT license. The network visualization tool is available at https://smccnet.shinyapps.io/smccnetnetwork/.

14.

Corrigendum to "Prenatal exposure to per- and polyfluoroalkyl substances and infant growth and adiposity: the Healthy Start Study" [Environ. Int. 131 (2019) 104983].

Starling, Anne P; Adgate, John L; Hamman, Richard F; Kechris, Katerina; Calafat, Antonia M; Dabelea, Dana.

Environ Int ; 185: 108397, 2024 Mar.

Article in English | MEDLINE | ID: mdl-38129226

15.

Erratum: "Prenatal Exposure to per- and Polyfluoroalkyl Substances, Umbilical Cord Blood DNA Methylation, and Cardio-Metabolic Indicators in Newborns: The Healthy Start Study".

Starling, Anne P; Liu, Cuining; Shen, Guannan; Yang, Ivana V; Kechris, Katerina; Borengasser, Sarah J; Boyle, Kristen E; Zhang, Weiming; Smith, Harry A; Calafat, Antonia M; Hamman, Richard F; Adgate, John L; Dabelea, Dana.

Environ Health Perspect ; 131(11): 119001, 2023 Nov.

Article in English | MEDLINE | ID: mdl-38033175

16.

TreeKernel: interpretable kernel machine tests for interactions between -omics and clinical predictors with applications to metabolomics and COPD phenotypes.

Carpenter, Charlie M; Gillenwater, Lucas; Bowler, Russell; Kechris, Katerina; Ghosh, Debashis.

BMC Bioinformatics ; 24(1): 398, 2023 Oct 25.

Article in English | MEDLINE | ID: mdl-37880571

ABSTRACT

BACKGROUND: In this paper, we are interested in interactions between a high-dimensional -omics dataset and clinical covariates. The goal is to evaluate the relationship between a phenotype of interest and a high-dimensional omics pathway, where the effect of the omics data depends on subjects' clinical covariates (age, sex, smoking status, etc.). For instance, metabolic pathways can vary greatly between sexes which may also change the relationship between certain metabolic pathways and a clinical phenotype of interest. We propose partitioning the clinical covariate space and performing a kernel association test within those partitions. To illustrate this idea, we focus on hierarchical partitions of the clinical covariate space and kernel tests on metabolic pathways. RESULTS: We see that our proposed method outperforms competing methods in most simulation scenarios. It can identify different relationships among clinical groups with higher power in most scenarios while maintaining a proper Type I error rate. The simulation studies also show a robustness to the grouping structure within the clinical space. We also apply the method to the COPDGene study and find several clinically meaningful interactions between metabolic pathways, the clinical space, and lung function. CONCLUSION: TreeKernel provides a simple and interpretable process for testing for relationships between high-dimensional omics data and clinical outcomes in the presence of interactions within clinical cohorts. The method is broadly applicable to many studies.

Subject(s)

Pulmonary Disease, Chronic Obstructive , Humans , Phenotype , Computer Simulation

17.

Accelerated epigenetic age at birth and child emotional and behavioural development in early childhood: a meta-analysis of four prospective cohort studies in ECHO.

Song, Ashley Y; Bulka, Catherine M; Niemiec, Sierra S; Kechris, Katerina; Boyle, Kristen E; Marsit, Carmen J; O'Shea, T Michael; Fry, Rebecca C; Lyall, Kristen; Fallin, M Daniele; Volk, Heather E; Ladd-Acosta, Christine.

Epigenetics ; 18(1): 2254971, 2023 Dec.

Article in English | MEDLINE | ID: mdl-37691382

ABSTRACT

Background: 'Epigenetic clocks' have been developed to accurately predict chronologic gestational age and have been associated with child health outcomes in prior work. Methods: We meta-analysed results from four prospective U.S. cohorts investigating the association between epigenetic age acceleration estimated using blood DNA methylation collected at birth and preschool-age Child Behavior Checklist (CBCL) scores. Results: Epigenetic ageing was not significantly associated with CBCL total problem scores (β = 0.33, 95% CI: -0.95, 0.28) and DSM-oriented pervasive development problem scores (β = -0.23, 95% CI: -0.61, 0.15). No associations were observed for other DSM-oriented subscales. Conclusions: The meta-analysis results suggest that epigenetic gestational age acceleration is not associated with child emotional and behavioural functioning in the preschool age group. These findings may relate to our study population, which includes two cohorts enriched for ASD and one preterm birth cohort; future work should address the role of epigenetic age in child health in other study populations. Abbreviations: DNAm: DNA methylation; CBCL: Child Behavior Checklist; ECHO: Environmental Influences on Child Health Outcomes; EARLI: Early Autism Risk Longitudinal Investigation; MARBLES: Markers of Autism Risk in Babies - Learning Early Signs; ELGAN: Extremely Low Gestational Age Newborns; ASD: autism spectrum disorder; BMI: body mass index; DSM: Diagnostic and Statistical Manual of Mental Disorders.

Subject(s)

Autism Spectrum Disorder , Premature Birth , Child, Preschool , Humans , Infant, Newborn , DNA Methylation , Epigenesis, Genetic , Prospective Studies

18.

Quantifying the spatial clustering characteristics of radiographic emphysema explains variability in pulmonary function.

Vestal, Brian E; Ghosh, Debashis; Estépar, Raúl San José; Kechris, Katerina; Fingerlin, Tasha; Carlson, Nichole E.

Sci Rep ; 13(1): 13862, 2023 08 24.

Article in English | MEDLINE | ID: mdl-37620507

ABSTRACT

Quantitative assessment of emphysema in CT scans has mostly focused on calculating the percentage of lung tissue that is deemed abnormal based on a density thresholding strategy. However, this overall measure of disease burden discards virtually all the spatial information encoded in the scan that is implicitly utilized in a visual assessment. This simplification is likely grouping heterogeneous disease patterns and is potentially obscuring clinical phenotypes and variable disease outcomes. To overcome this, several methods that attempt to quantify heterogeneity in emphysema distribution have been proposed. Here, we compare three of those: one based on estimating a power law for the size distribution of contiguous emphysema clusters, a second that looks at the number of emphysema-to-emphysema voxel adjacencies, and a third that applies a parametric spatial point process model to the emphysema voxel locations. This was done using data from 587 individuals from Phase 1 of COPDGene that had an inspiratory CT scan and plasma protein abundance measurements. The associations between these imaging metrics and visual assessment with clinical measures (FEV1, FEV1-FVC ratio, etc.) and plasma protein biomarker levels were evaluated using a variety of regression models. Our results showed that a selection of spatial measures had the ability to discern heterogeneous patterns among CTs that had similar emphysema burdens. The most informative quantitative measure, average cluster size from the point process model, showed much stronger associations with nearly every clinical outcome examined than existing CT-derived emphysema metrics and visual assessment. Moreover, approximately 75% more plasma biomarkers were found to be associated with an emphysema heterogeneity phenotype when accounting for spatial clustering measures than when they were excluded.

Subject(s)

Emphysema , Pulmonary Emphysema , Humans , Pulmonary Emphysema/diagnostic imaging , Emphysema/diagnostic imaging , Benchmarking , Lung/diagnostic imaging , Cluster Analysis

19.

Adipocyte hypertrophy in mesenchymal stem cells from infants of mothers with obesity.

Keleher, Madeline Rose; Shubhangi, Shreya; Brown, Asya; Duensing, Allison M; Lixandrão, Manoel E; Gavin, Kathleen M; Smith, Harry A; Kechris, Katerina J; Yang, Ivana V; Dabelea, Dana; Boyle, Kristen E.

Obesity (Silver Spring) ; 31(8): 2090-2102, 2023 08.

Article in English | MEDLINE | ID: mdl-37475691

ABSTRACT

OBJECTIVE: Fat content of adipocytes derived from infant umbilical cord mesenchymal stem cells (MSCs) predicts adiposity in children through 4 to 6 years of age. This study tested the hypothesis that MSCs from infants born to mothers with obesity (Ob-MSCs) exhibit adipocyte hypertrophy and perturbations in genes regulating adipogenesis compared with MSCs from infants of mothers with normal weight (NW-MSCs). METHODS: Adipogenesis was induced in MSCs embedded in three-dimensional hydrogel structures, and cell size and number were measured by three-dimensional imaging. Proliferation and protein markers of proliferation and adipogenesis in undifferentiated and adipocyte differentiating cells were measured. RNA sequencing was performed to determine pathways linked to adipogenesis phenotype. RESULTS: In undifferentiated MSCs, greater zinc finger protein (Zfp)423 protein content was observed in Ob- versus NW-MSCs. Adipocytes from Ob-MSCs were larger but fewer than adipocytes from NW-MSCs. RNA sequencing analysis showed that Zfp423 protein correlated with mRNA expression of genes enriched for cell cycle, MSC lineage specification, inflammation, and metabolism pathways. MSC proliferation was not different before differentiation but declined faster in Ob-MSCs upon adipogenic induction. CONCLUSIONS: Ob-MSCs have an intrinsic propensity for adipocyte hypertrophy and reduced hyperplasia during adipogenesis in vitro, perhaps linked to greater Zfp423 content and changes in cell cycle pathway gene expression.

Subject(s)

Mesenchymal Stem Cells , Mothers , Female , Humans , Obesity/genetics , Obesity/metabolism , Cell Differentiation/genetics , Adipogenesis/genetics , Mesenchymal Stem Cells/metabolism , Transcription Factors/metabolism , Adipocytes/metabolism , Hypertrophy/metabolism

20.

Large scale proteomic studies create novel privacy considerations.

Hill, Andrew C; Guo, Claire; Litkowski, Elizabeth M; Manichaikul, Ani W; Yu, Bing; Konigsberg, Iain R; Gorbet, Betty A; Lange, Leslie A; Pratte, Katherine A; Kechris, Katerina J; DeCamp, Matthew; Coors, Marilyn; Ortega, Victor E; Rich, Stephen S; Rotter, Jerome I; Gerzsten, Robert E; Clish, Clary B; Curtis, Jeffrey L; Hu, Xiaowei; Obeidat, Ma-En; Morris, Melody; Loureiro, Joseph; Ngo, Debby; O'Neal, Wanda K; Meyers, Deborah A; Bleecker, Eugene R; Hobbs, Brian D; Cho, Michael H; Banaei-Kashani, Farnoush; Bowler, Russell P.

Sci Rep ; 13(1): 9254, 2023 06 07.

Article in English | MEDLINE | ID: mdl-37286633

ABSTRACT

Privacy protection is a core principle of genomic but not proteomic research. We identified independent single nucleotide polymorphism (SNP) protein quantitative trait loci (pQTL) from COPDGene and Jackson Heart Study (JHS), calculated continuous protein level genotype probabilities, and then applied a naïve Bayesian approach to link SomaScan 1.3K proteomes to genomes for 2812 independent subjects from COPDGene, JHS, SubPopulations and InteRmediate Outcome Measures In COPD Study (SPIROMICS) and Multi-Ethnic Study of Atherosclerosis (MESA). We linked 90-95% of proteomes to their correct genome, and for 95-99% the correct genome was among the 1% most likely links. The linking accuracy in subjects with African ancestry was lower (~60%) unless training included diverse subjects. With larger profiling (SomaScan 5K) in the Atherosclerosis Risk in Communities (ARIC) study, correct identification was >99% even in mixed ancestry populations. We also linked proteomes-to-proteomes and used the proteome only to determine features such as sex, ancestry, and first-degree relatives. When serial proteomes are available, the linking algorithm can be used to identify and correct mislabeled samples. This work also demonstrates the importance of including diverse populations in omics research and that large proteomic datasets (>1000 proteins) can be accurately linked to a specific genome through pQTL knowledge and should not be considered unidentifiable.

Subject(s)

Atherosclerosis , Proteome , Humans , Proteome/genetics , Bayes Theorem , Privacy , Genome-Wide Association Study , Atherosclerosis/genetics , Polymorphism, Single Nucleotide
