Search | VHL Regional Portal

Statistical integration of multi-omics and drug screening data from cell lines.

El Bouhaddani, Said; Höllerhage, Matthias; Uh, Hae-Won; Moebius, Claudia; Bickle, Marc; Höglinger, Günter; Houwing-Duistermaat, Jeanine.

PLoS Comput Biol ; 20(1): e1011809, 2024 Jan.

Article in English | MEDLINE | ID: mdl-38295113

ABSTRACT

Data integration methods are used to obtain a unified summary of multiple datasets. For multi-modal data, we propose a computational workflow to jointly analyze datasets from cell lines. The workflow comprises a novel probabilistic data integration method, named POPLS-DA, for multi-omics data. The workflow is motivated by a study on synucleinopathies where transcriptomics, proteomics, and drug screening data are measured in affected LUHMES cell lines and controls. The aim is to highlight potentially druggable pathways and genes involved in synucleinopathies. First, POPLS-DA is used to prioritize genes and proteins that best distinguish cases and controls. For these genes, an integrated interaction network is constructed where the drug screen data is incorporated to highlight druggable genes and pathways in the network. Finally, functional enrichment analyses are performed to identify clusters of synaptic and lysosome-related genes and proteins targeted by the protective drugs. POPLS-DA is compared to other single- and multi-omics approaches. We found that HSPA5, a member of the heat shock protein 70 family, was one of the most targeted genes by the validated drugs, in particular by AT1-blockers. HSPA5 and AT1-blockers have been previously linked to α-synuclein pathology and Parkinson's disease, showing the relevance of our findings. Our computational workflow identified new directions for therapeutic targets for synucleinopathies. POPLS-DA provided a larger interpretable gene set than other single- and multi-omic approaches. An implementation based on R and markdown is freely available online.

Subject(s)

Computational Biology , Synucleinopathies , Humans , Computational Biology/methods , Multiomics , Drug Evaluation, Preclinical , Proteomics/methods

Digital Twins in Healthcare: Methodological Challenges and Opportunities.

Meijer, Charles; Uh, Hae-Won; El Bouhaddani, Said.

J Pers Med ; 13(10)2023 Oct 23.

Article in English | MEDLINE | ID: mdl-37888133

ABSTRACT

One of the most promising advancements in healthcare is the application of digital twin technology, offering valuable applications in monitoring, diagnosis, and development of treatment strategies tailored to individual patients. Furthermore, digital twins could also be helpful in finding novel treatment targets and predicting the effects of drugs and other chemical substances in development. In this review article, we consider digital twins as virtual counterparts of real human patients. The primary aim of this narrative review is to give an in-depth look into the various data sources and methodologies that contribute to the construction of digital twins across several healthcare domains. Each data source, including blood glucose levels, heart MRI and CT scans, cardiac electrophysiology, written reports, and multi-omics data, comes with different challenges regarding standardization, integration, and interpretation. We showcase how various datasets and methods are used to overcome these obstacles and generate a digital twin. While digital twin technology has seen significant progress, there are still hurdles in the way to achieving a fully comprehensive patient digital twin. Developments in non-invasive and high-throughput data collection, as well as advancements in modeling and computational power will be crucial to improve digital twin systems. We discuss a few critical developments in light of the current state of digital twin technology. Despite challenges, digital twin research holds great promise for personalized patient care and has the potential to shape the future of healthcare innovation.

Artificial intelligence to enhance clinical value across the spectrum of cardiovascular healthcare.

Gill, Simrat K; Karwath, Andreas; Uh, Hae-Won; Cardoso, Victor Roth; Gu, Zhujie; Barsky, Andrey; Slater, Luke; Acharjee, Animesh; Duan, Jinming; Dall'Olio, Lorenzo; El Bouhaddani, Said; Chernbumroong, Saisakul; Stanbury, Mary; Haynes, Sandra; Asselbergs, Folkert W; Grobbee, Diederick E; Eijkemans, Marinus J C; Gkoutos, Georgios V; Kotecha, Dipak.

Eur Heart J ; 44(9): 713-725, 2023 03 01.

Article in English | MEDLINE | ID: mdl-36629285

ABSTRACT

Artificial intelligence (AI) is increasingly being utilized in healthcare. This article provides clinicians and researchers with a step-wise foundation for high-value AI that can be applied to a variety of different data modalities. The aim is to improve the transparency and application of AI methods, with the potential to benefit patients in routine cardiovascular care. Following a clear research hypothesis, an AI-based workflow begins with data selection and pre-processing prior to analysis, with the type of data (structured, semi-structured, or unstructured) determining what type of pre-processing steps and machine-learning algorithms are required. Algorithmic and data validation should be performed to ensure the robustness of the chosen methodology, followed by an objective evaluation of performance. Seven case studies are provided to highlight the wide variety of data modalities and clinical questions that can benefit from modern AI techniques, with a focus on applying them to cardiovascular disease management. Despite the growing use of AI, further education for healthcare workers, researchers, and the public is needed to aid understanding of how AI works and to close the existing gap in knowledge. In addition, issues regarding data access, sharing, and security must be addressed to ensure full engagement by patients and the public. The application of AI within healthcare provides an opportunity for clinicians to deliver a more personalized approach to medical care by accounting for confounders, interactions, and the rising prevalence of multi-morbidity.

Subject(s)

Artificial Intelligence , Cardiovascular System , Humans , Algorithms , Machine Learning , Delivery of Health Care

The risk profile of patients with COVID-19 as predictors of lung lesions severity and mortality-Development and validation of a prediction model.

Rahimi, Ezat; Shahisavandi, Mina; Royo, Albert Cid; Azizi, Mohammad; El Bouhaddani, Said; Sigari, Naseh; Sturkenboom, Miriam; Ahmadizar, Fariba.

Front Microbiol ; 13: 893750, 2022.

Article in English | MEDLINE | ID: mdl-35958125

ABSTRACT

Objective: We developed and validated a prediction model based on individuals' risk profiles to predict the severity of lung involvement and death in patients hospitalized with coronavirus disease 2019 (COVID-19) infection. Methods: In this retrospective study, we studied hospitalized COVID-19 patients with data on chest CT scans performed during hospital stay (February 2020-April 2021) in a training dataset (TD) (n = 2,251) and an external validation dataset (eVD) (n = 993). We used the most relevant demographical, clinical, and laboratory variables (n = 25) as potential predictors of COVID-19-related outcomes. The primary and secondary endpoints were the severity of lung involvement quantified as mild (≤25%), moderate (26-50%), severe (>50%), and in-hospital death, respectively. We applied random forest (RF) classifier, a machine learning technique, and multivariable logistic regression analysis to study our objectives. Results: In the TD and the eVD, respectively, the mean [standard deviation (SD)] age was 57.9 (18.0) and 52.4 (17.6) years; patients with severe lung involvement [n (%):185 (8.2) and 116 (11.7)] were significantly older [mean (SD) age: 64.2 (16.9), and 56.2 (18.9)] than the other two groups (mild and moderate). The mortality rate was higher in patients with severe (64.9 and 38.8%) compared to moderate (5.5 and 12.4%) and mild (2.3 and 7.1%) lung involvement. The RF analysis showed age, C reactive protein (CRP) levels, and duration of hospitalizations as the three most important predictors of lung involvement severity at the time of the first CT examination. Multivariable logistic regression analysis showed a significant strong association between the extent of the severity of lung involvement (continuous variable) and death; adjusted odds ratio (OR): 9.3; 95% CI: 7.1-12.1 in the TD and 2.6 (1.8-3.5) in the eVD. Conclusion: In hospitalized patients with COVID-19, the severity of lung involvement is a strong predictor of death. 
Age, CRP levels, and duration of hospitalizations are the most important predictors of severe lung involvement. A simple prediction model based on available clinical and imaging data provides a validated tool that predicts the severity of lung involvement and death probability among hospitalized patients with COVID-19.
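
The adjusted odds ratios quoted in this abstract follow the usual logistic-regression convention, OR = exp(beta), with a Wald interval exp(beta ± z·SE). As a rough sketch only (not the authors' code), the coefficient and standard error implied by a published OR and its interval can be backed out as below. The function names are illustrative, and the interval is assumed to be a symmetric 95% Wald interval on the log-odds scale, which the abstract does not state.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_odds_from_ci(odds_ratio, ci_low, ci_high, z=Z_95):
    """Back out (beta, se) from a reported OR and its CI.

    Assumption: the CI was computed as exp(beta +/- z * se).
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z)
    return beta, se


def odds_ratio_with_ci(beta, se, z=Z_95):
    """Convert a log-odds coefficient and its SE to (OR, low, high)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Values reported in the abstract for the training dataset (TD).
beta_td, se_td = log_odds_from_ci(9.3, 7.1, 12.1)
or_td, low_td, high_td = odds_ratio_with_ci(beta_td, se_td)
```

The reconstructed interval matches the published one only approximately, because the rounded OR does not sit exactly at the midpoint of the rounded limits on the log scale.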

Statistical integration of two omics datasets using GO2PLS.

Gu, Zhujie; El Bouhaddani, Said; Pei, Jiayi; Houwing-Duistermaat, Jeanine; Uh, Hae-Won.

BMC Bioinformatics ; 22(1): 131, 2021 Mar 18.

Article in English | MEDLINE | ID: mdl-33736604

ABSTRACT

BACKGROUND: Nowadays, multiple omics data are measured on the same samples in the belief that these different omics datasets represent various aspects of the underlying biological systems. Integrating these omics datasets will facilitate the understanding of the systems. For this purpose, various methods have been proposed, such as Partial Least Squares (PLS), decomposing two datasets into joint and residual subspaces. Since omics data are heterogeneous, the joint components in PLS will contain variation specific to each dataset. To account for this, Two-way Orthogonal Partial Least Squares (O2PLS) captures the heterogeneity by introducing orthogonal subspaces and better estimates the joint subspaces. However, the latent components spanning the joint subspaces in O2PLS are linear combinations of all variables, while it might be of interest to identify a small subset relevant to the research question. To obtain sparsity, we extend O2PLS to Group Sparse O2PLS (GO2PLS) that utilizes biological information on group structures among variables and performs group selection in the joint subspace. RESULTS: The simulation study showed that introducing sparsity improved the feature selection performance. Furthermore, incorporating group structures increased robustness of the feature selection procedure. GO2PLS performed optimally in terms of accuracy of joint score estimation, joint loading estimation, and feature selection. We applied GO2PLS to datasets from two studies: TwinsUK (a population study) and CVON-DOSIS (a small case-control study). In the first, we incorporated biological information on the group structures of the methylation CpG sites when integrating the methylation dataset with the IgG glycomics data. The targeted genes of the selected methylation groups turned out to be relevant to the immune system, in which the IgG glycans play important roles. 
In the second, we selected regulatory regions and transcripts that explained the covariance between regulomics and transcriptomics data. The corresponding genes of the selected features appeared to be relevant to heart muscle disease. CONCLUSIONS: GO2PLS integrates two omics datasets to help understand the underlying system that involves both omics levels. It incorporates external group information and performs group selection, resulting in a small subset of features that best explain the relationship between two omics datasets for better interpretability.
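
To make the idea of a joint subspace concrete, the sketch below computes the first pair of plain PLS weight vectors for two datasets by power iteration on the cross-product matrix X^T Y. This is ordinary PLS, not O2PLS or GO2PLS: it has no orthogonal (dataset-specific) parts and no sparsity or group penalty. The toy data and all names are hypothetical.

```python
import math


def center_columns(rows):
    """Subtract each column mean from a row-major matrix."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]


def first_pls_component(x_rows, y_rows, n_iter=100):
    """Leading pair of PLS weight vectors via power iteration on X^T Y."""
    x = center_columns(x_rows)
    y = center_columns(y_rows)
    p, q = len(x[0]), len(y[0])
    # Cross-product matrix C = X^T Y (p x q).
    cross = [[sum(xr[j] * yr[k] for xr, yr in zip(x, y)) for k in range(q)]
             for j in range(p)]
    c = normalize([1.0] * q)
    for _ in range(n_iter):
        w = normalize([sum(cross[j][k] * c[k] for k in range(q)) for j in range(p)])
        c = normalize([sum(cross[j][k] * w[j] for j in range(p)) for k in range(q)])
    t = [sum(xj * wj for xj, wj in zip(row, w)) for row in x]  # joint X scores
    u = [sum(yk * ck for yk, ck in zip(row, c)) for row in y]  # joint Y scores
    return w, c, t, u


# Toy data (hypothetical): X column 0 and Y column 1 share one latent signal.
latent = [-2.5, -1.5, -0.5, 0.5, 1.5, 2.5]
x_noise = [1.0, -2.0, 1.0, 1.0, -2.0, 1.0]
y_noise = [1.0, -1.0, 0.0, 0.0, -1.0, 1.0]
X = [[z, a] for z, a in zip(latent, x_noise)]
Y = [[b, 2.0 * z] for z, b in zip(latent, y_noise)]

w, c, t, u = first_pls_component(X, Y)
```

In this toy case the weights concentrate on the two columns that carry the shared signal. Every variable still receives a weight, which is the motivation the abstract gives for adding sparsity.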

Subject(s)

Computational Biology , Genomics , Case-Control Studies , Least-Squares Analysis

Investigating the impact of Down syndrome on methylation and glycomics with two-stage PO2PLS.

Gu, Zhujie; El Bouhaddani, Said; Houwing-Duistermaat, Jeanine; Uh, Hae-Won.

Theor Biol Forum ; 114(1-2): 29-44, 2021 Jan 01.

Article in English | MEDLINE | ID: mdl-35502729

ABSTRACT

Down syndrome (DS) is a condition that leads to precocious and accelerated aging in affected subjects. Several alterations in DS cases have been reported at a molecular level, particularly in methylation and glycosylation. Investigating the relation between methylation, glycomics and DS can lead to new insights underlying the atypical aging. We consider a data integration approach, where we investigate how DS affects the parts of glycomics and methylation which are correlated, and which CpG sites and glycans are relevant. Our motivating datasets consist of methylation and glycomics data, measured on 29 DS patients and their unaffected siblings and mothers. The family-based case-control design needs to be taken into account when studying the relationship between methylation, glycomics and DS. We propose a two-stage approach to first integrate methylation and glycomics data, and then link the joint information to Down syndrome. For the data integration step, we consider probabilistic two-way orthogonal partial least squares (PO2PLS). PO2PLS models two omics datasets in terms of low-dimensional joint and omic-specific latent components, and takes into account heterogeneity across the omics data. The relationship between the omics data can be statistically tested. The joint components represent the joint information in methylation and glycomics. In the second stage, we apply a linear mixed model to the relationship between DS and the joint methylation and glycomics components. For the components that are significantly associated with DS, we identify the most important CpG sites and glycans. A simulation study is conducted to evaluate the performance of our approach. The results showed that the effects of DS on the omics data can be detected in a large sample size, and the accuracy of the feature selection was high in both small and large sample sizes. Our approach is applied to the DS datasets, and a significant effect of DS on the joint components is found.
The identified CpG sites and glycans appeared to be related to DS. Our proposed method that jointly analyzes multiple omics data with an outcome variable may provide new insight into the molecular implications of DS at different omics levels.

Subject(s)

Down Syndrome , Glycomics , DNA Methylation , Down Syndrome/genetics , Female , Glycomics/methods , Humans , Polysaccharides , Protein Processing, Post-Translational

Estimation of the effect of surrogate multi-omic biomarkers.

Fuady, Angga M; El Bouhaddani, Said; Uh, Hae-Won; Houwing-Duistermaat, Jeanine.

Theor Biol Forum ; 114(1-2): 59-73, 2021 Jan 01.

Article in English | MEDLINE | ID: mdl-35502731

ABSTRACT

Multiple technologies exist that measure the same omics data set but are based on different aspects of the molecules. In practice, studies use different technologies and therefore have different biomarkers. An example is the glycan age index, which is constructed from three different ultra-performance liquid chromatography (UPLC) IgG glycans, and is a biomarker for biological age. A second technology is liquid chromatography-mass spectrometry (LCMS). To estimate the effect of a biomarker on an outcome variable, two issues need to be addressed. Firstly, a measurement error model is needed to map one technology to the other one using a calibration study. Here, we consider two approaches, namely one based on the chemical properties of the two technologies and one based on the estimation of this relationship using O2PLS. Secondly, the use of an approximation of the biomarker in the main study needs to be taken into account by use of a regression calibration method. The performance of the two approaches is studied via simulations. The methods are used to estimate the relationship between glycan age and menopause. We have data from two cohorts, namely Korcula and Vis. In conclusion, (1) both measurement error models give similar results and suggest that there is an association between the glycan age index and the menopause status, (2) the chemical mapping approach outperforms O2PLS when the measurement error variance is low, while O2PLS works better when the measurement error variance is larger, (3) statistical efficiency is lost due to increased noise level by adding irrelevant information.
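
The two-step logic described in this abstract can be illustrated with a minimal regression calibration sketch. It makes strong simplifying assumptions that are not taken from the paper: a single biomarker, a linear mapping between technologies fitted by ordinary least squares, and a continuous outcome (the paper's outcome, menopause status, is binary). All data and names are hypothetical, and the sketch does not correct the standard errors for the uncertainty of the calibration step.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


# Step 1: calibration study, where both technologies are measured.
lcms_cal = [1.0, 2.0, 3.0, 4.0, 5.0]
uplc_cal = [2.6, 4.3, 6.7, 8.3, 10.6]  # roughly 0.5 + 2 * LCMS plus small error
cal_intercept, cal_slope = fit_line(lcms_cal, uplc_cal)

# Step 2: main study, where only LCMS and the outcome are available.
lcms_main = [1.5, 2.5, 3.5, 4.5, 5.5, 6.5]
outcome = [11.5, 17.5, 23.5, 29.5, 35.5, 41.5]

# Replace the unobserved UPLC biomarker by its calibrated prediction.
uplc_hat = [cal_intercept + cal_slope * v for v in lcms_main]
_, effect = fit_line(uplc_hat, outcome)

# For comparison: the naive slope on the LCMS scale answers a different question.
_, naive_effect = fit_line(lcms_main, outcome)
```

Here `effect` is expressed per unit of the UPLC-scale biomarker, whereas `naive_effect` is per unit of the LCMS measurement. The calibration step is what puts the estimate on the scale of the biomarker of interest.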

Subject(s)

Polysaccharides , Biomarkers , Calibration , Female , Humans , Mass Spectrometry/methods , Regression Analysis

Human Plasma N-glycosylation as Analyzed by Matrix-Assisted Laser Desorption/Ionization-Fourier Transform Ion Cyclotron Resonance-MS Associates with Markers of Inflammation and Metabolic Health.

Reiding, Karli R; Ruhaak, L Renee; Uh, Hae-Won; El Bouhaddani, Said; van den Akker, Erik B; Plomp, Rosina; McDonnell, Liam A; Houwing-Duistermaat, Jeanine J; Slagboom, P Eline; Beekman, Marian; Wuhrer, Manfred.

Mol Cell Proteomics ; 16(2): 228-242, 2017 02.

Article in English | MEDLINE | ID: mdl-27932526

ABSTRACT

Glycosylation is an abundant co- and post-translational protein modification of importance to protein processing and activity. Although not template-defined, glycosylation does reflect the biological state of an organism and is a high-potential biomarker for disease and patient stratification. However, to interpret a complex but informative sample like the total plasma N-glycome, it is important to establish its baseline association with plasma protein levels and systemic processes. Thus far, large-scale studies (n > 200) of the total plasma N-glycome have been performed with methods of chromatographic and electrophoretic separation, which, although being informative, are limited in resolving the structural complexity of plasma N-glycans. MS has the opportunity to contribute additional information on, among others, antennarity, sialylation, and the identity of high-mannose type species. Here, we have used matrix-assisted laser desorption/ionization (MALDI)-Fourier transform ion cyclotron resonance (FTICR)-MS to study the total plasma N-glycome of 2144 healthy middle-aged individuals from the Leiden Longevity Study, to allow association analysis with markers of metabolic health and inflammation. To achieve this, N-glycans were enzymatically released from their protein backbones, labeled at the reducing end with 2-aminobenzoic acid, and following purification analyzed by negative ion mode intermediate pressure MALDI-FTICR-MS. In doing so, we achieved the relative quantification of 61 glycan compositions, ranging from Hex4HexNAc2 to Hex7HexNAc6dHex1Neu5Ac4, as well as that of 39 glycosylation traits derived thereof. Next to confirming known associations of glycosylation with age and sex by MALDI-FTICR-MS, we report novel associations with C-reactive protein (CRP), interleukin 6 (IL-6), body mass index (BMI), leptin, adiponectin, HDL cholesterol, triglycerides (TG), insulin, gamma-glutamyl transferase (GGT), alanine aminotransferase (ALT), and smoking.
Overall, the bisection, galactosylation, and sialylation of diantennary species, the sialylation of tetraantennary species, and the size of high-mannose species proved to be important plasma characteristics associated with inflammation and metabolic health.

Subject(s)

Biomarkers/blood , Inflammation/metabolism , Proteomics/instrumentation , Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization/instrumentation , Aged , Body Mass Index , C-Reactive Protein/metabolism , Cyclotrons , Fourier Analysis , Glycosylation , Humans , Male , Middle Aged
