Search | VHL Regional Portal

An interactive atlas of genomic, proteomic, and metabolomic biomarkers promotes the potential of proteins to predict complex diseases.

Smelik, Martin; Zhao, Yelin; Li, Xinxiu; Loscalzo, Joseph; Sysoev, Oleg; Mahmud, Firoj; Mansour Aly, Dina; Benson, Mikael.

Sci Rep ; 14(1): 12710, 2024 Jun 03.

Article in English | MEDLINE | ID: mdl-38830935

ABSTRACT

Multiomics analyses have identified multiple potential biomarkers of the incidence and prevalence of complex diseases. However, it is not known which type of biomarker is optimal for clinical purposes. Here, we make a systematic comparison of 90 million genetic variants, 1,453 proteins, and 325 metabolites from 500,000 individuals with complex diseases from the UK Biobank. A machine learning pipeline consisting of data cleaning, data imputation, feature selection, and model training with cross-validation, followed by comparison of the results on holdout test sets, showed that proteins were the most predictive, followed by metabolites and then genetic variants. Using only five proteins per disease resulted in median (min-max) areas under the receiver operating characteristic curve of 0.79 (0.65-0.86) for incidence and 0.84 (0.70-0.91) for prevalence. In summary, our work suggests the potential of predicting complex diseases based on a limited number of proteins. We provide an interactive atlas (macd.shinyapps.io/ShinyApp/) to find genomic, proteomic, or metabolomic biomarkers for different complex diseases.
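The "median (min-max)" summary of areas under the ROC curve reported in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the disease names, labels, and scores below are made-up toy values, and the helper names `roc_auc` and `summarize` are hypothetical. The AUC is computed here as the fraction of case-control pairs in which the case receives the higher score, with ties counted as one half.

```python
from statistics import median


def roc_auc(labels, scores):
    """AUC as the share of (case, control) pairs where the case scores higher.

    Ties between a case and a control count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def summarize(aucs):
    """Return (median, min, max) over a dict of per-disease AUCs."""
    values = list(aucs.values())
    return median(values), min(values), max(values)


# Toy holdout predictions for three hypothetical diseases (illustrative only,
# not data from the study): (true labels, predicted risk scores).
toy = {
    "disease_a": ([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]),
    "disease_b": ([1, 0, 1, 0], [0.8, 0.2, 0.7, 0.3]),
    "disease_c": ([1, 0, 0, 0], [0.6, 0.6, 0.2, 0.1]),
}

per_disease = {name: roc_auc(y, s) for name, (y, s) in toy.items()}
med, lo, hi = summarize(per_disease)
print(f"{med:.2f} ({lo:.2f}-{hi:.2f})")
```

The pairwise definition is quadratic in the number of samples, which is fine for a toy example; at UK Biobank scale a rank-based implementation from an established library would be the more practical choice.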

Subject(s)

Biomarkers, Genomics, Metabolomics, Proteomics, Humans, Biomarkers/metabolism, Proteomics/methods, Metabolomics/methods, Genomics/methods, Machine Learning

An interactive atlas of genomic, proteomic, and metabolomic biomarkers promotes the potential of proteins to predict complex diseases.

Benson, Mikael; Smelik, Martin; Li, Xinxiu; Loscalzo, Joseph; Sysoev, Oleg; Mahmud, Firoj; Aly, Dina Mansour; Zhao, Yelin.

Res Sq ; 2024 Mar 05.

Article in English | MEDLINE | ID: mdl-38496611

