Deep5hmC: predicting genome-wide 5-hydroxymethylcytosine landscape via a multimodal deep learning model.
Ma, Xin; Thela, Sai Ritesh; Zhao, Fengdi; Yao, Bing; Wen, Zhexing; Jin, Peng; Zhao, Jinying; Chen, Li.
Affiliation
  • Ma X; Department of Biostatistics, University of Florida, Gainesville, FL 32603, United States.
  • Thela SR; Department of Biostatistics, University of Florida, Gainesville, FL 32603, United States.
  • Zhao F; Department of Biostatistics, University of Florida, Gainesville, FL 32603, United States.
  • Yao B; Department of Human Genetics, Emory University School of Medicine, Atlanta, GA 30322, United States.
  • Wen Z; Department of Psychiatry and Behavioral Sciences, Emory University School of Medicine, Atlanta, GA 30322, United States.
  • Jin P; Department of Human Genetics, Emory University School of Medicine, Atlanta, GA 30322, United States.
  • Zhao J; Department of Epidemiology, University of Florida, Gainesville, FL 32603, United States.
  • Chen L; Department of Biostatistics, University of Florida, Gainesville, FL 32603, United States.
Bioinformatics; 40(9), 2024 09 02.
Article in En | MEDLINE | ID: mdl-39196755
ABSTRACT
MOTIVATION:

5-Hydroxymethylcytosine (5hmC), a crucial epigenetic mark with a significant role in regulating tissue-specific gene expression, is essential for understanding the dynamic functions of the human genome. Despite its importance, predicting 5hmC modification across the genome remains a challenging task, especially when considering the complex interplay between DNA sequences and various epigenetic factors such as histone modifications and chromatin accessibility.

RESULTS:

Using tissue-specific 5hmC sequencing data, we introduce Deep5hmC, a multimodal deep learning framework that integrates both the DNA sequence and epigenetic features such as histone modification and chromatin accessibility to predict genome-wide 5hmC modification. The multimodal design of Deep5hmC demonstrates remarkable improvement in predicting both qualitative and quantitative 5hmC modification compared to unimodal versions of Deep5hmC and state-of-the-art machine learning methods. This improvement is demonstrated through benchmarking on a comprehensive set of 5hmC sequencing data collected at four developmental stages during forebrain organoid development and across 17 human tissues. Compared to DeepSEA and random forest, Deep5hmC achieves close to 4% and 17% improvement in Area Under the Receiver Operating Characteristic curve (AUROC) across four forebrain developmental stages, and 6% and 27% across 17 human tissues, for predicting binary 5hmC modification sites. For predicting continuous 5hmC modification, it achieves 8% and 22% improvement in Spearman correlation coefficient across four forebrain developmental stages, and 17% and 30% across 17 human tissues. Notably, Deep5hmC showcases its practical utility by accurately predicting gene expression and identifying differentially hydroxymethylated regions (DhMRs) in a case-control study of Alzheimer's disease (AD). Deep5hmC significantly improves our understanding of tissue-specific gene regulation and facilitates the development of new biomarkers for complex diseases.

AVAILABILITY AND IMPLEMENTATION:

Deep5hmC is available via https://github.com/lichen-lab/Deep5hmC.
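The two evaluation metrics named in the results, AUROC for binary 5hmC sites and the Spearman correlation coefficient for continuous 5hmC levels, can be illustrated with a minimal, dependency-free sketch. This is not the authors' evaluation code; the function names (`average_ranks`, `auroc`, `spearman`) are hypothetical and chosen only for illustration, and any inputs passed to them here are toy values, not data from the study.

```python
from typing import Sequence


def average_ranks(values: Sequence[float]) -> list[float]:
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the window over a run of tied values.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC computed from the Mann-Whitney U statistic (labels are 0 or 1)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative label")
    ranks = average_ranks(scores)
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    u_stat = pos_rank_sum - n_pos * (n_pos + 1) / 2
    return u_stat / (n_pos * n_neg)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: the Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("Spearman correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5
```

In this sketch, `auroc` would be applied to binary labels (5hmC site or not) with predicted scores, and `spearman` to observed versus predicted continuous 5hmC levels. Both reduce to operations on ranks, which is why they are insensitive to monotone rescaling of the model's output.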

Full text: 1 Collection: 01-internacional Database: MEDLINE Main subject: 5-Methylcytosine / Deep Learning Limits: Humans Language: En Journal: Bioinformatics Journal subject: Medical Informatics Year: 2024 Document type: Article Affiliation country: United States Country of publication: United Kingdom
