Search | VHL Regional Portal

1.

iDNA-OpenPrompt: OpenPrompt learning model for identifying DNA methylation.

Yu, Xia; Ren, Jia; Long, Haixia; Zeng, Rao; Zhang, Guoqiang; Bilal, Anas; Cui, Yani.

Front Genet ; 15: 1377285, 2024.

Article in English | MEDLINE | ID: mdl-38689652

ABSTRACT

Introduction: DNA methylation is a critical epigenetic modification involving the addition of a methyl group to the DNA molecule, playing a key role in regulating gene expression without changing the DNA sequence. The main difficulty in identifying DNA methylation sites lies in the subtle and complex nature of methylation patterns, which may vary across different tissues, developmental stages, and environmental conditions. Traditional methods for methylation site identification, such as bisulfite sequencing, are typically labor-intensive and costly and require large amounts of DNA, hindering high-throughput analysis. Moreover, these methods may not always provide the resolution needed to detect methylation at specific sites, especially in genomic regions that are rich in repetitive sequences or have low levels of methylation. Furthermore, current deep learning approaches generally lack sufficient accuracy. Methods: This study introduces the iDNA-OpenPrompt model, which leverages the novel OpenPrompt learning framework. The model combines a prompt template, a prompt verbalizer, and a pre-trained language model (PLM) to construct a prompt-learning framework for DNA methylation sequences. A DNA vocabulary library, a BERT tokenizer, and specific label words are also introduced into the model to enable accurate identification of DNA methylation sites. Results and Discussion: An extensive analysis is conducted to evaluate the predictive capability, reliability, and consistency of the iDNA-OpenPrompt model. The experimental outcomes, covering 17 benchmark datasets that include various species and three DNA methylation modifications (4mC, 5hmC, 6mA), consistently indicate that our model surpasses existing approaches in performance and robustness.
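As a rough illustration of how a prompt template, a verbalizer, and a DNA vocabulary fit together, the sketch below tokenizes a sequence into overlapping k-mers, fills a template with a [MASK] slot, and maps label-word scores to a class. It does not use the OpenPrompt library or a real PLM. The k-mer size, the template wording, the label words, and the function names are all assumptions made for illustration; the abstract does not specify them.

```python
from typing import Dict, List


def kmer_tokens(seq: str, k: int = 3) -> List[str]:
    """Split a DNA sequence into overlapping k-mers (k=3 is an assumed value)."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


# Hypothetical template and label words, not taken from the paper.
TEMPLATE = "{sequence} This site is [MASK]."
VERBALIZER: Dict[str, int] = {"methylated": 1, "unmethylated": 0}


def build_prompt(seq: str, k: int = 3) -> str:
    """Wrap the tokenized sequence in the prompt template."""
    return TEMPLATE.format(sequence=" ".join(kmer_tokens(seq, k)))


def verbalize(mask_scores: Dict[str, float]) -> int:
    """Map scores over label words at the [MASK] slot to a class id.

    In a real pipeline these scores would come from the PLM; here they are
    passed in directly.
    """
    best_word = max(VERBALIZER, key=lambda w: mask_scores.get(w, float("-inf")))
    return VERBALIZER[best_word]
```

In a full system, the prompt string would be fed to the PLM and `verbalize` would receive the model's scores for each label word at the masked position.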

2.

DRSN4mCPred: accurately predicting sites of DNA N4-methylcytosine using deep residual shrinkage network for diagnosis and treatment of gastrointestinal cancer in the precision medicine era.

Yu, Xia; Ren, Jia; Cui, Yani; Zeng, Rao; Long, Haixia; Ma, Cuihua.

Front Med (Lausanne) ; 10: 1187430, 2023.

Article in English | MEDLINE | ID: mdl-37215722

ABSTRACT

Introduction: Levels of DNA N4-methylcytosine (4mC) sites were higher in patients with digestive system cancers, and the pathogenesis of these cancers may be related to changes in DNA 4mC levels. Identifying DNA 4mC sites is therefore an important step in analyzing biological function and predicting cancer. Extracting accurate features from DNA sequences is the key to building an effective prediction model for DNA 4mC sites. This study sought to develop a new predictive model, DRSN4mCPred, to improve the performance of DNA 4mC site prediction. Methods: The model adopted multi-scale channel attention to extract features and used attention feature fusion (AFF) to fuse them. To capture feature information more accurately and effectively, the model used a Deep Residual Shrinkage Network with Channel-Wise thresholds (DRSN-CW) to eliminate noise-related features and achieve a more precise feature representation, thereby distinguishing 4mC sites from non-4mC sites in DNA. Overall, the predictive model incorporated an inverted residual block, a Multi-scale Channel Attention Module (MS-CAM), a Bi-directional Long Short-Term Memory network (Bi-LSTM), AFF, and DRSN-CW. Results and Discussion: The results indicated that DRSN4mCPred performed very well in predicting DNA 4mC sites across different species. This work may support artificial-intelligence-based diagnosis and treatment of gastrointestinal cancer in the precision medicine era.
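The noise-removal step in a residual shrinkage network is built on soft thresholding, which sets small-magnitude activations to zero and shrinks the rest toward zero. The sketch below shows that operation with one threshold per channel. In DRSN-CW the thresholds are learned by a small sub-network; here they are fixed inputs, and the function names are hypothetical.

```python
from typing import List


def soft_threshold(x: float, tau: float) -> float:
    """Shrink x toward zero by tau; values with |x| <= tau become 0."""
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0


def channel_wise_shrink(features: List[List[float]], taus: List[float]) -> List[List[float]]:
    """Apply soft thresholding with a separate threshold for each channel.

    `features` is a list of channels, each a list of activations.
    `taus` holds one non-negative threshold per channel (fixed here, learned
    in an actual DRSN-CW block).
    """
    if len(features) != len(taus):
        raise ValueError("need exactly one threshold per channel")
    return [[soft_threshold(v, tau) for v in channel]
            for channel, tau in zip(features, taus)]
```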

3.

IGWO-IVNet3: DL-Based Automatic Diagnosis of Lung Nodules Using an Improved Gray Wolf Optimization and InceptionNet-V3.

Bilal, Anas; Shafiq, Muhammad; Fang, Fang; Waqar, Muhammad; Ullah, Inam; Ghadi, Yazeed Yasin; Long, Haixia; Zeng, Rao.

Sensors (Basel) ; 22(24), 2022 Dec 07.

Article in English | MEDLINE | ID: mdl-36559970

ABSTRACT

Artificial intelligence plays an essential role in diagnosing lung cancer. Lung cancer is notoriously difficult to diagnose until it has progressed to a late stage, making it a leading cause of cancer-related mortality. Lung cancer is fatal if not treated early, making this a significant issue. Initial diagnosis of malignant nodules is often made using chest radiography (X-ray) and computed tomography (CT) scans; nevertheless, the possibility of benign nodules leads to wrong choices. In their early stages, benign and malignant nodules look very similar. Additionally, radiologists have a hard time viewing and categorizing lung abnormalities. Radiologists often perform lung cancer screenings with the aid of computer-aided diagnostic technologies. Computer scientists have presented many methods for identifying lung cancer in recent years. Low-quality images compromise the segmentation process, rendering traditional lung cancer prediction algorithms inaccurate. This article suggests a highly effective strategy for identifying and categorizing lung cancer. Noise in the images was reduced using a weighted filter, and the improved Gray Wolf Optimization method was applied before segmentation with watershed modification and dilation operations. We used InceptionNet-V3 to classify lung cancer into three groups, and it performed well compared to prior studies: 98.96% accuracy, 94.74% specificity, and 100% sensitivity.

Subject(s)

Lung Neoplasms , Solitary Pulmonary Nodule , Humans , Artificial Intelligence , Solitary Pulmonary Nodule/diagnostic imaging , Tomography, X-Ray Computed/methods , Lung Neoplasms/diagnostic imaging , Lung Neoplasms/pathology , Algorithms , Diagnosis, Computer-Assisted/methods , Lung/pathology , Radiographic Image Interpretation, Computer-Assisted/methods , Sensitivity and Specificity
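For readers unfamiliar with the metrics quoted in the abstract above, the sketch below computes accuracy, sensitivity, and specificity from confusion-matrix counts. It covers the binary case only; the study classifies into three groups, and how it aggregates per-class values is not stated in the abstract. The function name and the example counts are illustrative and assume non-zero denominators.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute accuracy, sensitivity, and specificity from binary counts.

    tp/fn count positive cases predicted correctly/incorrectly;
    tn/fp count negative cases predicted correctly/incorrectly.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # true positive rate (recall)
        "specificity": tn / (tn + fp),   # true negative rate
    }
```

A sensitivity of 100% corresponds to `fn == 0`, meaning no positive case was missed.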

4.

Single-cell RNA-seq data analysis using graph autoencoders and graph attention networks.

Feng, Xiang; Fang, Fang; Long, Haixia; Zeng, Rao; Yao, Yuhua.

Front Genet ; 13: 1003711, 2022.

Article in English | MEDLINE | ID: mdl-36568390

ABSTRACT

With the development of high-throughput sequencing technology, the scale of single-cell RNA sequencing (scRNA-seq) data has surged. These data are typically high-dimensional, with high dropout noise and high sparsity. Therefore, gene imputation and cell clustering analysis of scRNA-seq data are increasingly important. Statistical or traditional machine learning methods are inefficient, and improved accuracy is needed. Methods based on deep learning cannot directly process non-Euclidean spatial data, such as cell graphs. In this study, we developed scGAEGAT, a multi-modal model with graph autoencoders and graph attention networks for scRNA-seq analysis based on graph neural networks. Cosine similarity, median L1 distance, and root-mean-squared error were used to measure the gene imputation performance of different methods for comparison with scGAEGAT. Furthermore, adjusted mutual information, normalized mutual information, completeness score, and Silhouette coefficient score were used to measure the cell clustering performance of different methods for comparison with scGAEGAT. Experimental results demonstrated promising performance of the scGAEGAT model in gene imputation and cell clustering prediction on four scRNA-seq data sets with gold-standard cell labels.
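The three imputation metrics named above can be sketched as follows for a pair of equal-length expression vectors (reference values and imputed values). This is a minimal illustration using the textbook definitions; the exact evaluation protocol of the paper (for example, which entries are masked and compared) is not given in the abstract, and the function names are assumptions.

```python
import math
from statistics import median
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def median_l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Median of the element-wise absolute differences."""
    return median(abs(x - y) for x, y in zip(a, b))


def rmse(a: Sequence[float], b: Sequence[float]) -> float:
    """Root-mean-squared error between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))
```

Higher cosine similarity and lower median L1 distance and RMSE indicate imputed values closer to the reference.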

5.

iDNA-ABT: advanced deep learning model for detecting DNA methylation with adaptive features and transductive information maximization.

Yu, Yingying; He, Wenjia; Jin, Junru; Xiao, Guobao; Cui, Lizhen; Zeng, Rao; Wei, Leyi.

Bioinformatics ; 37(24): 4603-4610, 2021 12 11.

Article in English | MEDLINE | ID: mdl-34601568

ABSTRACT

MOTIVATION: DNA methylation plays an important role in epigenetic modification and in the occurrence and development of diseases. Therefore, identification of DNA methylation sites is critical for better understanding and revealing their functional mechanisms. To date, several machine learning and deep learning methods have been developed for the prediction of different DNA methylation types. However, they still rely heavily on manual features, which can largely limit the extraction of high-level latent information. Moreover, most of them are designed for one specific DNA methylation type, and therefore cannot predict multiple methylation sites in multiple species simultaneously. In this study, we propose iDNA-ABT, an advanced deep learning model that utilizes adaptive embedding based on Bidirectional Encoder Representations from Transformers (BERT) together with transductive information maximization (TIM). RESULTS: Benchmark results show that our proposed iDNA-ABT can automatically and adaptively learn the distinguishing features of biological sequences from multiple species, and thus perform significantly better than the state-of-the-art methods in predicting three different DNA methylation types. In addition, TIM loss is shown to be effective in dichotomous tasks via the comparison experiment. Furthermore, we verify that our features have strong adaptability and robustness to different species through comparison of adaptive embedding and six handcrafted feature encodings. Importantly, our model shows great generalization ability in different species, demonstrating that our model can adaptively capture the cross-species differences and improve the predictive performance. For the convenient use of our method, we further established an online webserver as the implementation of the proposed iDNA-ABT.
AVAILABILITY AND IMPLEMENTATION: Our proposed iDNA-ABT and data are freely accessible via http://server.wei-group.net/iDNA_ABT and our source codes are available for downloading in the GitHub repository (https://github.com/YUYING07/iDNA_ABT). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

DNA Methylation , Deep Learning , Software , Machine Learning , Epigenesis, Genetic
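Transductive information maximization, as mentioned in the abstract above, is built around the mutual information between inputs and predicted labels, which equals the entropy of the average prediction minus the average entropy of the individual predictions. The sketch below computes only that quantity from a batch of predicted class probabilities. How iDNA-ABT weights it and combines it with a supervised loss is not stated in the abstract, so this is an assumption-laden illustration with hypothetical function names.

```python
import math
from typing import List, Sequence


def entropy(p: Sequence[float]) -> float:
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(q * math.log(q) for q in p if q > 0)


def mutual_information(probs: List[List[float]]) -> float:
    """Estimate I(X; Y) = H(mean prediction) - mean(H(prediction)).

    `probs` holds one probability vector per sample. The value is large when
    each prediction is confident and the predicted classes are balanced.
    """
    n = len(probs)
    k = len(probs[0])
    marginal = [sum(row[j] for row in probs) / n for j in range(k)]
    conditional = sum(entropy(row) for row in probs) / n
    return entropy(marginal) - conditional
```

A TIM-style objective would maximize this value on unlabeled query samples, typically alongside a cross-entropy term on labeled samples.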

6.

4mCPred-MTL: Accurate Identification of DNA 4mC Sites in Multiple Species Using Multi-Task Deep Learning Based on Multi-Head Attention Mechanism.

Zeng, Rao; Cheng, Song; Liao, Minghong.

Front Cell Dev Biol ; 9: 664669, 2021.

Article in English | MEDLINE | ID: mdl-34041243

ABSTRACT

DNA methylation is one of the most extensive epigenetic modifications. DNA 4mC modification plays a key role in regulating chromatin structure and gene expression. In this study, we proposed a generic 4mC computational predictor, namely 4mCPred-MTL, which uses multi-task learning coupled with a Transformer to predict 4mC sites in multiple species. In this predictor, we utilize a multi-task learning framework in which each task trains on species-specific data with a Transformer. Extensive experimental results show that our multi-task predictive model significantly improves on the performance of the single-task model and outperforms existing methods in benchmarking comparison. Moreover, we found that our model captures the characteristics of 4mC sites better than existing commonly used feature descriptors, demonstrating the strong feature learning ability of our model. Therefore, based on the above results, it can be expected that our 4mCPred-MTL can be a useful tool for interested research communities.
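The attention mechanism at the core of a Transformer can be sketched as scaled dot-product attention: each query is compared with every key, the scores are normalized with softmax, and the result weights the values. The code below shows a single head in plain Python. A multi-head layer would run several such heads on learned projections and concatenate the outputs; those projections and the multi-task heads of 4mCPred-MTL are omitted, and the function names are assumptions.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries: Matrix, keys: Matrix, values: Matrix) -> Matrix:
    """Single-head scaled dot-product attention.

    Each output row is a weighted average of the value rows, with weights
    softmax(q . k / sqrt(d)) over all keys.
    """
    d = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        outputs.append([sum(w * v[c] for w, v in zip(weights, values))
                        for c in range(len(values[0]))])
    return outputs
```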

7.

Developing a Multi-Layer Deep Learning Based Predictive Model to Identify DNA N4-Methylcytosine Modifications.

Zeng, Rao; Liao, Minghong.

Front Bioeng Biotechnol ; 8: 274, 2020.

Article in English | MEDLINE | ID: mdl-32373597

ABSTRACT

DNA N4-methylcytosine modification (4mC) plays an essential role in a variety of biological processes. Therefore, accurate identification of the 4mC distribution at genome scale is important for systematically understanding its biological functions. In this study, we present Deep4mcPred, a multi-layer deep learning based predictive model to identify DNA N4-methylcytosine modifications. In this predictor, we integrate a residual network and a recurrent neural network for the first time to build a multi-layer deep learning predictive system. Compared with existing predictors that use traditional machine learning, our proposed method has two advantages. First, our deep learning framework does not need the features to be specified when training the predictive model. It can automatically learn high-level features and capture the characteristic specificity of 4mC sites, which helps distinguish true 4mC sites from non-4mC sites. Second, our deep learning method outperforms the traditional machine learning predictors in benchmarking comparison, demonstrating that the proposed Deep4mcPred is more effective in DNA 4mC site prediction. Moreover, via experimental comparison, we found that the attention mechanism introduced into the deep learning framework is useful for capturing the critical features. Additionally, we developed a webserver implementing the proposed method for academic use by the research community, which is now available at http://server.malab.cn/Deep4mcPred.
