DGDTA: dynamic graph attention network for predicting drug-target binding affinity.

Zhai, Haixia; Hou, Hongli; Luo, Junwei; Liu, Xiaoyan; Wu, Zhengjiang; Wang, Junfeng.

BMC Bioinformatics ; 24(1): 367, 2023 Sep 30.

Article in English | MEDLINE | ID: mdl-37777712

ABSTRACT

BACKGROUND: Obtaining accurate drug-target binding affinity (DTA) information is important for drug discovery and drug repositioning. Although several methods have been proposed for predicting DTA, the features of proteins and drugs still need further analysis. Deep learning has recently been applied successfully in many fields, so designing a more effective deep learning method for predicting DTA remains attractive.

RESULTS: This paper proposes dynamic graph DTA (DGDTA), which combines a dynamic graph attention network with a bidirectional long short-term memory (Bi-LSTM) network to predict DTA. DGDTA takes as input the drug compound, represented by its simplified molecular input line entry system (SMILES) string, and the protein amino acid sequence. First, each drug is treated as a graph of atoms and the edges between them, and dynamic attention scores are used to determine which atoms and edges in the drug are most important for predicting DTA. Then, a Bi-LSTM extracts contextual features from the protein amino acid sequence. Finally, the drug and protein feature vectors are combined, and a fully connected layer predicts the DTA. The source code is available from GitHub at https://github.com/luojunwei/DGDTA.

CONCLUSIONS: The experimental results show that DGDTA can predict DTA more accurately than some other methods.
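The dynamic attention step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a GATv2-style score, `a . LeakyReLU(W [h_i || h_j])`, normalized with a softmax over each atom's neighbours; the abstract does not state the exact formula, so this form, the function names, and the toy features and weights are all assumptions rather than the DGDTA implementation.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def matvec(W, v):
    # Multiply matrix W (list of rows) by vector v.
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def dynamic_attention(h, neighbors, W, a):
    """Return alpha[i][j]: attention of atom i over each neighbour j.

    Assumed GATv2-style score: a . LeakyReLU(W [h_i || h_j]).
    """
    alpha = {}
    for i, nbrs in neighbors.items():
        scores = []
        for j in nbrs:
            z = matvec(W, h[i] + h[j])  # h[i] + h[j] concatenates the two lists
            scores.append(sum(ak * leaky_relu(zk) for ak, zk in zip(a, z)))
        # Numerically stable softmax over the neighbours of atom i.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        alpha[i] = {j: e / total for j, e in zip(nbrs, exps)}
    return alpha

# Hypothetical 3-atom chain (0 - 1 - 2); features and weights are made up.
h = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
neighbors = {0: [1], 1: [0, 2], 2: [1]}
W = [[0.3, -0.1, 0.2, 0.4], [-0.2, 0.5, 0.1, -0.3]]
a = [0.7, -0.4]
alpha = dynamic_attention(h, neighbors, W, a)
print(alpha[1])
```

Because the nonlinearity is applied before the dot product with `a`, the ranking of neighbours can differ from one query atom to another, which is what "dynamic" attention usually refers to.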

Subject(s)

Drug Delivery Systems, Drug Discovery, Amino Acid Sequence, Drug Repositioning, Protein Domains

Deep learning approach for cancer subtype classification using high-dimensional gene expression data.

Shen, Jiquan; Shi, Jiawei; Luo, Junwei; Zhai, Haixia; Liu, Xiaoyan; Wu, Zhengjiang; Yan, Chaokun; Luo, Huimin.

BMC Bioinformatics ; 23(1): 430, 2022 Oct 17.

Article in English | MEDLINE | ID: mdl-36253710

ABSTRACT

MOTIVATION: Studies have shown that classifying cancer subtypes can provide valuable information for a range of cancer research areas, from aetiology and tumour biology to prognosis and personalized treatment. Current methods usually use gene expression data to classify cancer subtypes. However, cancer samples are scarce, and the high-dimensional features of their gene expression data are too sparse for most methods to achieve desirable classification results.

RESULTS: In this paper, we propose DCGN, a deep learning approach that combines a convolutional neural network (CNN) and a bidirectional gated recurrent unit (BiGRU). DCGN aims to achieve nonlinear dimensionality reduction and to learn features that eliminate irrelevant factors in gene expression data. Specifically, DCGN first uses the synthetic minority oversampling technique (SMOTE) to balance the data. The CNN handles the high-dimensional data and extracts important local features, and the BiGRU analyses deep features and retains their important information. By combining both neural networks, DCGN captures key features and overcomes the challenges of small sample sizes and sparse, high-dimensional features. In the experiments, we compared DCGN with seven other cancer subtype classification methods on breast and bladder cancer gene expression datasets. The experimental results show that DCGN performs better than the other seven methods and provides more satisfactory classification results.
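As an illustration of the oversampling step named in the abstract, the following is a minimal sketch of the core idea of the synthetic minority oversampling technique: each synthetic sample is interpolated between a minority sample and one of its nearest minority neighbours. The function name, parameters, and toy data are hypothetical, and this is not the DCGN code.

```python
import random

def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic samples from a list of minority-class vectors.

    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # The k nearest minority neighbours of x (squared Euclidean distance),
        # excluding x itself.
        others = [p for p in minority if p is not x]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        nb = rng.choice(others[:k])
        # Place the new point at a random position on the segment from x to nb.
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(x, nb)])
    return synthetic

# Made-up 2-dimensional expression profiles for a minority subtype.
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2]]
new_samples = smote(minority, n_new=4)
print(new_samples)
```

Every synthetic point lies on a segment between two real minority samples, so the new data stays inside the region the minority class already occupies.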

Subject(s)

Deep Learning, Neoplasms, Algorithms, Gene Expression, Neoplasms/genetics, Neural Networks, Computer

BreakNet: detecting deletions using long reads and a deep learning approach.

Luo, Junwei; Ding, Hongyu; Shen, Jiquan; Zhai, Haixia; Wu, Zhengjiang; Yan, Chaokun; Luo, Huimin.

BMC Bioinformatics ; 22(1): 577, 2021 Dec 02.

Article in English | MEDLINE | ID: mdl-34856923

ABSTRACT

BACKGROUND: Structural variations (SVs) occupy a prominent position in human genetic diversity, and deletions are an important type of SV that has been suggested to be associated with genetic diseases. Although various deletion calling methods based on long reads have been proposed, a new approach is still needed to mine features in long-read alignment information. Recently, deep learning has attracted much attention in genome analysis, and it is a promising technique for calling SVs.

RESULTS: In this paper, we propose BreakNet, a deep learning method that detects deletions by using long reads. BreakNet first extracts feature matrices from long-read alignments. Second, it uses a time-distributed convolutional neural network (CNN) to integrate and map the feature matrices to feature vectors. Third, BreakNet employs a bidirectional long short-term memory (BLSTM) model to analyse the resulting sequence of feature vectors in both the forward and backward directions. Finally, a classification module determines whether a region contains a deletion. On real long-read sequencing datasets, we demonstrate that BreakNet outperforms Sniffles, SVIM and cuteSV in terms of F1 score. The source code for the proposed method is available from GitHub at https://github.com/luojunwei/BreakNet.

CONCLUSIONS: Our work shows that deep learning can be combined with long reads to call deletions more effectively than existing methods.
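For readers unfamiliar with the metric used in the comparison, the sketch below computes precision, recall and F1 from counts of true-positive, false-positive and false-negative deletion calls. The function name and the counts are illustrative only and are not results from the paper.

```python
def f1_score(tp, fp, fn):
    """Return (precision, recall, F1) from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    return precision, recall, 2 * precision * recall / (precision + recall)

# Made-up counts: 80 correct calls, 20 spurious calls, 20 missed deletions.
precision, recall, f1 = f1_score(tp=80, fp=20, fn=20)
print(precision, recall, f1)
```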

Subject(s)

Deep Learning, Genome, High-Throughput Nucleotide Sequencing, Humans, Sequence Analysis, DNA, Software

A comprehensive review of scaffolding methods in genome assembly.

Luo, Junwei; Wei, Yawei; Lyu, Mengna; Wu, Zhengjiang; Liu, Xiaoyan; Luo, Huimin; Yan, Chaokun.

Brief Bioinform ; 22(5), 2021 Sep 02.

Article in English | MEDLINE | ID: mdl-33634311

ABSTRACT

In the field of genome assembly, scaffolding methods make it possible to obtain a more complete and contiguous reference genome, which is the cornerstone of genomic research. Scaffolding methods typically use the alignments between contigs and sequencing data (reads) to determine the orientation and order of contigs and to produce longer scaffolds, which are helpful for downstream genomic analysis. With the rapid development of high-throughput sequencing technologies, diverse types of reads have emerged over the past decade, especially in long-range sequencing, and these have greatly enhanced the assembly quality of scaffolding methods. As the number of scaffolding methods increases, biology and bioinformatics researchers need to perform in-depth analyses of state-of-the-art scaffolding methods. In this article, we focus on the difficulties in scaffolding, the differences in characteristics among various kinds of reads, how current scaffolding methods address these difficulties, and future research opportunities. We hope this work will benefit the design of new scaffolding methods and the selection of appropriate scaffolding methods for specific biological studies.
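The ordering part of scaffolding can be illustrated with a toy greedy sketch. It assumes that read alignments have already been summarized as directed links `(left, right, support)`, and it omits orientation, gap estimation, and repeat handling. The function name and the data are hypothetical and do not correspond to any specific tool discussed in the review.

```python
def greedy_scaffold(contigs, links):
    """Chain contigs using links (left, right, support), strongest first."""
    succ, pred = {}, {}

    def head_of(c):
        # Follow predecessors back to the first contig of c's current chain.
        while c in pred:
            c = pred[c]
        return c

    for left, right, _ in sorted(links, key=lambda link: -link[2]):
        if left in succ or right in pred:
            continue  # conflicts with a stronger link that was already accepted
        if head_of(left) == right:
            continue  # accepting this link would close a cycle
        succ[left] = right
        pred[right] = left

    # Walk each chain from a contig that has no predecessor.
    scaffolds = []
    for c in contigs:
        if c not in pred:
            chain = [c]
            while chain[-1] in succ:
                chain.append(succ[chain[-1]])
            scaffolds.append(chain)
    return scaffolds

# Made-up contigs and link support counts.
contigs = ["c1", "c2", "c3", "c4"]
links = [("c1", "c2", 5), ("c2", "c3", 4), ("c1", "c3", 2), ("c3", "c1", 1)]
scaffolds = greedy_scaffold(contigs, links)
print(scaffolds)
```

Here the two weakest links are rejected (one conflicts with a stronger link, the other would form a cycle), and the unlinked contig `c4` remains a scaffold on its own.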

Subject(s)

Computational Biology/methods, Contig Mapping/methods, Genome, Software, Animals, High-Throughput Nucleotide Sequencing, Humans, Sequence Analysis, DNA
