Search | VHL Regional Portal

MiniDBG: A Novel and Minimal De Bruijn Graph for Read Mapping.

Yu, Changyong; Zhao, Yuhai; Zhao, Chu; Jin, Jianyu; Mao, Keming; Wang, Guoren.

IEEE/ACM Trans Comput Biol Bioinform ; 21(1): 129-142, 2024.

Article in English | MEDLINE | ID: mdl-38060353

ABSTRACT

The De Bruijn graph (DBG) has been widely used in algorithms for indexing or organizing read and reference sequences in bioinformatics. However, a DBG model that can locate each node, edge and path on the sequence has not been proposed so far. Recently, the DBG has been used to represent reference sequences in read mapping tasks. In this setting, the paths of the DBG and the substrings of the reference sequence are not in one-to-one correspondence. This gives rise to false paths on the DBG, that is, paths that no substring of the reference produces. Moreover, if a candidate path of a read is true, we need to locate it and verify the candidate on the sequence. To solve these problems, we proposed a DBG model, called MiniDBG, which stores the position lists of a minimal set of edges. With the position lists, MiniDBG can locate any node, edge and path efficiently. We also proposed algorithms for generating MiniDBG from an original DBG and algorithms for locating edges or paths on the sequence. We designed and ran experiments on real datasets to compare them with BWT-based and position list-based methods. The experimental results show that MiniDBG can locate the edges and paths efficiently with lower memory costs.

Subject(s)

Algorithms , Computational Biology , Sequence Analysis, DNA/methods , Computational Biology/methods , Software , High-Throughput Nucleotide Sequencing/methods
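The false-path problem described in the MiniDBG abstract above can be illustrated with a small sketch. It assumes a node-centric DBG whose edges are (k+1)-mers and stores a position list for every edge, which corresponds to the plain position-list baseline rather than MiniDBG's minimal edge set. The function names `build_edge_positions`, `path_edges` and `locate_path` are hypothetical and do not come from the paper.

```python
from collections import defaultdict


def build_edge_positions(reference: str, k: int) -> dict[str, list[int]]:
    """Map every (k+1)-mer (an edge between two k-mer nodes) to its start positions."""
    positions = defaultdict(list)
    for i in range(len(reference) - k):
        positions[reference[i:i + k + 1]].append(i)
    return dict(positions)


def path_edges(sequence: str, k: int) -> list[str]:
    """Split a candidate sequence into the consecutive edges it would traverse."""
    return [sequence[i:i + k + 1] for i in range(len(sequence) - k)]


def locate_path(edges: list[str], positions: dict[str, list[int]]) -> list[int]:
    """Return the reference positions where the whole path occurs.

    A path is true only if its edges occur at consecutive offsets;
    an empty result means the path is false (or uses an unknown edge).
    """
    if not edges:
        return []
    candidates = set(positions.get(edges[0], []))
    for offset, edge in enumerate(edges[1:], start=1):
        occurrences = set(positions.get(edge, []))
        candidates = {p for p in candidates if p + offset in occurrences}
        if not candidates:
            break
    return sorted(candidates)


reference = "ACGTTCGA"
k = 2
positions = build_edge_positions(reference, k)

# "TCGA" is a substring of the reference, so its path is located at position 4.
print(locate_path(path_edges("TCGA", k), positions))  # [4]

# "ACGA" follows valid edges (ACG, then CGA) but is not a substring: a false path.
print(locate_path(path_edges("ACGA", k), positions))  # []
```

Keeping a list for every edge is what makes this baseline memory-hungry; according to the abstract, MiniDBG reduces this cost by storing lists for only a minimal set of edges.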

A novel open-source CADs platform for 3D CT pulmonary analysis.

Mao, Keming; Jing, Xin; Wang, Gao; Chang, Yachen; Liu, Jiale; Zhao, Yuhai; Yu, Shiyu; Liu, Jingyu.

Comput Biol Med ; 169: 107878, 2024 Feb.

Article in English | MEDLINE | ID: mdl-38141446

ABSTRACT

Computer-aided diagnosis (CAD) systems play vital roles in the early detection of pulmonary nodules for reducing lung cancer mortality rates. To provide better services for professional doctors, this paper proposes an efficient open-source CAD platform with flexible equipment, user-friendly interfaces, and complete functions for 3D CT pulmonary nodule analysis. For the platform's design and implementation, we fully consider application scenarios and system requirements. The platform supplies core functions for (1) Basic Image Processing, (2) Intelligent Image Analysis, (3) Multi-View Image Visualization, (4) Report Editing and Generation, (5) User Information Management, and (6) Inference Service Monitoring. In addition, other state-of-the-art or user-defined algorithms can be integrated as plugin modules without interfering with the system architecture. System evaluation with use-case testing demonstrates the effectiveness and universality of the proposed platform.

Subject(s)

Lung Neoplasms , Solitary Pulmonary Nodule , Humans , Tomography, X-Ray Computed/methods , Lung Neoplasms/diagnostic imaging , Solitary Pulmonary Nodule/diagnostic imaging , Lung , Algorithms , Diagnosis, Computer-Assisted/methods , Radiographic Image Interpretation, Computer-Assisted/methods

Automatic quantitative measurement of left atrial pressure using mitral regurgitation spectrum: clinical study on comparison with floating catheter.

Jin, Yan; Wen, Chao-Yang; Yue, Fengjie; Wang, Huishan; Yin, Liancheng; Zhao, Yang; Mao, Keming; Xin, Fangran.

Eur J Med Res ; 27(1): 217, 2022 Oct 28.

Article in English | MEDLINE | ID: mdl-36307894

ABSTRACT

INTRODUCTION: To explore how to measure LAPEq accurately and quantitatively, that is, the left atrial pressure (LAP) measured and calculated by the equation method using the mitral regurgitation spectrum. METHODS: The mitral regurgitation spectrum, pulmonary arteriolar wedge pressure (PAWP) and invasive arterial systolic pressure of the radial artery of 28 patients were collected simultaneously, including 3 patients with rheumatic heart disease, 15 patients with mitral valve prolapse and 10 patients with coronary artery bypass grafting. Patients with moderate or above aortic stenosis were excluded. LAPBp (Doppler sphygmomanometer method), LAPEq (Equation method) and LAPC (Catheter method) were measured synchronously, and the measurement results of the three methods were compared and analyzed. A special intelligent Doppler spectrum analysis software was self-designed to accurately measure LAPEq. This study was approved by the ethics committee of the Northern Theater General Hospital (K-2019-17) and submitted for clinical trial registration (No. Chictr 190023812). RESULTS: There was no statistically significant difference between the measurement results of LAPC and LAPEq (t = 0.954, P = 0.348), and there was a significant correlation between the two methods [r = 0.908 (0.844, 0.964), P < 0.001]. Although the measurement results of LAPC and LAPBp are consistent in the condition of non-severe eccentric mitral regurgitation, there are significant differences across all cases and only a weak correlation between the two methods [r = 0.210 (-0.101, 0.510), P = 0.090]. In MVP patients with P1 or P3 prolapse, the peak pressure difference of MR was underestimated due to the serious eccentricity of MR, which affected the accuracy of LAPBp measurement. CONCLUSIONS: There is a good correlation between LAPEq and LAPC, which verifies that the non-invasive and direct quantitative measurement of left atrial pressure based on the mitral regurgitation spectrum is feasible and has a good application prospect.

Subject(s)

Mitral Valve Insufficiency , Humans , Atrial Pressure , Catheters , Echocardiography, Doppler/methods , Mitral Valve Insufficiency/diagnostic imaging , Pulmonary Wedge Pressure

StLiter: A Novel Algorithm to Iteratively Build the Compacted de Bruijn Graph From Many Complete Genomes.

Yu, Changyong; Mao, Keming; Zhao, Yuhai; Chang, Cheng; Wang, Guoren.

IEEE/ACM Trans Comput Biol Bioinform ; 19(4): 2471-2483, 2022.

Article in English | MEDLINE | ID: mdl-33630738

ABSTRACT

Recently, the compacted de Bruijn graph (cDBG) of complete genome sequences was successfully used in read mapping due to its ability to deal with the repetitions in genomes. However, current approaches are not flexible enough to support frequently rebuilding the graph with different k-mer lengths. Instead of building the graph directly, how can we build the compacted de Bruijn graph of a longer k-mer from the graph of a shorter k-mer? In this article, we present StLiter, a novel algorithm to build the compacted de Bruijn graph either directly from genome sequences or indirectly based on the graph of a short k-mer. For 100 simulated human genomes, StLiter can construct the graph of k-mer length 15-18 in 2.5-3.2 hours with a peak memory of about 70 GB when the reverse complements of the reference genomes are not considered. It takes 4.5-5.9 hours when the reverse complements are considered. In experiments, we compared StLiter with TwoPaCo, the state-of-the-art method for building the graph, on 4 datasets. For k-mer length 15-18, StLiter can build the graph 5-9 times faster than TwoPaCo with a lower peak memory cost. For k-mer lengths larger than 18, given the graph of a short (k-x)-mer, such as x = 1-2, StLiter can also build the graph more efficiently than TwoPaCo building the graph directly. The source code of StLiter can be downloaded from https://github.com/BioLab-cz/StLiter.

Subject(s)

High-Throughput Nucleotide Sequencing , Software , Algorithms , Genome, Human/genetics , High-Throughput Nucleotide Sequencing/methods , Humans , Sequence Analysis, DNA/methods
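For readers unfamiliar with the compacted de Bruijn graph mentioned in the StLiter abstract above, the following is a minimal sketch of plain compaction: every non-branching chain of k-mers is merged into one unitig. It is not StLiter's iterative construction from a shorter k-mer graph, it ignores reverse complements, and it keeps the whole graph in memory. The function names `build_dbg` and `compact` are hypothetical.

```python
from collections import defaultdict


def build_dbg(sequences, k):
    """Build a node-centric de Bruijn graph: nodes are k-mers, edges follow the input."""
    nodes, succ, pred = set(), defaultdict(set), defaultdict(set)
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            nodes.add(seq[i:i + k])
        for i in range(len(seq) - k):
            a, b = seq[i:i + k], seq[i + 1:i + k + 1]
            succ[a].add(b)
            pred[b].add(a)
    return nodes, succ, pred


def compact(sequences, k):
    """Return the unitigs (maximal non-branching paths) as sorted strings."""
    nodes, succ, pred = build_dbg(sequences, k)
    visited = set()

    def starts_unitig(node):
        # A node continues a unitig only if it has exactly one predecessor
        # and that predecessor has no other successor.
        if len(pred[node]) != 1:
            return True
        (p,) = pred[node]
        return len(succ[p]) != 1

    def walk(start):
        # Extend to the right while the path does not branch or merge.
        path, cur = start, start
        visited.add(start)
        while len(succ[cur]) == 1:
            (nxt,) = succ[cur]
            if len(pred[nxt]) != 1 or nxt in visited:
                break
            path += nxt[-1]
            visited.add(nxt)
            cur = nxt
        return path

    unitigs = [walk(n) for n in sorted(nodes) if starts_unitig(n)]
    # Any node still unvisited lies on an isolated cycle; cut it at an arbitrary node.
    unitigs += [walk(n) for n in sorted(nodes) if n not in visited]
    return sorted(unitigs)


# Two toy "genomes" that share the prefix ACG and then diverge.
print(compact(["ACGTT", "ACGAA"], 3))  # ['ACG', 'CGAA', 'CGTT']
```

The shared k-mer ACG branches into two successors, so it forms its own unitig, and each divergent tail collapses into a single node.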

An Appraisal of Incremental Learning Methods.

Luo, Yong; Yin, Liancheng; Bai, Wenchao; Mao, Keming.

Entropy (Basel) ; 22(11)2020 Oct 22.

Article in English | MEDLINE | ID: mdl-33286958

ABSTRACT

As a special case of machine learning, incremental learning can acquire useful knowledge from incoming data continuously without needing access to the original data. It is expected to have the ability of memorization and is regarded as one of the ultimate goals of artificial intelligence technology. However, incremental learning remains a long-term challenge. Modern deep neural network models achieve outstanding performance on stationary data distributions with batch training. This reliance on stationary data leads to catastrophic forgetting in incremental learning scenarios, since the distribution of incoming data is unknown and can differ greatly from that of the old data. Therefore, a model must be both plastic to acquire new knowledge and stable to consolidate existing knowledge. This paper aims to provide a systematic review of the state of the art of incremental learning methods. Published reports are selected from the Web of Science, IEEEXplore, and DBLP databases up to May 2020. Each paper is reviewed according to its type: architectural strategy, regularization strategy, or rehearsal and pseudo-rehearsal strategy. We compare and discuss different methods. Moreover, the development trend and research focus are given. It is concluded that incremental learning is still an active research area and will remain so for a long period. More attention should be paid to the exploration of both biological systems and computational models.

An Appraisal of Lung Nodules Automatic Classification Algorithms for CT Images.

Wang, Xinqi; Mao, Keming; Wang, Lizhe; Yang, Peiyi; Lu, Duo; He, Ping.

Sensors (Basel) ; 19(1)2019 Jan 07.

Article in English | MEDLINE | ID: mdl-30621101

ABSTRACT

Lung cancer is one of the most deadly diseases around the world, representing about 26% of all cancers in 2017. The five-year cure rate is only 18% despite great recent progress in diagnosis and treatment. Before diagnosis, lung nodule classification is a key step, especially since automatic classification can help clinicians by providing a valuable opinion. Modern computer vision and machine learning technologies allow very fast and reliable CT image classification. This research area has become very active because of its high efficiency and labor savings. The paper aims to provide a systematic review of the state of the art of automatic classification of lung nodules. This research paper covers published works selected from the Web of Science, IEEEXplore, and DBLP databases up to June 2018. Each paper is critically reviewed based on objective, methodology, research dataset, and performance evaluation. Mainstream algorithms are presented and generic structures are summarized. Our work reveals that lung nodule classification based on deep learning has become dominant because of its excellent performance. It is concluded that the consistency of the research objective and the integration of data deserve more attention. Moreover, collaborative work among developers, clinicians, and other parties should be strengthened.

Subject(s)

Lung Neoplasms/diagnostic imaging , Lung Neoplasms/diagnosis , Radiographic Image Interpretation, Computer-Assisted , Tomography, X-Ray Computed , Algorithms , Databases, Factual , Humans , Lung Neoplasms/classification , Lung Neoplasms/pathology

A Case Study on Attribute Recognition of Heated Metal Mark Image Using Deep Convolutional Neural Networks.

Mao, Keming; Lu, Duo; E, Dazhi; Tan, Zhenhua.

Sensors (Basel) ; 18(6)2018 Jun 07.

Article in English | MEDLINE | ID: mdl-29880774

ABSTRACT

The heated metal mark is an important trace for identifying the cause of a fire. However, traditional methods mainly rely on knowledge of physics and chemistry for qualitative analysis, so the problem remains challenging. This paper presents a case study on attribute recognition of the heated metal mark image using computer vision and machine learning technologies. The proposed work is composed of three parts. Material is first generated. According to national standards, actual needs and feasibility, seven attributes are selected for research. Data generation and organization are conducted, and a small benchmark dataset is constructed. A recognition model is then implemented. Feature representation and classifier construction methods are introduced based on deep convolutional neural networks. Finally, the experimental evaluation is carried out. Multi-aspect tests are performed with various model structures, data augmentations, training modes, optimization methods and batch sizes. The influence of parameters, recognition efficiency and execution time are also analyzed. The results show that with a fine-tuned model, the recognition rates of the attributes metal type, heating mode, heating temperature, heating duration, cooling mode, placing duration and relative humidity are 0.925, 0.908, 0.835, 0.917, 0.928, 0.805 and 0.92, respectively. The proposed method recognizes the attributes of the heated metal mark effectively, and it can be used in practical applications.

Lung Nodule Image Classification Based on Local Difference Pattern and Combined Classifier.

Mao, Keming; Deng, Zhuofu.

Comput Math Methods Med ; 2016: 1091279, 2016.

Article in English | MEDLINE | ID: mdl-28053650

ABSTRACT

This paper proposes a novel lung nodule classification method for low-dose CT images. The method includes two stages. First, Local Difference Pattern (LDP) is proposed to encode the feature representation, which is extracted by comparing intensity differences along circular regions centered at the lung nodule. Then, the single-center classifier is trained based on LDP. Because the feature distribution varies across classes, the training images are further clustered into multiple cores and the multicenter classifier is constructed. The two classifiers are combined to make the final decision. Experimental results on a public dataset show the superior performance of LDP and the combined classifier.

Subject(s)

Diagnosis, Computer-Assisted/methods , Image Processing, Computer-Assisted/methods , Lung Neoplasms/diagnostic imaging , Solitary Pulmonary Nodule/diagnostic imaging , Tomography, X-Ray Computed , Algorithms , Artificial Intelligence , Cluster Analysis , Humans , Pattern Recognition, Automated , ROC Curve , Radiographic Image Interpretation, Computer-Assisted/methods , Reproducibility of Results , Sensitivity and Specificity

Efficiently mining time-delayed gene expression patterns.

Wang, Guoren; Yin, Linjun; Zhao, Yuhai; Mao, Keming.

IEEE Trans Syst Man Cybern B Cybern ; 40(2): 400-11, 2010 Apr.

Article in English | MEDLINE | ID: mdl-19884096

ABSTRACT

Unlike pattern-based biclustering methods that focus on grouping objects in the same subset of dimensions, in this paper, we propose a novel model of coherent clustering for time-series gene expression data, i.e., time-delayed cluster (td-cluster). Under this model, objects can be coherent in different subsets of dimensions if these objects follow a certain time-delayed relationship. Such a cluster can discover the cycle time of gene expression, which is essential in revealing gene regulatory networks. This paper is the first attempt to mine time-delayed gene expression patterns from microarray data. A novel algorithm is also presented and implemented to mine all significant td-clusters. Our experimental results show the following: 1) the td-cluster algorithm can detect a significant number of clusters that were missed by previous models, and these clusters are potentially of high biological significance and 2) the td-cluster model and algorithm can easily be extended to 3-D gene x sample x time data sets to identify 3-D td-clusters.

Subject(s)

Algorithms , Cluster Analysis , Data Mining/methods , Gene Expression Profiling/methods , Pattern Recognition, Automated/methods , Computational Biology/methods , Oligonucleotide Array Sequence Analysis
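The idea of a time-delayed relationship in the td-cluster abstract above can be made concrete with a small sketch. The abstract does not define the coherence measure, so this example substitutes Pearson correlation between one gene's profile and a shifted copy of another's. It only finds the best pairwise delay and is not the td-cluster mining algorithm. The names `pearson` and `best_delay` and the toy expression values are assumptions for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def best_delay(gene_a, gene_b, max_delay):
    """Return (delay, r) where gene_b lagging gene_a by `delay` points maximizes r."""
    best = (0, pearson(gene_a, gene_b))
    for delay in range(1, max_delay + 1):
        a = gene_a[:len(gene_a) - delay]  # earlier part of the leading gene
        b = gene_b[delay:]                # later part of the lagging gene
        if len(a) < 3:                    # too few overlapping points to compare
            break
        r = pearson(a, b)
        if r > best[1]:
            best = (delay, r)
    return best


# Toy profiles: from time point 2 onward, gene_b equals 2 * gene_a + 1 delayed by 2.
gene_a = [1, 3, 2, 5, 4, 6, 2, 1]
gene_b = [0, 0, 3, 7, 5, 11, 9, 13]

delay, r = best_delay(gene_a, gene_b, max_delay=4)
print(delay, round(r, 3))
```

The two genes look only weakly related when compared point by point, but they become perfectly coherent once the second profile is shifted back by two time points. This is the kind of relationship that same-dimension biclustering would miss.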
