1.
Med Biol Eng Comput ; 61(11): 2895-2919, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37530887

ABSTRACT

Prediction of the stage of cancer plays an important role in planning the course of treatment, and has relied largely on imaging tools that do not capture the molecular events that cause cancer progression. Gene-expression data-based analyses are able to identify these events, allowing RNA-sequence and microarray cancer data to be used for cancer analyses. Breast cancer is the most common cancer worldwide, and is classified into four stages: stages 1, 2, 3, and 4 [2]. While machine learning models have previously been explored for stage classification with limited success, multi-class stage classification has not seen significant progress. There is a need for improved multi-class classification models, for example by investigating deep learning models. Gene-expression-based cancer data is characterised by the small size of available datasets, class imbalance, and high dimensionality. Class balancing methods must be applied to the dataset. Since not all genes are necessary for stage prediction, retaining only the necessary genes can improve classification accuracy. In this work, the breast cancer samples are classified into four classes, stages 1 to 4. Invasive ductal carcinoma breast cancer samples are obtained from The Cancer Genome Atlas (TCGA) and Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) datasets and combined. Two class balancing techniques are explored: synthetic minority oversampling technique (SMOTE), and SMOTE followed by random undersampling. A hybrid feature selection pipeline is proposed, with three pipelines explored involving combinations of filter and embedded feature selection methods: Pipeline 1 - minimum-redundancy maximum-relevancy (mRMR) and correlation feature selection (CFS), Pipeline 2 - mRMR, mutual information (MI) and CFS, and Pipeline 3 - mRMR and support vector machine-recursive feature elimination (SVM-RFE). The classification is done using deep learning models, namely a deep neural network (DNN), a convolutional neural network, a recurrent neural network, a modified DNN, and an AutoKeras-generated model. Classification performance after class balancing and the various feature selection techniques shows marked improvement over classification prior to feature selection. The best multi-class classification was obtained by a DNN after SMOTE and random undersampling, with feature selection using mRMR and recursive feature elimination, giving a Cohen-Kappa score (CKS) of 0.303 and a classification accuracy of 53.1%. For binary classification into early and late-stage cancer, the best performance is obtained by a modified DNN after SMOTE and random undersampling, with feature selection using mRMR and recursive feature elimination, giving an accuracy of 81.0% and a CKS of 0.280. This pipeline also showed improved multi-class classification performance on neuroblastoma cancer data, with a best area under the receiver operating characteristic curve (auROC) score of 0.872, as compared to 0.71 obtained in previous work, an improvement of 22.81%. The results and analysis reveal that feature selection techniques play a vital role in gene-expression data-based classification, and that the proposed hybrid feature selection pipeline improves classification performance. Multi-class classification is possible using deep learning models, though further improvement, particularly in late-stage classification, is necessary.
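The abstract reports both accuracy and a Cohen-Kappa score. As a minimal sketch of how the two metrics relate, the snippet below computes Cohen's kappa from its standard definition (observed agreement corrected for chance agreement). The function name `cohen_kappa` and the toy labels are illustrative assumptions, not the authors' code or data; the example only shows that, on an imbalanced label set, a classifier can reach high accuracy while its kappa stays low.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between two labelings, corrected for chance."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(y_true)
    # Observed agreement: fraction of samples where both labelings match.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement: sum over classes of the product of marginal frequencies.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        return float("nan")  # kappa is undefined when chance agreement is total
    return (p_o - p_e) / (1.0 - p_e)


# Toy, made-up labels (NOT from the paper): 8 early-stage and 2 late-stage samples,
# with a classifier that always predicts the majority class.
y_true = ["early"] * 8 + ["late"] * 2
y_pred = ["early"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
kappa = cohen_kappa(y_true, y_pred)
print(accuracy)  # 0.8
print(kappa)     # 0.0
```

In this toy case the majority-class predictor scores 80% accuracy but a kappa of zero, which is why kappa is a useful companion metric when classes are imbalanced.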


Subject(s)
Breast Neoplasms , Deep Learning , Humans , Female , Breast Neoplasms/genetics , Transcriptome , Neoplasm Staging , Gene Expression Profiling/methods
2.
Med Biol Eng Comput ; 60(9): 2681-2691, 2022 Sep.
Article in English | MEDLINE | ID: mdl-35834050

ABSTRACT

Deep learning provides the healthcare industry with the ability to analyse data at exceptional speeds without compromising on accuracy. These techniques are applicable to the healthcare domain for accurate and timely prediction. The convolutional neural network is a class of deep learning methods that has become dominant in various computer vision tasks and is attracting interest across a variety of domains, including radiology. Lung diseases such as tuberculosis (TB), bacterial and viral pneumonias, and COVID-19 are not predicted accurately because very few samples are available for each of these diseases. The diseases can be readily diagnosed using X-ray or CT scan images, but the number of images available differs from one disease to another, resulting in imbalanced input data. Conventional supervised machine learning methods do not achieve high accuracy when trained on a small number of COVID-19 data samples. Image data augmentation is a technique that can be used to artificially expand the size of a training dataset by creating modified versions of images in the dataset. Data augmentation helped reduce overfitting when training a deep neural network. The SMOTE (Synthetic Minority Oversampling Technique) algorithm is used to balance the classes. The novelty of this research work is the combined application of data augmentation and class balancing techniques before classification of tuberculosis, pneumonia, and COVID-19. The classification accuracy obtained with the proposed multi-level classification after training the model is 97.4% for TB and pneumonia and 88% for bacterial, viral, and COVID-19 classifications. The proposed multi-level classification method produces an improvement of ~8 to ~10% in classification accuracy when compared with the existing methods in this area of research. The results reveal that the proposed system is scalable to growing medical data and classifies lung diseases and their sub-types in less time with higher accuracy.
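The class-balancing step named in the abstract, SMOTE, can be illustrated with a small sketch. The snippet below shows only the core idea: each synthetic minority sample is an interpolation between a minority sample and one of its nearest minority neighbours. The function name `smote_sketch`, its parameters, and the toy feature vectors are assumptions made for illustration; the abstract does not say how the authors implemented this step, and in practice a maintained library implementation would usually be preferred.

```python
import math
import random


def smote_sketch(minority, n_synthetic, k=5, seed=0):
    """Generate synthetic minority samples by neighbour interpolation.

    minority    -- list of feature vectors (lists of floats) from the minority class
    n_synthetic -- number of synthetic samples to create
    k           -- number of nearest minority neighbours to choose from
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the base sample (excluding itself).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (q - b) for b, q in zip(base, neighbour)])
    return synthetic


# Toy minority class with three 2-D feature vectors (made-up values).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
new_samples = smote_sketch(minority, n_synthetic=4, k=2)
print(len(new_samples))
```

For image data, the feature vectors here would stand in for flattened pixels or extracted features; that mapping is an assumption of this sketch rather than something stated in the abstract.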


Subject(s)
COVID-19 , Deep Learning , Lung Diseases , Pneumonia, Viral , Tuberculosis , Humans , Pneumonia, Viral/diagnostic imaging
3.
Sci Total Environ ; 821: 153311, 2022 May 15.
Article in English | MEDLINE | ID: mdl-35065104

ABSTRACT

Natural water sources like ponds, lakes and rivers are facing a great threat because of activities like the discharge of untreated industrial effluents, sewage water, wastes, etc. It is mandatory to examine the water quality to ensure that only safe water is available for consumption. Traditional methods of water quality inspection are cumbersome, and hence Artificial Intelligence (AI) can be used as a catalyst for this process. AutoDL is an upcoming field that automates deep learning pipelines and enables model creation and interpretation with minimal code. However, it is still in the nascent stage. This work explores the suitability of adopting AutoDL for Water Quality Assessment by comparing AutoDL with conventional models in predicting the quality of water, that is, assigning water bodies to an appropriate class based on the Water Quality Index. The accuracy of conventional DL is 1.8% higher than that of AutoDL for binary class water data, and 1% higher for multiclass water data. The accuracy of the conventional model was ~98% to ~99%, whereas the AutoDL method yielded ~96% to ~98%. However, the AutoDL model eases the task of finding the appropriate DL model and offered better efficiency without manual intervention.


Subject(s)
Deep Learning , Water Quality , Artificial Intelligence , Rivers