A comparative study of machine learning and deep learning algorithms to classify cancer types based on microarray gene expression data.
Tabares-Soto, Reinel; Orozco-Arias, Simon; Romero-Cano, Victor; Segovia Bucheli, Vanesa; Rodríguez-Sotelo, José Luis; Jiménez-Varón, Cristian Felipe.
Affiliation
  • Tabares-Soto R; Department of Electronics and Automation, Universidad Autónoma de Manizales, Manizales, Caldas, Colombia.
  • Orozco-Arias S; Department of Computer Science, Universidad Autónoma de Manizales, Manizales, Caldas, Colombia.
  • Romero-Cano V; Department of Systems and Informatics, Universidad de Caldas, Manizales, Caldas, Colombia.
  • Segovia Bucheli V; Department of Automatics and Electronics, Universidad Autónoma de Occidente, Cali, Valle del Cauca, Colombia.
  • Rodríguez-Sotelo JL; Izmir International Biomedicine and Genome Institute, Dokuz Eylül University, Izmir, Turkey.
  • Jiménez-Varón CF; Department of Electronics and Automation, Universidad Autónoma de Manizales, Manizales, Caldas, Colombia.
PeerJ Comput Sci ; 6: e270, 2020.
Article in English | MEDLINE | ID: mdl-33816921
Cancer classification is a topic of major interest in medicine since it allows accurate and efficient diagnosis and facilitates a successful outcome in medical treatments. Previous studies have classified human tumors using large-scale RNA profiling and supervised Machine Learning (ML) algorithms to construct a molecular-based classification of carcinoma cells from breast, bladder, adenocarcinoma, colorectal, gastroesophageal, kidney, liver, lung, ovarian, pancreas, and prostate tumors. These datasets are collectively known as the 11_tumor database. Although this database has been used in several works in the ML field, no comparative studies of different algorithms can be found in the literature. Meanwhile, advances in both hardware and software technologies have fostered considerable improvements in the precision of solutions that use ML, such as Deep Learning (DL). In this study, we compare the most widely used algorithms in classical ML and DL to classify the tumors described in the 11_tumor database. We obtained tumor identification accuracies between 90.6% (Logistic Regression) and 94.43% (Convolutional Neural Networks) using k-fold cross-validation. We also show that a tuning process may or may not significantly improve an algorithm's accuracy. Our results demonstrate an efficient and accurate classification method based on gene expression (microarray data) and ML/DL algorithms, which facilitates tumor type prediction in a multi-cancer-type scenario.
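The k-fold cross-validation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic stand-ins for expression profiles, a nearest-centroid classifier replaces the ML and DL models compared in the study, and every function name and parameter value below is hypothetical.

```python
import random


def make_synthetic_expression(n_classes=3, n_per_class=30, n_genes=20, seed=0):
    """Toy 'expression profiles' (assumption): each class has its own mean per gene."""
    rng = random.Random(seed)
    samples, labels = [], []
    for c in range(n_classes):
        centre = [rng.uniform(-3.0, 3.0) for _ in range(n_genes)]
        for _ in range(n_per_class):
            samples.append([m + rng.gauss(0.0, 1.0) for m in centre])
            labels.append(c)
    return samples, labels


def stratified_folds(labels, k, seed=0):
    """Split sample indices into k folds with roughly equal class proportions."""
    rng = random.Random(seed)
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def fit_centroids(samples, labels):
    """Stand-in classifier (assumption): one mean profile per class."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict(centroids, x):
    """Assign x to the class whose centroid is closest in squared Euclidean distance."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda y: sq_dist(centroids[y]))


def cross_validate(samples, labels, k=5, seed=0):
    """Train on k-1 folds, test on the held-out fold, and return per-fold accuracy."""
    folds = stratified_folds(labels, k, seed)
    accuracies = []
    for held_out in folds:
        test_set = set(held_out)
        train_x = [samples[i] for i in range(len(samples)) if i not in test_set]
        train_y = [labels[i] for i in range(len(labels)) if i not in test_set]
        centroids = fit_centroids(train_x, train_y)
        correct = sum(predict(centroids, samples[i]) == labels[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return accuracies


if __name__ == "__main__":
    X, y = make_synthetic_expression()
    scores = cross_validate(X, y, k=5)
    print("per-fold accuracy:", scores)
    print("mean accuracy:", sum(scores) / len(scores))
```

Because each sample is held out exactly once, the mean of the per-fold accuracies estimates performance on unseen samples. A comparison of several algorithms would reuse the same folds for each model so that the scores are directly comparable.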

Full text: 1 | Collections: 01-international | Database: MEDLINE | Study type: Prognostic_studies | Language: En | Journal: PeerJ Comput Sci | Publication year: 2020 | Document type: Article | Affiliation country: Colombia | Publication country: United States
