ABSTRACT
Early diagnosis of potentially malignant disorders, such as oral epithelial dysplasia, is the most reliable way to prevent oral cancer. Computational algorithms have been used as auxiliary tools to aid specialists in this process. Experiments are usually performed on private data, which makes the results difficult to reproduce. Several public datasets of histological images exist, but studies focused on oral dysplasia images use inaccessible datasets, which hinders the improvement of algorithms aimed at this lesion. This study introduces an annotated public dataset of oral epithelial dysplasia tissue images. The dataset includes 456 images acquired from 30 mouse tongues. The images were categorized by lesion grade, with nuclear structures manually marked by a trained specialist and validated by a pathologist. Experiments were also carried out to illustrate the potential of the proposed dataset in the classification and segmentation tasks commonly explored in the literature. Convolutional neural network (CNN) models for semantic and instance segmentation were applied to the images, which were pre-processed with stain normalization methods. The segmented and non-segmented images were then classified with CNN architectures and machine learning algorithms. The data obtained through these processes are available in the dataset. The segmentation stage reached an F1-score of 0.83, obtained with the U-Net model using ResNet-50 as a backbone. In the classification stage, the best result was achieved with the Random Forest method, with an accuracy of 94.22%. The results show that segmentation contributed to the classification results, but further studies are needed to improve these stages of automated diagnosis. The original, gold-standard, normalized, and segmented images are publicly available and may be used to improve clinical applications of CAD methods on oral epithelial dysplasia tissue images.
Subjects
Neural Networks, Computer; Mice; Animals; Machine Learning; Algorithms; Mouth Neoplasms/diagnostic imaging; Mouth Neoplasms/pathology; Image Processing, Computer-Assisted/methods; Databases, Factual; Precancerous Conditions/diagnostic imaging; Precancerous Conditions/pathology; Tongue/pathology; Tongue/diagnostic imaging; Humans; Mouth Mucosa/pathology; Mouth Mucosa/diagnostic imaging
ABSTRACT
Leukemia is a significant health challenge, with high incidence and mortality rates. Computer-aided diagnosis (CAD) has emerged as a promising approach. However, deep-learning methods suffer from the "black box problem", leading to unreliable diagnoses. This research proposes an Explainable AI (XAI) leukemia classification method that addresses this issue by incorporating a robust White Blood Cell (WBC) nuclei segmentation as a hard attention mechanism. The WBC segmentation is achieved by combining image processing and U-Net techniques, resulting in improved overall performance. The segmented images are fed into modified ResNet-50 models, in which the MLP classifier, activation functions, and training scheme have been tested for leukemia subtype classification. Additionally, we add visual explainability and feature space analysis techniques to offer an interpretable classification. Our segmentation algorithm achieves an Intersection over Union (IoU) of 0.91 across six databases. Furthermore, the deep-learning classifier achieves an accuracy of 99.9% on testing. The Grad-CAM methods and clustering space analysis confirm improved network focus when classifying segmented images compared to non-segmented images. Overall, the proposed visually explainable CAD system has the potential to assist physicians in diagnosing leukemia and improving patient outcomes.
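As an illustration of two ideas mentioned in this abstract, the following minimal Python sketch computes the Intersection over Union between a predicted and a reference binary mask, and applies a binary mask to an image as a simple form of hard attention. It is not the authors' implementation: the function names `iou` and `apply_hard_attention`, the convention of returning 1.0 for two empty masks, and the choice to zero out background pixels are all assumptions made for illustration.

```python
import numpy as np


def iou(pred, target):
    """Intersection over Union of two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    intersection = np.logical_and(pred, target).sum()
    return float(intersection / union)


def apply_hard_attention(image, mask):
    """Keep pixels inside the mask and set all others to zero.

    image: array of shape (H, W, C); mask: binary array of shape (H, W).
    Zeroing the background is an assumption, not the paper's stated method.
    """
    image = np.asarray(image)
    mask = np.asarray(mask, dtype=bool)
    # Broadcast the (H, W) mask over the channel axis.
    return np.where(mask[..., None], image, 0)


if __name__ == "__main__":
    pred = [[1, 1], [0, 0]]
    target = [[1, 0], [0, 0]]
    print("IoU:", iou(pred, target))

    image = np.full((2, 2, 3), 7, dtype=np.uint8)
    print(apply_hard_attention(image, pred)[:, :, 0])
```

In a pipeline of the kind described, the masked image rather than the raw image would be passed to the classifier, so that only the segmented nuclei can influence the prediction.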