ABSTRACT
Visual inspection with acetic acid (VIA) is a precancerous screening method for low- and middle-income countries (LMICs). Because of the limited number of gynecologic oncologists in LMICs, VIA examinations are performed mainly by medical workers. However, because medical workers may be unable to recognize significant patterns in cervicograms, VIA examination produces high inter-observer variance and a high false-positive rate. This study proposes an automated cervicogram interpretation method using an explainable convolutional neural network, named "CervicoXNet", to support medical workers' decisions. A total of 779 cervicograms were used for the learning process: 487 VIA (+) and 292 VIA (−). We performed data augmentation using geometric transformations, producing 7325 VIA (−) and 7242 VIA (+) cervicograms. The proposed model outperformed other deep learning models, with 99.22% accuracy, 100% sensitivity, and 98.28% specificity. Moreover, to test the robustness of the proposed model, colposcopy images were used to validate its generalization ability. The proposed architecture still produced satisfactory performance on these images, with 98.11% accuracy, 98.33% sensitivity, and 98% specificity. To make the prediction results visually interpretable, they are localized with a heat map at fine-grained pixel level using a combination of Grad-CAM and guided backpropagation. CervicoXNet can be used as an alternative early screening tool based on VIA alone.
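The Grad-CAM/guided-backpropagation combination mentioned above fuses a coarse, class-discriminative heat map with fine-grained pixel gradients by elementwise multiplication. A minimal NumPy sketch of that fusion step, assuming the Grad-CAM map and the guided-backpropagation saliency have already been produced by a trained CNN (the function name and upsampling scheme here are illustrative, not from the paper):

```python
import numpy as np

def guided_grad_cam(grad_cam_map, guided_backprop, image_shape):
    """Fuse a coarse Grad-CAM heat map with fine-grained guided-
    backpropagation gradients (illustrative sketch; both inputs
    would normally come from a trained CNN)."""
    h, w = image_shape
    gh, gw = grad_cam_map.shape
    # Upsample the low-resolution Grad-CAM map to the image size
    # by simple nearest-neighbour repetition.
    up = np.repeat(np.repeat(grad_cam_map, h // gh, axis=0),
                   w // gw, axis=1)
    # Elementwise product localizes class evidence at pixel level.
    fused = up * guided_backprop
    # Normalize to [0, 1] for display as a heat map.
    fused = fused - fused.min()
    if fused.max() > 0:
        fused = fused / fused.max()
    return fused
```

In practice the Grad-CAM map would be upsampled with bilinear interpolation by the deep learning framework; nearest-neighbour repetition keeps this sketch dependency-free.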
Subject(s)
Acetic Acid, Neural Networks, Computer, Humans

ABSTRACT
Precancerous screening using visual inspection with acetic acid (VIA) is recommended by the World Health Organization (WHO) for low- and middle-income countries (LMICs). However, because of the limited number of gynecologic oncologists in LMICs, VIA screening is primarily performed by general clinicians, nurses, or midwives (collectively, medical workers). These medical workers may be unable to recognize the significant pathophysiology of human papillomavirus (HPV) infection in terms of the columnar epithelial cells, squamous epithelial cells, and white-spot regions with abnormal blood vessels; consequently, VIA screening achieves a wide range of sensitivity (49-98%) and specificity (75-91%), which can lead to false results and high inter-observer variance. Hence, automated detection of the columnar area (CA), the subepithelial region of the squamocolumnar junction (SCJ), and acetowhite (AW) lesions is needed to support an accurate diagnosis. This study proposes a Mask R-CNN architecture to simultaneously segment, classify, and detect CA and AW lesions. We conducted several experiments using 262 VIA (+) cervicograms and 222 VIA (−) cervicograms. The proposed model provided satisfactory intersection-over-union performance of about 63.60% for the CA and about 73.98% for AW lesions. The Dice similarity coefficient was about 75.67% for the CA and about 80.49% for AW lesions. The model also performed well in cervical-cancer precursor-lesion detection, with a mean average precision of about 86.90% for the CA and about 100% for AW lesions, while achieving 100% sensitivity and 92% specificity. Our proposed model, with its instance segmentation approach, can segment, detect, and classify cervical-cancer precursor lesions with satisfactory performance from a VIA cervicogram alone.
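The intersection-over-union and Dice similarity coefficient reported above both compare a predicted segmentation mask against a ground-truth mask. A minimal sketch of the two metrics for binary masks (illustrative only; the paper's masks come from the Mask R-CNN output):

```python
import numpy as np

def iou_and_dice(pred, gt):
    """Compute intersection-over-union (IoU) and the Dice
    similarity coefficient for two binary masks of equal shape."""
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    total = pred.sum() + gt.sum()
    # Two empty masks are treated as a perfect match.
    iou = inter / union if union else 1.0
    dice = 2 * inter / total if total else 1.0
    return iou, dice
```

Note that Dice = 2·IoU / (1 + IoU), so Dice is always at least as large as IoU, consistent with the CA scores above (IoU ≈ 63.60%, Dice ≈ 75.67%).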