Deep extracted features to support Content-Based Image Retrieval systems in the diagnosis of Covid-19 and Interstitial diseases
International Journal of Computer Assisted Radiology and Surgery ; 17(SUPPL 1):S13-S14, 2022.
Article in English | EMBASE | ID: covidwho-1926067
ABSTRACT
Purpose Coronavirus disease 2019 (Covid-19) may cause dyspnoea, whereas Interstitial Lung Diseases (ILD) may lead to the loss of breathing ability. In both cases, a chest X-ray is typically one of the initial studies used to identify the disease, as these scans are simple and widely available, especially in developing countries. However, the assessment of such images is subject to high inter-observer variability because it depends on the reader's expertise, which may expose patients to unnecessary investigations and delay the diagnosis. Content-Based Image Retrieval (CBIR) tools can bridge this variability gap by retrieving past cases similar to a given reference image from an annotated database, acting as a differential-diagnosis CAD-IA system [1]. The main CBIR components are feature extraction and query formulation. The former maps the compared images into a space where a distance function can be applied, and the latter relies on the k-Nearest Neighbors (kNN) method to fetch the most similar cases by their distances to the query reference. In this study, we examine the quality of Covid-19 and ILD deep features extracted by a modified VGG-19 Convolutional Neural Network (CNN) [2] from the perspective of the Voronoi frontiers induced by kNN, which is at the core of the CBIR query formulation component.

Methods We curated a dataset of annotated chest X-rays from our PACS/HIS systems in a retrospective study approved by the institutional board. A set of 185 Covid-19 and 307 ILD cases from different patients was selected, with Covid-19 cases confirmed by RT-PCR tests and ILD images included after review by two thoracic radiologists. We also added 381 images of "healthy" lungs (without Covid-19 or ILD) to enrich the dataset. The resulting set comprises 873 X-rays (mean age 60.49 ± 15.21; 52.58% females). We converted the DICOM images into PNG files by using the Hounsfield conversion and a 256 gray-level window.
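The query formulation step described above can be sketched as follows, assuming a database of 873 annotated cases with 512-dimensional deep feature vectors (random stand-ins here) and scikit-learn's NearestNeighbors as a generic kNN index; the labels and dimensionality come from the abstract, everything else is illustrative.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Hypothetical stand-ins for the annotated database: one deep feature
# vector (d = 512) and one diagnostic label per past X-ray case.
rng = np.random.default_rng(0)
db_features = rng.random((873, 512))
db_labels = rng.choice(["Covid-19", "ILD", "Healthy"], size=873)

# Query formulation: a kNN search fetches the k most similar past cases
# to the query reference under a distance function (Euclidean here).
index = NearestNeighbors(n_neighbors=5, metric="euclidean").fit(db_features)
query = rng.random((1, 512))                 # features of the image under review
distances, neighbors = index.kneighbors(query)

retrieved_labels = db_labels[neighbors[0]]   # similar past cases shown to the reader
```

The retrieved labels are what a CBIR front end would present alongside the past images, so the quality of the whole system hinges on how well the feature space separates the classes.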
The files were scaled to 224 × 224 and fed into a modified VGG-19 we implemented [2]. Our version keeps the stack of convolutional layers and adds five new layers after block5-pool, namely GlobalAveragePooling2D, BatchNormalization, a dense layer with 128 units and ReLU, a dropout layer with ratio 0.6, and a final dense layer with three neurons for classification. The Adam optimizer was used to minimize the cross-entropy loss, with batch size and epochs set to 36 and 100, respectively. All layers started with ImageNet weights, which were frozen up to block4-pool so that only the remaining layers were updated. We fed the CNN with images and labels (i.e., {Covid-19, ILD, Healthy}) so that feature extraction was oriented towards those classes rather than relying on unsupervised autoencoders. The flattened outputs of the last pooling layer were collected as feature vectors of dimensionality d = 512. We cleaned and preprocessed those vectors before applying the kNN-based search mechanism. First, we scaled the dimensions into the [0, 1] interval. Then, we reduced the vectors by Principal Component Analysis (PCA). The number of reduced dimensions was determined by the intrinsic dimensionality of the features, estimated from the mean (μ) and standard deviation (σ) of the pairwise-distance distribution as μ²/(2σ²). Finally, the reduced vectors were also scaled into the [0, 1] interval. The experiments were performed on an NVidia Titan X GPU (3584 cores, 1.5 GHz, 12 GB RAM) and an Intel(R) Xeon(R) CPU (2.00 GHz, 96 GB RAM). The code was implemented with TensorFlow (v2.1.0) and R (v4.1.2).

Results We used two Principal Components to reduce the vectors according to the estimated intrinsic dimensionality. Figure 1 shows the Voronoi frontiers induced by kNN with a smooth separation between the three classes, which creates a search space in which CBIR searches are expected to be accurate.
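A minimal Keras sketch of the modified VGG-19 described above, using the layer names and hyperparameters given in the abstract (dense 128 + ReLU, dropout 0.6, three-way softmax, Adam, batch size 36, 100 epochs, freezing up to block4_pool). `weights=None` is used here only to keep the sketch runnable offline; the abstract initializes with ImageNet weights.

```python
import tensorflow as tf
from tensorflow.keras import Model, layers
from tensorflow.keras.applications import VGG19

# VGG-19 convolutional stack. The abstract uses weights="imagenet";
# None here only avoids a download so the sketch stays self-contained.
base = VGG19(weights=None, include_top=False, input_shape=(224, 224, 3))

# Freeze all layers up to and including block4_pool; later layers train.
trainable = False
for layer in base.layers:
    layer.trainable = trainable
    if layer.name == "block4_pool":
        trainable = True

# Five new layers after block5_pool, as described in the abstract.
feat = layers.GlobalAveragePooling2D()(base.output)   # 512-d feature vector
x = layers.BatchNormalization()(feat)
x = layers.Dense(128, activation="relu")(x)
x = layers.Dropout(0.6)(x)
out = layers.Dense(3, activation="softmax")(x)        # {Covid-19, ILD, Healthy}

model = Model(base.input, out)
model.compile(optimizer=tf.keras.optimizers.Adam(),
              loss="categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(images, labels, batch_size=36, epochs=100)  # training setup per abstract

# After training, the pooled 512-d activations serve as CBIR feature vectors.
extractor = Model(base.input, feat)
```

Training the head with class labels (rather than an autoencoder objective) is what pushes the 512-d features towards class-discriminative directions, which is exactly what the kNN search relies on.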
We quantified this behavior through a kNN-based classification in two experimental settings (10-fold and Holdout), using the scaled features with and without dimensionality reduction. Table 1 summarizes the results with the following findings:
• The accuracy measures increased with the neighborhood size (k = 1 vs. k = 5) in all experimental cases;
• Covid-19 cases were more difficult to label than ILD according to F1 and recall (RC);
• For the Holdout cases, the kNN hit ratio (TP) for Covid-19 was comparable to the very first diagnosis stored in the PACS/HIS systems by the readers on duty (readers' mean ∼63% vs. kNN ∼59%);
• Searches over the reduced data were ∼4.9× faster; and
• While dimensionality reduction performed on par with the non-reduced data in the 10-fold evaluation, it markedly enhanced kNN performance in the Holdout test (e.g., F1 of 0.68 vs. 0.82 for k = 1).

This result shows the side effects of searching high-dimensional spaces with kNN (the "curse of dimensionality"), which requires pre-processing the vectors or defining other query criteria to browse the data.

Conclusion This study discussed feature extraction for Covid-19 and ILD images from the perspective of kNN queries, the query formulation component within CBIR systems. Although we used cross-validation and one external batch to mitigate overfitting, a practical limitation was the size of the CNN training set. Still, our approach showed promising results in extracting suitable features for CBIR environments.
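The reduction-vs-no-reduction comparison above can be sketched end to end, assuming scaled 512-d feature vectors (random stand-ins here) and the abstract's intrinsic-dimensionality estimate μ²/(2σ²) over the pairwise-distance distribution; scikit-learn's KNeighborsClassifier and cross_val_score replace the authors' own tooling.

```python
import numpy as np
from scipy.spatial.distance import pdist
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(42)
X = rng.random((873, 512))          # stand-in for the scaled deep features
y = rng.integers(0, 3, size=873)    # 0: Covid-19, 1: ILD, 2: Healthy

# Intrinsic dimensionality estimated from the pairwise-distance
# distribution as mu^2 / (2 * sigma^2), capped at the original d.
dists = pdist(X)
n_comp = max(1, min(round(dists.mean() ** 2 / (2 * dists.var())), X.shape[1]))

# kNN accuracy with and without PCA reduction, 10-fold cross-validation.
X_red = PCA(n_components=n_comp).fit_transform(X)
for name, data in (("512-d", X), (f"{n_comp}-d PCA", X_red)):
    acc = cross_val_score(KNeighborsClassifier(n_neighbors=5), data, y, cv=10).mean()
    print(f"{name}: mean accuracy = {acc:.2f}")
```

On random stand-in data the estimate stays high; on the real features the abstract reports it collapses to two components, which is where both the speed-up and the Holdout F1 gain come from.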
Full text: Available Collection: Databases of international organizations Database: EMBASE Language: English Journal: International Journal of Computer Assisted Radiology and Surgery Year: 2022 Document Type: Article
