Results 1 - 20 of 26
1.
Article in English | MEDLINE | ID: mdl-38719612

ABSTRACT

BACKGROUND AND PURPOSE: Intracranial steno-occlusive lesions are an important cause of acute ischemic stroke. However, the clinical benefits of artificial intelligence-based methods for detecting pathologic lesions in intracranial arteries have not been evaluated. We aimed to validate the clinical utility of an artificial intelligence model for detecting steno-occlusive lesions in the intracranial arteries. MATERIALS AND METHODS: Overall, 138 TOF-MRA images were collected from two institutions, which served as internal (n = 62) and external (n = 76) test sets, respectively. Each study was reviewed by five radiologists (two neuroradiologists and three radiology residents) to compare TOF-MRA interpretation with and without our proposed artificial intelligence model. They identified the steno-occlusive lesions and recorded their reading time. Observer performance was assessed using the area under the Jackknife free-response receiver operating characteristic curve and reading time. RESULTS: The average area under the Jackknife free-response receiver operating characteristic curve for the five radiologists improved from 0.70 without artificial intelligence to 0.76 with artificial intelligence (P = .027). Notably, this improvement was most pronounced among the three radiology residents, whose performance increased from 0.68 to 0.76 (P = .002). Although overall reading time increased with artificial intelligence, the change was not significant for the radiology residents. Moreover, the use of artificial intelligence improved inter-observer agreement among the reviewers (the intraclass correlation coefficient increased from 0.734 to 0.752). CONCLUSIONS: Our proposed artificial intelligence model offers a supportive tool for radiologists, potentially enhancing the accuracy of detecting intracranial steno-occlusive lesions on TOF-MRA. 
Less-experienced readers may benefit the most from this model. ABBREVIATIONS: AI = artificial intelligence; AUC = area under the receiver operating characteristic curve; AUFROC = area under the Jackknife free-response receiver operating characteristic curve; DL = deep learning; ICC = intraclass correlation coefficient; IRB = institutional review board; JAFROC = Jackknife free-response receiver operating characteristic.

2.
Quant Imaging Med Surg ; 14(2): 1493-1506, 2024 Feb 01.
Article in English | MEDLINE | ID: mdl-38415154

ABSTRACT

Background: Detecting new pulmonary metastases by comparing serial computed tomography (CT) scans is crucial but repetitive and time-consuming, adding to radiologists' workload. This study aimed to evaluate the usefulness of a nodule-matching algorithm with deep learning-based computer-aided detection (DL-CAD) in diagnosing new pulmonary metastases on cancer surveillance CT scans. Methods: Among patients who underwent pulmonary metastasectomy between 2014 and 2018, 65 new pulmonary metastases missed by interpreting radiologists on cancer surveillance CT (Time 2) were identified after a retrospective comparison with the previous CT (Time 1). First, DL-CAD detected nodules in the Time 1 and Time 2 CT images. All nodules detected at Time 2 were initially considered metastasis candidates. Second, the nodule-matching algorithm was used to assess the correspondence between the nodules from the two CT scans and to classify the nodules at Time 2 as "new" or "pre-existing". Pre-existing nodules were excluded from the metastasis candidates. We evaluated the performance of DL-CAD with the nodule-matching algorithm based on its sensitivity, false-metastasis candidates per scan, and positive predictive value (PPV). Results: A total of 475 lesions were detected by DL-CAD at Time 2. Following a radiologist review, the lesions were categorized as metastases (n=54), benign nodules (n=392), and non-nodules (n=29). Upon comparison of the nodules at Time 1 and Time 2 using the nodule-matching algorithm, all metastases were classified as new nodules without any matching errors. Of the 421 benign lesions, 202 (48.0%) were identified as pre-existing and excluded from the pool of metastasis candidates by the nodule-matching algorithm. As a result, false-metastasis candidates per CT scan decreased by 47.9% (from 7.1 to 3.7, P<0.001) and the PPV increased from 11.4% to 19.8% (P<0.001), while sensitivity was maintained. 
Conclusions: The nodule-matching algorithm improves the diagnostic performance of DL-CAD for new pulmonary metastases, by lowering the number of false-metastasis candidates without compromising sensitivity.
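The abstract does not give the matching rule itself, but the idea of flagging a Time 2 nodule as "pre-existing" when it has a nearby Time 1 counterpart can be sketched as follows. The centroid-distance criterion and the 5 mm tolerance are illustrative assumptions, not the paper's actual algorithm.

```python
import math

def match_nodules(time1_centroids, time2_centroids, max_dist_mm=5.0):
    """Label each Time 2 nodule 'pre-existing' if a Time 1 centroid lies
    within max_dist_mm of it, otherwise 'new' (hypothetical criterion)."""
    labels = []
    for c2 in time2_centroids:
        dists = [math.dist(c2, c1) for c1 in time1_centroids]
        labels.append("pre-existing" if dists and min(dists) <= max_dist_mm
                      else "new")
    return labels

time1 = [(10.0, 20.0, 30.0), (40.0, 40.0, 15.0)]   # prior-CT centroids (mm)
time2 = [(10.5, 20.2, 30.1),                        # near a Time 1 nodule
         (80.0, 12.0, 44.0)]                        # no prior counterpart
print(match_nodules(time1, time2))  # ['pre-existing', 'new']
```

Nodules labeled "pre-existing" would then be dropped from the metastasis candidates, which is how false-metastasis candidates per scan decrease without touching sensitivity for truly new lesions.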

3.
Sci Rep ; 13(1): 5337, 2023 04 01.
Article in English | MEDLINE | ID: mdl-37005429

ABSTRACT

Many human organs exist in pairs or have a symmetric appearance, and loss of symmetry may indicate pathology; symmetry evaluation on medical images is therefore important and is routinely performed in disease diagnosis and pretreatment evaluation. Accordingly, adding symmetry evaluation to deep learning algorithms that interpret medical images is essential, especially for organs that show significant inter-individual variation but bilateral symmetry within a person, such as mastoid air cells. In this study, we developed a deep learning algorithm that detects bilateral mastoid abnormalities simultaneously on mastoid anterior-posterior (AP) views with symmetry evaluation. The developed algorithm showed better diagnostic performance in diagnosing mastoiditis on mastoid AP views than an algorithm trained with single-side mastoid radiographs without symmetry evaluation, and performance similar or superior to that of head and neck radiologists. These results show the feasibility of evaluating symmetry in medical images with deep learning algorithms.


Subject(s)
Deep Learning , Mastoiditis , Humans , Mastoiditis/diagnostic imaging , Mastoid/diagnostic imaging , Radiography , Algorithms , Retrospective Studies
4.
Comput Med Imaging Graph ; 107: 102220, 2023 07.
Article in English | MEDLINE | ID: mdl-37023509

ABSTRACT

Steno-occlusive lesions in intracranial arteries refer to segments of narrowed or occluded blood vessels that increase the risk of ischemic stroke. Detecting steno-occlusive lesions is crucial in clinical settings; however, automatic detection methods have rarely been studied. Therefore, we propose a novel automatic method to detect steno-occlusive lesions in sequential transverse slices on time-of-flight magnetic resonance angiography. Our method simultaneously detects lesions while segmenting blood vessels through end-to-end multi-task learning, reflecting that the lesions are closely related to the connectivity of blood vessels. We design classification and localization modules that can be attached to an arbitrary segmentation network. As blood vessels are segmented, both modules simultaneously predict the presence and location of lesions for each transverse slice. By combining the outputs of the two modules, we devise a simple operation that boosts lesion localization performance. Experimental results show that lesion prediction and localization performance improves when blood vessel extraction is incorporated. Our ablation study demonstrates that the proposed operation enhances lesion localization accuracy. We also verify the effectiveness of multi-task learning by comparing our approach with methods that individually detect lesions using extracted blood vessels.
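The abstract mentions a simple operation that combines the classification and localization outputs to boost localization, without specifying it. One plausible sketch, assuming the operation gates each slice's localization map by the slice-level lesion probability; this gating rule is an assumption for illustration, not the paper's definition:

```python
import numpy as np

def sigmoid(x):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + np.exp(-x))

def gated_heatmap(loc_map, cls_logit):
    """Scale a slice's lesion-localization map by the slice-level lesion
    probability from the classification module (one simple combining rule)."""
    return loc_map * sigmoid(cls_logit)

loc = np.array([[0.1, 0.8],
                [0.2, 0.6]])       # per-pixel localization scores for a slice
print(gated_heatmap(loc, 4.0))     # nearly unchanged: slice very likely lesional
print(gated_heatmap(loc, -4.0))    # suppressed: slice unlikely to contain a lesion
```

The intuition is that a confident slice-level "no lesion" prediction suppresses spurious localization peaks, which matches the reported boost in localization performance.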


Subject(s)
Learning , Magnetic Resonance Angiography , Magnetic Resonance Angiography/methods
5.
Eur J Radiol ; 151: 110319, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35452952

ABSTRACT

PURPOSE: To evaluate the usefulness of whole-tumor ADC histogram analysis based on entire tumor volume in determining the histologic grade of soft tissue sarcomas (STSs). METHODS: From January 2015 to December 2020, 53 patients with STS who underwent preoperative magnetic resonance imaging, including diffusion-weighted imaging and ADC maps (b = 0 and 1400 s/mm2), within 1 month before surgical resection were included in the study. Regions of interest were drawn on every section of the ADC map containing tumor and were summated to derive volume-based histogram data of the entire tumor. Histogram parameters were correlated with histologic tumor grade using the Kruskal-Wallis test and compared between high-grade (grade II and III) and low-grade (grade I) STSs using the Mann-Whitney U test. Multivariable logistic regression analysis was applied to identify significant histogram parameters for predicting high-grade STS, and receiver operating characteristic curve analysis was used to determine the optimal threshold and area under the curve (AUC). RESULTS: Eight patients with low-grade STS (15.1%) and 45 with high-grade STS (26.4% [14/53] grade II; 58.5% [31/53] grade III) were included. High-grade STSs showed positive skewness and low-grade STSs showed negative skewness (0.503 vs -0.726, p = .001). High-grade STSs showed a lower mean ADC (p = .03) and lower 5th to 50th percentile values (p ≤ .03) than low-grade STSs. Positive skewness was an independent predictor of high-grade STS (odds ratio: 6.704, p = .002) with 84.4% sensitivity and 87.5% specificity (cut-off value > -0.1757, AUC = 0.842). CONCLUSION: Skewness is the most promising histogram parameter for discriminating high-grade from low-grade STS. The mean ADC value and the lower half of the percentile values are helpful for differentiating high-grade from low-grade STSs.
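As a sketch of the key histogram parameter, sample skewness of the whole-tumor ADC values is the third standardized moment. The cut-off of -0.1757 is the one reported in the abstract; the toy ADC values are invented for illustration:

```python
import numpy as np

def skewness(values):
    """Sample skewness (third standardized moment) of a whole-tumor ADC
    histogram; positive values indicate a tail toward high ADC."""
    x = np.asarray(values, float)
    d = x - x.mean()
    return (d ** 3).mean() / (d ** 2).mean() ** 1.5

# Toy ADC distribution (x10^-3 mm^2/s): low-value bulk with a high-ADC tail
adc = [1.0] * 90 + [3.0] * 10
s = skewness(adc)
print(round(s, 3))                                   # 2.667
print("high-grade" if s > -0.1757 else "low-grade")  # high-grade
```

This uses the population (biased) moment estimator; statistical packages offer bias-corrected variants, but the sign and ordering of skewness values, which is what the grade discrimination relies on, are unaffected in this sketch.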


Subject(s)
Sarcoma , Soft Tissue Neoplasms , Diffusion Magnetic Resonance Imaging/methods , Humans , Magnetic Resonance Imaging , Neoplasm Grading , Retrospective Studies , Sarcoma/diagnostic imaging , Sensitivity and Specificity , Soft Tissue Neoplasms/diagnostic imaging
6.
Quant Imaging Med Surg ; 12(3): 1674-1683, 2022 Mar.
Article in English | MEDLINE | ID: mdl-35284294

ABSTRACT

Background: When assessing the volume of pulmonary nodules on computed tomography (CT) images, there is an inevitable discrepancy between the diameter-based volume calculation and the voxel-counting method, which stems from measuring Euclidean distances on a pixel/voxel-based digital image. We aimed to evaluate the ability of a modified diameter measurement method to reduce this discrepancy, and we determined a conversion equation to equate volumes derived from the different methods. Methods: Two different anthropomorphic phantoms with subsolid and solid nodules were repeatedly scanned under various settings. Nodules in CT images were detected and segmented using a fully automated algorithm, and the volume was calculated using three methods: the voxel-counting method (Vvc), the diameter-based volume calculation (Vd), and a modified diameter-based volume calculation (Vd+1), in which one pixel spacing was added to the diameter along each of the three axes (x-, y-, and z-axis). For each nodule, Vd and Vd+1 were compared to Vvc by computing the absolute percentage error (APE) as follows: APE = 100 × |V - Vvc|/Vvc. Comparisons between APEd and APEd+1 according to CT parameter setting were performed using the Wilcoxon signed-rank test. The Jonckheere-Terpstra test was used to evaluate trends across the four different nodule sizes. Results: The deep learning-based computer-aided diagnosis (DL-CAD) system successfully detected and segmented all nodules in a fully automatic manner. The APE was significantly smaller with Vd+1 than with Vd (Wilcoxon signed-rank test, P<0.05) regardless of CT parameters and nodule size. The median APE increased as nodule size decreased; this trend was statistically significant (Jonckheere-Terpstra test, P<0.001) for both the diameter-based and modified diameter-based volume calculations. 
Conclusions: Our modified diameter-based volume calculation significantly reduces the discrepancy between the diameter-based volume calculation and voxel-counting method.
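A minimal sketch of the three volume comparisons, assuming the diameter-based volume uses the standard ellipsoid formula π/6·dx·dy·dz (consistent with the abstract's modification of adding one pixel spacing per axis); the nodule diameters, pixel spacing, and voxel-counting reference value are invented for illustration:

```python
import math

def ellipsoid_volume(dx, dy, dz):
    """Diameter-based volume: ellipsoid from three axis diameters (mm)."""
    return math.pi / 6.0 * dx * dy * dz

def modified_volume(dx, dy, dz, pixel_spacing):
    """Modified calculation: one pixel spacing added to each diameter."""
    return ellipsoid_volume(dx + pixel_spacing,
                            dy + pixel_spacing,
                            dz + pixel_spacing)

def ape(v, v_vc):
    """Absolute percentage error against the voxel-counting volume."""
    return 100.0 * abs(v - v_vc) / v_vc

# Invented 6 mm nodule on a 0.7 mm grid with a voxel-counting volume of
# 150 mm^3; the modified diameter narrows the gap in this toy case
v_vc = 150.0
print(round(ape(ellipsoid_volume(6.0, 6.0, 6.0), v_vc), 1))      # 24.6
print(round(ape(modified_volume(6.0, 6.0, 6.0, 0.7), v_vc), 1))  # 5.0
```

Adding one pixel spacing per axis compensates for the roughly half-voxel of nodule boundary lost at each edge of a discrete measurement, which is why the modified volume sits closer to the voxel count.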

7.
Eur Radiol ; 31(12): 9408-9417, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34014379

ABSTRACT

OBJECTIVE: To develop a deep learning algorithm capable of evaluating subscapularis tendon (SSC) tears based on axillary lateral shoulder radiography. METHODS: A total of 2,779 axillary lateral shoulder radiographs (performed between February 2010 and December 2018) and the patients' corresponding clinical information (age, sex, dominant side, history of trauma, and degree of pain) were used to develop the deep learning algorithm. The radiographs were labeled based on arthroscopic findings, with the output being the probability of an SSC tear exceeding 50% of the tendon's thickness. The algorithm's performance was evaluated by determining the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, negative predictive value (NPV), and negative likelihood ratio (LR-) at a predefined high-sensitivity cutoff point. Two different test sets were used, with radiographs obtained between January and December 2019; Test Set 1 used arthroscopic findings as the reference standard (n = 340), whereas Test Set 2 used MRI findings as the reference standard (n = 627). RESULTS: The AUCs were 0.83 (95% confidence interval, 0.79-0.88) and 0.82 (95% confidence interval, 0.79-0.86) for Test Sets 1 and 2, respectively. At the high-sensitivity cutoff point, the sensitivity, NPV, and LR- were 91.4%, 90.4%, and 0.21 in Test Set 1, and 90.2%, 89.5%, and 0.21 in Test Set 2, respectively. Gradient-weighted Class Activation Mapping identified the subscapularis insertion site at the lesser tuberosity as the most sensitive region. CONCLUSION: Our deep learning algorithm is capable of assessing SSC tears based on changes at the lesser tuberosity on axillary lateral radiographs with moderate accuracy. KEY POINTS: • We have developed a deep learning algorithm capable of assessing SSC tears based on changes at the lesser tuberosity on axillary lateral radiographs and previous clinical data with moderate accuracy. 
• Our deep learning algorithm could be used as an objective method to initially assess SSC integrity and to identify those who would and would not benefit from further investigation or treatment.


Subject(s)
Deep Learning , Rotator Cuff Injuries , Arthroscopy , Humans , Radiography , Retrospective Studies , Rotator Cuff , Rotator Cuff Injuries/diagnostic imaging
8.
Sci Rep ; 11(1): 8886, 2021 04 26.
Article in English | MEDLINE | ID: mdl-33903629

ABSTRACT

Predicting the risk of cardiovascular disease is the key to primary prevention. Machine learning has attracted attention for analyzing increasingly large, complex healthcare data. We assessed the discrimination and calibration of pre-existing cardiovascular risk prediction models and developed machine learning-based prediction algorithms. This study included 222,998 Korean adults aged 40-79 years who were naïve to lipid-lowering therapy and had no history of cardiovascular disease. Pre-existing models showed moderate to good discrimination in predicting future cardiovascular events (C-statistics 0.70-0.80); the pooled cohort equation (PCE) showed a C-statistic of 0.738. Among the machine learning models (logistic regression, treebag, random forest, adaboost, and neural network), the neural network showed the greatest C-statistic (0.751), which was significantly higher than that of the PCE. It also showed better agreement between predicted risk and observed outcomes (Hosmer-Lemeshow χ2 = 86.1, P < 0.001) than the PCE for whites (Hosmer-Lemeshow χ2 = 171.1, P < 0.001). Similar improvements were observed for the Framingham risk score, systematic coronary risk evaluation, and QRISK3. This study demonstrated that machine learning-based algorithms could improve cardiovascular risk prediction over contemporary risk models in statin-naïve healthy Korean adults without cardiovascular disease. The model can be easily adopted for risk assessment and clinical decision making.
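The calibration comparison above uses the Hosmer-Lemeshow chi-square. A sketch of the standard equal-size-group formulation on simulated data; the decile grouping is the usual convention, and the synthetic risks below are illustrative, not the study's data:

```python
import numpy as np

def hosmer_lemeshow(y_true, y_prob, n_groups=10):
    """Hosmer-Lemeshow chi-square: sort subjects by predicted risk, split
    into equal-size groups, and compare observed vs expected events."""
    y_true = np.asarray(y_true)
    y_prob = np.asarray(y_prob)
    order = np.argsort(y_prob)
    y_true, y_prob = y_true[order], y_prob[order]
    chi2 = 0.0
    for idx in np.array_split(np.arange(len(y_prob)), n_groups):
        n = len(idx)
        obs = y_true[idx].sum()   # observed events in the group
        exp = y_prob[idx].sum()   # expected events in the group
        chi2 += (obs - exp) ** 2 / (exp * (1 - exp / n))
    return chi2

# Simulated risks; outcomes drawn from the risks themselves, i.e. calibrated
rng = np.random.default_rng(0)
p = rng.uniform(0.05, 0.9, size=2000)
y = (rng.uniform(size=2000) < p).astype(int)
print(hosmer_lemeshow(y, p) < hosmer_lemeshow(y, p ** 3))  # True
```

A smaller statistic indicates better calibration, which is the sense in which the neural network's χ² = 86.1 improves on the PCE's 171.1.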


Subject(s)
Cardiovascular Diseases/diagnosis , Machine Learning , Models, Cardiovascular , Adult , Aged , Female , Humans , Male , Middle Aged , Risk Assessment , Risk Factors
9.
J Clin Med ; 10(8)2021 Apr 19.
Article in English | MEDLINE | ID: mdl-33921685

ABSTRACT

A weight-bearing whole-leg radiograph (WLR) is essential to assess lower limb alignment measures such as the weight-bearing line (WBL) ratio. The purpose of this study was to develop a deep learning (DL) model that predicts the WBL ratio from a knee standing anteroposterior (AP) radiograph alone. A total of 3997 knee AP radiographs and WLRs were used. The WBL ratio was used for labeling and for analysis of prediction accuracy, divided into seven categories (0, 0.1, 0.2, 0.3, 0.4, 0.5, and 0.6). After training, the performance of the DL model was evaluated; final performance was assessed using 386 subjects as a test set. The cumulative score (CS), the proportion of cases with an absolute error within 0.1, was chosen as the metric based on the validation set. In the test set, the mean absolute error was 0.054 (95% CI, 0.048-0.061) and the CS was 0.951 (95% CI, 0.924-0.970). The developed DL algorithm could predict the WBL ratio on a knee standing AP radiograph alone, with accuracy comparable to the alignment assessment a primary physician can perform. It can serve as the basis for an automated lower limb alignment assessment tool that can be used easily and cost-effectively in primary clinics.
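The two reported metrics are straightforward to reproduce. A sketch with hypothetical WBL-ratio predictions; the error tolerance of 0.1 matches the abstract, while the values themselves are invented:

```python
def mae(pred, true):
    """Mean absolute error of predicted vs reference WBL ratios."""
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(true)

def cumulative_score(pred, true, tol=0.1):
    """Fraction of cases whose absolute prediction error is within tol."""
    return sum(abs(p - t) <= tol for p, t in zip(pred, true)) / len(true)

true_wbl = [0.3, 0.4, 0.5, 0.2, 0.6]   # WLR-derived reference ratios
pred_wbl = [0.3, 0.5, 0.4, 0.4, 0.6]   # hypothetical model outputs
print(round(mae(pred_wbl, true_wbl), 2))     # 0.08
print(cumulative_score(pred_wbl, true_wbl))  # 0.8
```

In the study's terms, a CS of 0.951 means 95.1% of test cases fell within 0.1 of the WLR-derived reference ratio.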

10.
Diagnostics (Basel) ; 11(2)2021 Feb 05.
Article in English | MEDLINE | ID: mdl-33562764

ABSTRACT

Accurate interpretation of the Waters' and Caldwell view radiographs used for sinusitis screening is challenging. Therefore, we developed a deep learning algorithm for diagnosing frontal, ethmoid, and maxillary sinusitis on both Waters' and Caldwell views. The training and validation set (n = 1403, sinusitis 34.3%) and the test set (n = 132, sinusitis 29.5%) were separated temporally. The algorithm can simultaneously detect and classify each paranasal sinus on both Waters' and Caldwell views without manual cropping. Single- and multi-view models were compared. Our proposed algorithm satisfactorily diagnosed frontal, ethmoid, and maxillary sinusitis on both Waters' and Caldwell views (area under the curve (AUC), 0.71 (95% confidence interval, 0.62-0.80), 0.78 (0.72-0.85), and 0.88 (0.84-0.92), respectively). The one-sided DeLong's test was used to compare the AUCs, and the Obuchowski-Rockette model was used to pool the AUCs of the radiologists. The algorithm yielded a higher AUC than the radiologists for ethmoid and maxillary sinusitis (p = 0.012 and 0.013, respectively). The multi-view model also exhibited a higher AUC than the single Waters' view model for maxillary sinusitis (p = 0.038). Therefore, our algorithm showed diagnostic performance comparable to that of radiologists and enhances the value of radiography as a first-line imaging modality for assessing sinusitis at multiple sites.

11.
J Clin Med ; 9(12)2020 Dec 02.
Article in English | MEDLINE | ID: mdl-33276433

ABSTRACT

We aimed to analyse the CT examinations of the previous screening round (CTprev) in National Lung Screening Trial (NLST) participants with incidence lung cancer and to evaluate the value of deep learning-based computer-aided detection (DL-CAD) in detecting missed lung cancers. Thoracic radiologists reviewed CTprev in participants with incidence lung cancer, and a DL-CAD analysed CTprev according to the NLST criteria and the Lung CT Screening Reporting and Data System (Lung-RADS) classification. We calculated patient-wise and lesion-wise sensitivities of the DL-CAD in detecting missed lung cancers. As per the NLST criteria, 88% (100/113) of CTprev were positive and 74 of them had missed lung cancers. The DL-CAD reported 98% (98/100) of the positive screens as positive and detected 95% (70/74) of the missed lung cancers. As per the Lung-RADS classification, 82% (93/113) of CTprev were positive and 60 of them had missed lung cancers. The DL-CAD reported 97% (90/93) of the positive screens as positive and detected 98% (59/60) of the missed lung cancers. The DL-CAD made false positive calls in 10.3% (27/263) of controls, with 0.16 false positive nodules per scan (41/263). In conclusion, the majority of CTprev in participants with incidence lung cancer had missed lung cancers, and the DL-CAD detected them with high sensitivity and a limited false positive rate.

12.
PLoS One ; 15(11): e0241796, 2020.
Article in English | MEDLINE | ID: mdl-33176335

ABSTRACT

OBJECTIVES: This study aimed to compare the diagnostic performance of a deep learning algorithm trained with a single view (anterior-posterior (AP) or lateral view) with that of an algorithm trained with multiple views (both views together) in diagnosing mastoiditis on mastoid series, and to compare the diagnostic performance between the algorithm and radiologists. METHODS: A total of 9,988 mastoid series (AP and lateral views) were classified as normal or abnormal (mastoiditis) based on radiographic findings. Among them, 792 image sets with temporal bone CT were designated as the gold standard test set, and the remaining sets were randomly divided into training (n = 8,276) and validation (n = 920) sets in a 9:1 ratio for developing a deep learning algorithm. Temporal (n = 294) and geographic (n = 308) external test sets were also collected. The diagnostic performance of the deep learning algorithm trained with a single view was compared with that of the algorithm trained with multiple views. The diagnostic performance of the algorithm and two radiologists was assessed, and inter-observer agreement between the algorithm and the radiologists and between the two radiologists was calculated. RESULTS: The areas under the receiver operating characteristic curve of the algorithm using multiple views (0.971, 0.978, and 0.965 for the gold standard, temporal, and geographic external test sets, respectively) were higher than those using a single view (0.964/0.953, 0.952/0.961, and 0.961/0.942 for the AP/lateral view of the gold standard, temporal external, and geographic external test sets, respectively) in all test sets. The algorithm showed statistically significantly higher specificity than the radiologists (p = 0.018 and 0.012). There was substantial agreement between the algorithm and the two radiologists and between the two radiologists (κ = 0.79, 0.80, and 0.76). CONCLUSION: The deep learning algorithm trained with multiple views showed better performance than that trained with a single view. 
The diagnostic performance of the algorithm for detecting mastoiditis on mastoid series was similar to or higher than that of radiologists.
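The substantial agreement (κ ≈ 0.76-0.80) refers to Cohen's kappa between reader pairs. A sketch of the standard computation on hypothetical normal (0) / mastoiditis (1) calls:

```python
from collections import Counter

def cohens_kappa(r1, r2):
    """Cohen's kappa: chance-corrected agreement between two readers'
    categorical labels."""
    n = len(r1)
    po = sum(a == b for a, b in zip(r1, r2)) / n       # observed agreement
    c1, c2 = Counter(r1), Counter(r2)
    pe = sum(c1[k] * c2[k] for k in c1) / (n * n)      # agreement by chance
    return (po - pe) / (1 - pe)

# Hypothetical calls from two readers on ten mastoid series
reader1 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
reader2 = [1, 1, 0, 0, 0, 0, 1, 0, 1, 0]
print(round(cohens_kappa(reader1, reader2), 2))  # 0.58
```

Values of 0.61-0.80 are conventionally read as "substantial" agreement, which is the band the study's algorithm-radiologist comparisons fall into.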


Subject(s)
Mastoid/pathology , Mastoiditis/diagnosis , Algorithms , Deep Learning , Humans , Mastoid/diagnostic imaging , Mastoiditis/diagnostic imaging , ROC Curve , Retrospective Studies
13.
J Clin Med ; 9(10)2020 Oct 18.
Article in English | MEDLINE | ID: mdl-33080993

ABSTRACT

This study compares the diagnostic performance of deep learning (DL) with that of the radiologists' original reading of the Kellgren-Lawrence (KL) grade and evaluates whether additional patient data can improve the diagnostic performance of DL. From March 2003 to February 2017, 3000 patients with 4366 knee AP radiographs were randomly selected. DL was trained using knee images and clinical information in two stages: in the first stage, DL was trained only with images; in the second stage, it was trained with image data and clinical information. In the test set with image data only, the areas under the receiver operating characteristic curve (AUCs) of the DL algorithm in diagnosing KL 0 to KL 4 were 0.91 (95% confidence interval (CI), 0.88-0.95), 0.80 (95% CI, 0.76-0.84), 0.69 (95% CI, 0.64-0.73), 0.86 (95% CI, 0.83-0.89), and 0.96 (95% CI, 0.94-0.98), respectively. In the test set with image data and additional patient information, the AUCs of the DL algorithm in diagnosing KL 0 to KL 4 were 0.97 (95% confidence interval (CI), 0.71-0.74), 0.85 (95% CI, 0.80-0.86), 0.75 (95% CI, 0.66-0.73), 0.86 (95% CI, 0.79-0.85), and 0.95 (95% CI, 0.91-0.97), respectively. The diagnostic performance with image data and additional patient information showed a statistically significantly higher AUC than image data alone in diagnosing KL 0, 1, and 2 (p-values of 0.008, 0.020, and 0.027, respectively). The diagnostic performance of DL was comparable to that of the radiologists' original reading of the knee osteoarthritis KL grade, and additional patient information improved DL diagnosis of early knee osteoarthritis.

14.
Sci Rep ; 10(1): 4623, 2020 03 12.
Article in English | MEDLINE | ID: mdl-32165702

ABSTRACT

Retinal fundus images are used to detect organ damage from vascular diseases (e.g. diabetes mellitus and hypertension) and to screen for ocular diseases. We aimed to assess convolutional neural network (CNN) models that predict age and sex from retinal fundus images in normal participants and in participants with an underlying systemic vascular-altered status. We also investigated clues regarding the differences between normal ageing and pathologic vascular changes using the CNN models. In this study, we developed CNN age and sex prediction models using 219,302 fundus images from normal participants without hypertension, diabetes mellitus (DM), or any smoking history. The trained models were assessed in four test-sets with 24,366 images from normal participants, 40,659 images from participants with hypertension, 14,189 images from participants with DM, and 113,510 images from smokers. The CNN model accurately predicted age in normal participants; the correlation between predicted age and chronologic age was R2 = 0.92, and the mean absolute error (MAE) was 3.06 years. MAEs in the test-sets with hypertension (3.46 years), DM (3.55 years), and smoking (2.65 years) were similar to that of normal participants; however, R2 values were relatively low (hypertension, R2 = 0.74; DM, R2 = 0.75; smoking, R2 = 0.86). In subgroups of participants over 60 years of age, the MAEs increased to above 4.0 years and the accuracies declined in all test-sets. Fundus-predicted sex demonstrated acceptable accuracy (area under curve > 0.96) in all test-sets. Retinal fundus images from participants with underlying vascular-altered conditions (hypertension, DM, or smoking) showed similar MAEs but low coefficients of determination (R2) between predicted and chronologic age, suggesting that the ageing process and pathologic vascular changes exhibit different features. 
Our models demonstrated improved performance and provided clues to the relationship and differences between ageing and pathologic changes from underlying systemic vascular conditions. In the process of fundus change, systemic vascular diseases are thought to have a different effect from ageing. Research in context. Evidence before this study. The human retina and optic disc continuously change with ageing, and they share physiologic or pathologic characteristics with brain and systemic vascular status. As retinal fundus images provide high-resolution in-vivo images of retinal vessels and parenchyma without any invasive procedure, they have been used to screen ocular diseases and have attracted significant attention as a predictive biomarker for cerebral and systemic vascular diseases. Recently, deep neural networks have revolutionised the field of medical image analysis, including retinal fundus images, and shown reliable results in predicting age, sex, and the presence of cardiovascular diseases. Added value of this study. This is the first study demonstrating how a convolutional neural network (CNN) trained using retinal fundus images from normal participants measures the age of participants with underlying vascular conditions such as hypertension, diabetes mellitus (DM), or a history of smoking, using a large database, SBRIA, which contains 412,026 retinal fundus images from 155,449 participants. Our results indicated that the model accurately predicted age in normal participants, while correlations (coefficient of determination, R2) in test-sets with hypertension, DM, and smoking were relatively low. Additionally, a subgroup analysis indicated that mean absolute errors (MAEs) increased and accuracies declined significantly in subgroups of participants over 60 years of age, in both normal participants and participants with vascular-altered conditions. 
These results suggest that the pathologic retinal vascular changes occurring in systemic vascular diseases differ from the changes of the spontaneous ageing process, and that the ageing process observed in retinal fundus images may saturate at about 60 years of age. Implications of all available evidence. Based on this study and previous reports, CNNs can accurately and reliably predict age and sex using retinal fundus images. The fact that retinal changes caused by ageing and by systemic vascular diseases occur differently motivates a deeper understanding of the retina. Deep learning-based fundus image reading may become a useful tool for screening and diagnosing systemic and ocular diseases after further development.
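The study's two headline measures, the coefficient of determination (R2) and the MAE between predicted and chronologic age, can be sketched as follows; the five participant ages and predictions are invented for illustration:

```python
def age_metrics(pred, true):
    """Coefficient of determination (R^2) and mean absolute error between
    predicted and chronologic ages, the two measures reported above."""
    n = len(true)
    mean_t = sum(true) / n
    ss_res = sum((p - t) ** 2 for p, t in zip(pred, true))
    ss_tot = sum((t - mean_t) ** 2 for t in true)
    mae = sum(abs(p - t) for p, t in zip(pred, true)) / n
    return 1 - ss_res / ss_tot, mae

# Hypothetical predictions for five participants
true_age = [35, 42, 50, 61, 70]
pred_age = [33, 45, 49, 64, 67]
r2, mae = age_metrics(pred_age, true_age)
print(round(r2, 3), round(mae, 1))
```

Note how the two measures can diverge, as in the study: a systematic shift in predictions can leave the MAE similar while lowering R2, which is exactly the pattern seen in the vascular-altered test-sets.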


Subject(s)
Diabetes Mellitus/epidemiology , Fundus Oculi , Hypertension/epidemiology , Retina/diagnostic imaging , Smoking/epidemiology , Adult , Aged , Algorithms , Area Under Curve , Diabetes Mellitus/pathology , Female , Humans , Hypertension/pathology , Image Processing, Computer-Assisted/methods , Male , Middle Aged , Neural Networks, Computer , Public Health Surveillance , ROC Curve , Republic of Korea , Retina/pathology
15.
Eur Radiol ; 30(5): 2843-2852, 2020 May.
Article in English | MEDLINE | ID: mdl-32025834

ABSTRACT

OBJECTIVE: To develop a deep learning algorithm that can rule out significant rotator cuff tear based on conventional shoulder radiographs in patients suspected of rotator cuff tear. METHODS: The algorithm was developed using 6793 shoulder radiograph series performed between January 2015 and June 2018, which were labeled based on ultrasound or MRI conducted within 90 days, and clinical information (age, sex, dominant side, history of trauma, degree of pain). The output was the probability of significant rotator cuff tear (supraspinatus/infraspinatus complex tear with > 50% of tendon thickness). An operating point corresponding to sensitivity of 98% was set to achieve high negative predictive value (NPV) and low negative likelihood ratio (LR-). The performance of the algorithm was tested with 1095 radiograph series performed between July and December 2018. Subgroup analysis using Fisher's exact test was performed to identify factors (clinical information, radiography vendor, advanced imaging modality) associated with negative test results and NPV. RESULTS: Sensitivity, NPV, and LR- were 97.3%, 96.6%, and 0.06, respectively. The deep learning algorithm could rule out significant rotator cuff tear in about 30% of patients suspected of rotator cuff tear. The subgroup analysis showed that age < 60 years (p < 0.001), non-dominant side (p < 0.001), absence of trauma history (p = 0.001), and ultrasound examination (p < 0.001) were associated with negative test results. NPVs were higher in patients with age < 60 years (p = 0.024) and examined with ultrasound (p < 0.001). CONCLUSION: The deep learning algorithm could accurately rule out significant rotator cuff tear based on shoulder radiographs. KEY POINTS: • The deep learning algorithm can rule out significant rotator cuff tear with a negative likelihood ratio of 0.06 and a negative predictive value of 96.6%. 
• The deep learning algorithm can guide patients with significant rotator cuff tear to additional shoulder ultrasound or MRI with a sensitivity of 97.3%. • The deep learning algorithm could rule out significant rotator cuff tear in about 30% of patients with clinically suspected rotator cuff tear.
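The rule-out design above (fix sensitivity near 98%, then report specificity, NPV, and LR-) can be sketched by scanning thresholds on predicted probabilities. The score distributions below are simulated for illustration, not the study's data:

```python
import numpy as np

def rule_out_metrics(y_true, y_score, target_sens=0.98):
    """Take the highest score threshold whose sensitivity reaches the
    target, then report specificity, NPV and negative likelihood ratio
    at that rule-out operating point."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    for t in np.sort(np.unique(y_score))[::-1]:
        pred = y_score >= t
        tp = int(np.sum(pred & (y_true == 1)))
        fn = int(np.sum(~pred & (y_true == 1)))
        sens = tp / (tp + fn)
        if sens >= target_sens:
            tn = int(np.sum(~pred & (y_true == 0)))
            fp = int(np.sum(pred & (y_true == 0)))
            spec = tn / (tn + fp)
            npv = tn / (tn + fn) if tn + fn else float("nan")
            lr_minus = (1 - sens) / spec if spec else float("inf")
            return t, sens, spec, npv, lr_minus
    return None

# Simulated scores: tears (label 1) score higher than intact shoulders
rng = np.random.default_rng(7)
y = np.r_[np.ones(200, dtype=int), np.zeros(800, dtype=int)]
s = np.r_[rng.normal(0.75, 0.15, 200), rng.normal(0.35, 0.15, 800)]
t, sens, spec, npv, lr_minus = rule_out_metrics(y, s)
print(f"sens={sens:.3f} spec={spec:.2f} npv={npv:.3f} LR-={lr_minus:.3f}")
```

Patients scoring below the chosen threshold are the "ruled out" group; the study's ~30% rule-out fraction corresponds to the proportion of cases falling below its 98%-sensitivity operating point.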


Subject(s)
Deep Learning , Radiographic Image Interpretation, Computer-Assisted/methods , Radiography/methods , Rotator Cuff Injuries/diagnostic imaging , Female , Humans , Male , Middle Aged , Reproducibility of Results , Retrospective Studies , Rotator Cuff/diagnostic imaging , Sensitivity and Specificity
16.
AJR Am J Roentgenol ; 213(1): 155-162, 2019 Jul.
Article in English | MEDLINE | ID: mdl-30917021

ABSTRACT

OBJECTIVE. The objective of our study was to compare the sensitivity of a deep learning (DL) algorithm with the assessments by radiologists in diagnosing osteonecrosis of the femoral head (ONFH) using digital radiography. MATERIALS AND METHODS. We performed a two-center, retrospective, noninferiority study of consecutive patients (≥ 16 years old) with a diagnosis of ONFH based on MR images. We investigated the following four datasets of unilaterally cropped hip anteroposterior radiographs: training (n = 1346), internal validation (n = 148), temporal external test (n = 148), and geographic external test (n = 250). Diagnostic performance was measured for a DL algorithm, a less experienced radiologist, and an experienced radiologist. Noninferiority analyses for sensitivity were performed for the DL algorithm and both radiologists. Subgroup analysis for precollapse and postcollapse ONFH was done. RESULTS. Overall, 1892 hips (1037 diseased and 855 normal) were included. Sensitivity and specificity for the temporal external test set were 84.8% and 91.3% for the DL algorithm, 77.6% and 100.0% for the less experienced radiologist, and 82.4% and 100.0% for the experienced radiologist. Sensitivity and specificity for the geographic external test set were 75.2% and 97.2% for the DL algorithm, 77.6% and 75.0% for the less experienced radiologist, and 78.0% and 86.1% for the experienced radiologist. The sensitivity of the DL algorithm was noninferior to that of the assessments by both radiologists. The DL algorithm was more sensitive for precollapse ONFH than the assessment by the less experienced radiologist in the temporal external test set (75.9% vs 57.4%; 95% CI of the difference, 4.5-32.8%). CONCLUSION. The sensitivity of the DL algorithm for diagnosing ONFH using digital radiography was noninferior to that of both less experienced and experienced radiologist assessments.
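The noninferiority claim above rests on the confidence interval for the difference in sensitivities lying above a prespecified margin. A simplified sketch using an unpaired Wald interval — the study's actual analysis (readings of the same hips) would need a paired variance estimate, and the counts and 10% margin here are assumptions for illustration only:

```python
import math

def noninferiority_lower_bound(tp_new, n_new, tp_ref, n_ref, z=1.96):
    """Lower 95% Wald bound for (sensitivity_new - sensitivity_reference),
    treating the two samples as independent."""
    p1, p2 = tp_new / n_new, tp_ref / n_ref
    se = math.sqrt(p1 * (1 - p1) / n_new + p2 * (1 - p2) / n_ref)
    return (p1 - p2) - z * se

def is_noninferior(tp_new, n_new, tp_ref, n_ref, margin=0.10):
    """Noninferior if the whole CI for the difference sits above -margin."""
    return noninferiority_lower_bound(tp_new, n_new, tp_ref, n_ref) > -margin
```

The test asks a weaker question than superiority: not "is the algorithm better?" but "can we rule out that it is worse by more than the margin?".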

17.
EBioMedicine ; 40: 636-642, 2019 Feb.
Article in English | MEDLINE | ID: mdl-30598372

ABSTRACT

BACKGROUND: Recently, innovative attempts have been made to identify moyamoya disease (MMD) by focusing on the morphological differences in the head of MMD patients. Following the recent revolution in the development of deep learning (DL) algorithms, we designed this study to determine whether DL can distinguish MMD in plain skull radiograph images. METHODS: Three hundred forty-five skull images were collected as an MMD-labeled dataset from patients aged 18 to 50 years with definite MMD. As a control-labeled dataset, 408 skull images of trauma patients were selected by age and sex matching. Skull images were partitioned into training and test datasets at a 7:3 ratio using permutation. A total of six convolution layers were designed and trained. Accuracy and the area under the receiver operating characteristic curve (AUROC) were evaluated as measures of classifier performance. To identify areas of attention, gradient-weighted class activation mapping was applied. External validation was performed with a new dataset from another hospital. FINDINGS: For the institutional test set, the classifier predicted the true label with 84.1% accuracy. Sensitivity and specificity were both 0.84. AUROC was 0.91. MMD was predicted by attention to the lower face in most cases. Overall accuracy for the external validation dataset was 75.9%. INTERPRETATION: DL can distinguish MMD cases within specific ages from controls in plain skull radiograph images with considerable accuracy and AUROC. The viscerocranium may play a role in MMD-related skull features. FUND: This work was supported by grant no. 18-2018-029 from the Seoul National University Bundang Hospital Research Fund.
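The AUROC reported for this classifier has a direct probabilistic reading: the chance that a randomly chosen MMD skull receives a higher score than a randomly chosen control. A minimal rank-based sketch of that computation (the scores below are made up for illustration):

```python
def auroc(pos_scores, neg_scores):
    """AUROC as the Mann-Whitney statistic: the fraction of
    (positive, negative) pairs in which the positive case scores
    higher, with ties counting as half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))
```

An AUROC of 0.91, as reported above, therefore means the model ranks a diseased case above a control about 91% of the time, regardless of any single decision threshold.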


Subject(s)
Machine Learning , Moyamoya Disease/diagnosis , Neural Networks, Computer , Radiography , Skull/diagnostic imaging , Adult , Algorithms , Data Interpretation, Statistical , Female , Humans , Image Processing, Computer-Assisted , Male , Middle Aged , ROC Curve , Radiography/methods , Reproducibility of Results , Young Adult
18.
Invest Radiol ; 54(1): 7-15, 2019 01.
Article in English | MEDLINE | ID: mdl-30067607

ABSTRACT

OBJECTIVES: The aim of this study was to compare the diagnostic performance of a deep learning algorithm with that of radiologists in diagnosing maxillary sinusitis on Waters' view radiographs. MATERIALS AND METHODS: Among 80,475 Waters' view radiographs examined between May 2003 and February 2017, 9000 randomly selected cases were classified as normal or maxillary sinusitis based on radiographic findings and divided into training (n = 8000) and validation (n = 1000) sets to develop a deep learning algorithm. Two test sets composed of Waters' view radiographs with concurrent paranasal sinus computed tomography were labeled based on computed tomography findings: one with temporal separation (n = 140) and the other with geographic separation (n = 200) from the training set. Area under the receiver operating characteristic curve (AUC), sensitivity, and specificity of the algorithm and 5 radiologists were assessed. Interobserver agreement between the algorithm and the majority decision of the radiologists was measured. The correlation coefficient between the predicted probability of the algorithm and the average confidence level of the radiologists was determined. RESULTS: The AUCs of the deep learning algorithm were 0.93 and 0.88 for the temporal and geographic external test sets, respectively. The AUCs of the radiologists were 0.83 to 0.89 for the temporal and 0.75 to 0.84 for the geographic external test sets. The deep learning algorithm showed a statistically significantly higher AUC than the radiologists in both test sets. In terms of sensitivity and specificity, the deep learning algorithm was comparable to the radiologists. A strong interobserver agreement was noted between the algorithm and radiologists (Cohen κ coefficient, 0.82). The correlation coefficients between the predicted probability of the algorithm and the confidence level of the radiologists were 0.89 and 0.84 for the 2 test sets, respectively.
CONCLUSIONS: The deep learning algorithm could diagnose maxillary sinusitis on Waters' view radiograph with superior AUC and comparable sensitivity and specificity to those of radiologists.
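Two of the agreement statistics in this abstract are easy to make concrete: Cohen's kappa (algorithm versus the radiologists' majority decision) and the Pearson correlation between predicted probability and average confidence level. A self-contained sketch with made-up ratings; this is not the study's code:

```python
import math

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two binary raters: observed agreement
    corrected for the agreement expected by chance."""
    n = len(rater_a)
    p_obs = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    pa = sum(rater_a) / n
    pb = sum(rater_b) / n
    p_chance = pa * pb + (1 - pa) * (1 - pb)
    return (p_obs - p_chance) / (1 - p_chance)

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)
```

The chance-correction in kappa is why a value of 0.82 counts as strong agreement: raw percent agreement can look high even when two raters agree only as often as chance predicts.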


Subject(s)
Deep Learning , Maxillary Sinusitis/diagnostic imaging , Radiography/methods , Area Under Curve , Female , Humans , Male , Maxillary Sinus/diagnostic imaging , Middle Aged , ROC Curve , Sensitivity and Specificity , Tomography, X-Ray Computed/methods
19.
BMC Med Imaging ; 18(1): 53, 2018 12 17.
Article in English | MEDLINE | ID: mdl-30558555

ABSTRACT

BACKGROUND: To develop an algorithm that predicts the visually lossless thresholds (VLTs) of CT images for JPEG2000 compression solely from the original images, by exploiting image features and DICOM header information, and to evaluate the algorithm in comparison with pre-existing image fidelity metrics. METHODS: Five radiologists independently determined the VLT for 206 body CT images for JPEG2000 compression using the QUEST procedure. The images were divided into training (n = 103) and testing (n = 103) sets. Using the training set, a multiple linear regression (MLR) model was constructed with the image features and DICOM header information as independent variables and the VLTs determined by the median of the radiologists' responses (VLTrad) as the dependent variable, after determining an optimal subset of independent variables by backward stepwise selection in a cross-validation scheme. Performance was evaluated on the testing set by measuring absolute differences and the intra-class correlation (ICC) coefficient between the VLTrad and the VLTs predicted by the model (VLTmodel). The performance of the model was also compared with that of two metrics, peak signal-to-noise ratio (PSNR) and the high-dynamic range visual difference predictor (HDRVDP). The times for computing VLTs with the MLR model, PSNR, and HDRVDP were compared using repeated-measures ANOVA with post-hoc analysis. P < 0.05 was considered to indicate a statistically significant difference. RESULTS: The mean absolute differences from the VLTrad were 0.58 (95% CI, 0.48, 0.67), 0.73 (0.61, 0.85), and 0.68 (0.58, 0.79) for the MLR model, PSNR, and HDRVDP, respectively, showing a significant difference among them (p < 0.01). The ICC coefficients of the MLR model, PSNR, and HDRVDP were 0.88 (95% CI, 0.81, 0.95), 0.85 (0.79, 0.91), and 0.84 (0.77, 0.91). The computing times for calculating the VLT per image were 1.5 ± 0.1 s, 3.9 ± 0.3 s, and 68.2 ± 1.4 s for the MLR model, PSNR, and HDRVDP, respectively.
CONCLUSIONS: The proposed MLR model, which directly predicts the VLT of a given CT image, showed performance competitive with that of the image fidelity metrics at lower computational expense. The model is promising for adaptive compression of CT images.
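PSNR, one of the two fidelity metrics the MLR model is compared against, is a simple function of the mean squared error between original and compressed pixels. A minimal sketch — the study computed it on full CT images, whereas the toy sequences here only illustrate the formula:

```python
import math

def psnr(original, compressed, max_val=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(MAX^2 / MSE).
    Higher PSNR means the compressed image is closer to the original."""
    mse = sum((o - c) ** 2 for o, c in zip(original, compressed)) / len(original)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)
```

Unlike HDRVDP, which models human visual perception, PSNR is purely pixel-wise; its speed (3.9 s vs 68.2 s per image above) comes from that simplicity.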


Subject(s)
Algorithms , Data Compression/methods , Tomography, X-Ray Computed , Adult , Humans , Linear Models , Radiographic Image Interpretation, Computer-Assisted/methods , Radiography, Abdominal/methods , Reproducibility of Results , Signal-To-Noise Ratio
20.
PLoS One ; 12(12): e0189797, 2017.
Article in English | MEDLINE | ID: mdl-29244853

ABSTRACT

OBJECTIVE: To retrospectively determine the sensitivity of preoperative CT in the detection of small (≤ 10 mm) colorectal liver metastasis (CRLM) nodules in patients undergoing liver resection. METHODS: The institutional review board approved the study and waived informed consent. We included 461 pathologically confirmed CRLM nodules in 211 patients (including 71 women; mean age, 66.4 years) who underwent 229 liver resections following abdominal CT. Prior to 163 resections, gadoxetic acid-enhanced liver MR imaging was also performed. Nodules were matched between pathology reports and prospective CT reports following a predefined algorithm. Per-nodule sensitivity of CT was calculated by nodule-size category. Generalized estimating equations were used to adjust for within-case correlation. RESULTS: Fourteen nodule sizes were missing in the pathology report. Nodules of 1-5 mm and 6-10 mm accounted for 8.1% (n = 36) and 23.5% (n = 105) of the remaining 447 nodules, and the number of nodules gradually decreased as nodule size increased beyond 10 mm. The overall sensitivity of CT was 81.2% (95% confidence interval, 77.1%, 85.2%; 365/461). The sensitivity was 8% (0%, 17%; 3/36), 55% (45%, 65%; 59/105), 91%, 95%, and 100% for nodules of 1-5 mm, 6-10 mm, 11-15 mm, 16-20 mm, and >20 mm, respectively. The nodule-size distribution was similar between resections undergoing gadoxetic acid-enhanced MR imaging and those not undergoing the MR imaging. CONCLUSION: CT has limited sensitivity for nodules of ≤ 10 mm and particularly of ≤ 5 mm.
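Per-nodule sensitivity by size category is a proportion per size bin with a confidence interval. A sketch using the Wilson score interval — the abstract's intervals come from a GEE-adjusted analysis, so these bounds will not reproduce its numbers exactly; the counts are those reported for the two smallest bins:

```python
import math

def wilson_ci(detected, total, z=1.96):
    """Wilson score 95% interval for a detection proportion; better
    behaved than the simple Wald interval at small counts (e.g. 3/36)."""
    p = detected / total
    denom = 1 + z ** 2 / total
    center = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return center - half, center + half

# Counts reported in the abstract for the two smallest size bins
for label, hit, n in [("1-5 mm", 3, 36), ("6-10 mm", 59, 105)]:
    lo, hi = wilson_ci(hit, n)
    print(f"{label}: {hit / n:.0%} ({lo:.0%}-{hi:.0%})")
```

The GEE adjustment in the study additionally accounts for multiple nodules coming from the same patient, which a naive per-nodule interval like this one ignores.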


Subject(s)
Colorectal Neoplasms/diagnostic imaging , Liver Neoplasms/diagnostic imaging , Liver/diagnostic imaging , Neoplasm Metastasis/diagnostic imaging , Aged , Colorectal Neoplasms/pathology , Colorectal Neoplasms/surgery , Contrast Media/therapeutic use , Female , Gadolinium DTPA , Hepatectomy , Humans , Liver/pathology , Liver/surgery , Liver Neoplasms/pathology , Liver Neoplasms/secondary , Liver Neoplasms/surgery , Magnetic Resonance Imaging , Male , Middle Aged , Neoplasm Metastasis/pathology , Preoperative Period