Results 1 - 20 of 29
1.
Korean J Radiol; 24(11): 1151-1163, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37899524

ABSTRACT

OBJECTIVE: To develop a deep learning-based bone age prediction model optimized for Korean children and adolescents and evaluate its feasibility by comparing it with a Greulich-Pyle-based deep learning model. MATERIALS AND METHODS: A convolutional neural network was trained to predict age according to the bone development shown on a hand radiograph (bone age) using 21,036 hand radiographs of Korean children and adolescents without known bone development-affecting diseases/conditions obtained between 1998 and 2019 (median age [interquartile range {IQR}], 9 [7-12] years; male:female, 11,794:9,242) and their chronological ages as labels (Korean model). We constructed two separate external datasets of Korean children and adolescents with healthy bone development (Institution 1: n = 343; median age [IQR], 10 [4-15] years; male:female, 183:160; Institution 2: n = 321; median age [IQR], 9 [5-14] years; male:female, 164:157) to test model performance. The mean absolute error (MAE), root mean square error (RMSE), and proportions of bone age predictions within 6, 12, 18, and 24 months of the reference age (chronological age) were compared between the Korean model and a commercial model (VUNO Med-BoneAge version 1.1; VUNO) trained with Greulich-Pyle-based age as the label (GP-based model). RESULTS: Compared with the GP-based model, the Korean model showed a lower RMSE (11.2 vs. 13.8 months; P = 0.004), a lower MAE (8.2 vs. 10.5 months; P = 0.002), and a higher proportion of bone age predictions within 18 months of chronological age (88.3% vs. 82.2%; P = 0.031) for Institution 1, and a lower MAE (9.5 vs. 11.0 months; P = 0.022) and a higher proportion of predictions within 6 months (44.5% vs. 36.4%; P = 0.044) for Institution 2. CONCLUSION: The Korean model, trained using the chronological ages of Korean children and adolescents without known bone development-affecting diseases/conditions as labels, performed better in bone age assessment than the GP-based model in the Korean pediatric population. Further validation is required to confirm its accuracy.
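
The headline agreement metrics in this abstract are simple to reproduce. Below is a minimal sketch (illustrative arrays, ages in months; not the authors' code) of how MAE, RMSE, and the proportions of predictions within fixed windows of chronological age can be computed:

```python
import numpy as np

def bone_age_agreement(pred_months: np.ndarray, ref_months: np.ndarray) -> dict:
    """MAE, RMSE, and proportion of predictions within fixed windows
    of the reference (chronological) age; both arrays in months."""
    err = pred_months - ref_months
    metrics = {
        "MAE": float(np.mean(np.abs(err))),
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
    }
    for window in (6, 12, 18, 24):  # months, as in the study
        metrics[f"within_{window}m"] = float(np.mean(np.abs(err) <= window))
    return metrics

# Toy usage with made-up numbers:
pred = np.array([100.0, 95.0, 130.0, 88.0])
ref = np.array([96.0, 110.0, 129.0, 80.0])
print(bone_age_agreement(pred, ref))
```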


Subject(s)
Artificial Intelligence; Deep Learning; Adolescent; Humans; Child; Male; Female; Infant; Age Determination by Skeleton; Radiography; Republic of Korea
2.
Korean J Radiol; 24(10): 1038-1041, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37793672
3.
Sci Rep; 13(1): 5934, 2023 Apr 12.
Article in English | MEDLINE | ID: mdl-37045856

ABSTRACT

The identification of abnormal findings in retinal fundus images and the diagnosis of ophthalmic diseases are essential to managing potentially vision-threatening eye conditions. Recently, deep learning-based computer-aided diagnosis (CAD) systems have demonstrated their potential to reduce reading time and discrepancy amongst readers. However, the opaque reasoning of deep neural networks (DNNs) has been a leading cause of reluctance to adopt them clinically as CAD systems. Here, we present a novel architectural and algorithmic design of DNNs to comprehensively identify 15 abnormal retinal findings and diagnose 8 major ophthalmic diseases from macula-centered fundus images with accuracy comparable to that of experts. We then define a notion of counterfactual attribution ratio (CAR) which illuminates the system's diagnostic reasoning, representing how each abnormal finding contributed to its diagnostic prediction. Using CAR, we show that both quantitative and qualitative interpretation and interactive adjustment of the CAD result can be achieved. A comparison of the model's CAR with experts' finding-disease diagnosis correlations confirms that the proposed model relates findings to diseases much as ophthalmologists do.
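
The abstract does not give the CAR formula, so the following is only a plausible sketch of a counterfactual attribution: compare the predicted disease probability with and without one finding's contribution. `predict` and `predict_ablated` are hypothetical hooks, and the paper's actual definition may differ:

```python
def counterfactual_attribution_ratio(model, image, finding_idx, disease_idx):
    """Illustrative CAR: share of a disease's predicted probability that
    disappears when one finding's internal activation is ablated.
    `model.predict` / `model.predict_ablated` are hypothetical hooks."""
    p_orig = model.predict(image)[disease_idx]
    p_ablated = model.predict_ablated(image, finding_idx)[disease_idx]
    return (p_orig - p_ablated) / max(p_orig, 1e-8)
```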


Subject(s)
Deep Learning; Eye Diseases; Humans; Algorithms; Neural Networks, Computer; Fundus Oculi; Retina/diagnostic imaging
4.
Endocrinol Metab (Seoul); 37(4): 674-683, 2022 Aug.
Article in English | MEDLINE | ID: mdl-35927066

ABSTRACT

BACKGROUND: Since image-based fracture prediction models using deep learning are lacking, we aimed to develop an X-ray-based fracture prediction model using deep learning with longitudinal data. METHODS: This study included 1,595 participants aged 50 to 75 years with at least two lumbosacral radiographs without baseline fractures from 2010 to 2015 at Seoul National University Hospital. Positive and negative cases were defined according to whether vertebral fractures developed during follow-up. The cases were divided into training (n=1,416) and test (n=179) sets. A convolutional neural network (CNN)-based prediction algorithm, DeepSurv, was trained with images and baseline clinical information (age, sex, body mass index, glucocorticoid use, and secondary osteoporosis). The concordance index (C-index) was used to compare performance between DeepSurv and the Fracture Risk Assessment Tool (FRAX) and Cox proportional hazards (CoxPH) models. RESULTS: Of the total participants, 1,188 (74.4%) were women, and the mean age was 60.5 years. During a mean follow-up period of 40.7 months, vertebral fractures occurred in 7.5% (120/1,595) of participants. In the test set, when DeepSurv was trained with images and clinical features, it showed higher performance than FRAX and CoxPH in terms of C-index values (DeepSurv, 0.612; 95% confidence interval [CI], 0.571 to 0.653; FRAX, 0.547; CoxPH, 0.594; 95% CI, 0.552 to 0.555). Notably, the DeepSurv method without clinical features had a higher C-index (0.614; 95% CI, 0.572 to 0.656) than that of FRAX in women. CONCLUSION: DeepSurv, a CNN-based prediction algorithm using baseline image and clinical information, outperformed the FRAX and CoxPH models in predicting osteoporotic fracture from spine radiographs in a longitudinal cohort.
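
The comparison hinges on the concordance index; a self-contained sketch of Harrell's C-index (toy numbers, with higher predicted risk expected for earlier fracture) looks like this:

```python
import numpy as np

def concordance_index(times, events, risks):
    """Harrell's C-index. A pair (i, j) is comparable when the subject
    with the shorter follow-up experienced the event; it is concordant
    when that subject also has the higher predicted risk. Risk ties
    count as 0.5."""
    times, events, risks = map(np.asarray, (times, events, risks))
    concordant, comparable = 0.0, 0
    for i in range(len(times)):
        if not events[i]:
            continue  # subject i must have the event to anchor a pair
        for j in range(len(times)):
            if times[i] < times[j]:  # j was followed past i's event time
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable

# Toy usage: higher risk should pair with earlier fracture.
print(concordance_index(times=[12, 30, 40, 55], events=[1, 1, 0, 0],
                        risks=[0.9, 0.6, 0.4, 0.2]))  # -> 1.0
```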


Subject(s)
Deep Learning; Osteoporotic Fractures; Spinal Fractures; Algorithms; Bone Density; Female; Humans; Male; Middle Aged; Osteoporotic Fractures/diagnostic imaging; Osteoporotic Fractures/epidemiology; Spinal Fractures/diagnostic imaging; Spinal Fractures/epidemiology; X-Rays
5.
J Digit Imaging; 35(4): 1061-1068, 2022 Aug.
Article in English | MEDLINE | ID: mdl-35304676

ABSTRACT

Algorithms that automatically identify nodular patterns in chest X-ray (CXR) images could benefit radiologists by reducing reading time and improving accuracy. A promising approach is to use deep learning, where a deep neural network (DNN) is trained to classify and localize nodular patterns (including masses) in CXR images. Such algorithms, however, require enough abnormal cases to learn representations of nodular patterns arising in practical clinical settings. Obtaining large amounts of high-quality data is impractical in medical imaging, where (1) acquiring labeled images is extremely expensive, (2) annotations are subject to inaccuracies due to the inherent difficulty of interpreting images, and (3) normal cases occur far more frequently than abnormal cases. In this work, we devise a framework to generate realistic nodules and demonstrate how they can be used to train a DNN to identify and localize nodular patterns in CXR images. While most previous research applying generative models to medical imaging is limited to generating visually plausible abnormalities and using these patterns for augmentation, we go a step further and show how the training algorithm can be adjusted to benefit maximally from synthetic abnormal patterns. A high-precision detection model was first developed and tested on internal and external datasets, and the proposed method was shown to enhance the model's recall while maintaining a low level of false positives.
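
The abstract leaves the compositing step open; one plausible training-time augmentation, assuming a generator that already produces nodule patches, is a Gaussian alpha-blended paste such as the sketch below (all names and parameters illustrative):

```python
import numpy as np

def paste_synthetic_nodule(cxr: np.ndarray, nodule: np.ndarray,
                           center: tuple, alpha_sigma: float = 0.35) -> np.ndarray:
    """Blend a generated nodule patch into a normal chest radiograph with
    a Gaussian alpha mask so the edges fade into the background. Assumes
    the patch fits inside the image; the generator producing `nodule` is
    out of scope and this shows only one plausible compositing step."""
    h, w = nodule.shape
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2, (w - 1) / 2
    # Gaussian weights peak at the patch centre and vanish at the border.
    alpha = np.exp(-(((ys - cy) / (alpha_sigma * h)) ** 2 +
                     ((xs - cx) / (alpha_sigma * w)) ** 2))
    out = cxr.astype(np.float32).copy()
    y0, x0 = center[0] - h // 2, center[1] - w // 2
    region = out[y0:y0 + h, x0:x0 + w]
    out[y0:y0 + h, x0:x0 + w] = (1 - alpha) * region + alpha * nodule
    return out
```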


Subject(s)
Neural Networks, Computer; Radiography, Thoracic; Algorithms; Humans; Radiographic Image Interpretation, Computer-Assisted/methods; Radiography; Radiography, Thoracic/methods
6.
Eur Radiol; 31(12): 8947-8955, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34115194

ABSTRACT

OBJECTIVES: Bone age is considered an indicator for the diagnosis of precocious or delayed puberty and a predictor of adult height. We aimed to evaluate the performance of a deep neural network model in assessing rapidly advancing bone age during puberty using elbow radiographs. METHODS: In all, 4,437 anteroposterior and lateral pairs of elbow radiographs were obtained from pubertal individuals from two institutions to implement and validate a deep neural network model. The reference standard bone age was established by five trained researchers using the Sauvegrain method, a scoring system based on the shapes of the lateral condyle, trochlea, olecranon apophysis, and proximal radial epiphysis. A test set (n = 141) was obtained from an external institution. The differences between the assessments of the model and those of the reviewers were compared. RESULTS: The mean absolute difference (MAD) in bone age estimation between the model and the reviewers was 0.15 years on internal validation. In the test set, the MAD between the model and the five experts ranged from 0.19 to 0.30 years. Compared with the reference standard, the MAD was 0.22 years. Interobserver agreement was excellent among reviewers (ICC: 0.99) and between the model and the reviewers (ICC: 0.98). In the subpart analysis, the olecranon apophysis exhibited the highest accuracy (74.5%), followed by the trochlea (73.7%), lateral condyle (73.7%), and radial epiphysis (63.1%). CONCLUSIONS: Assessment of rapidly advancing bone age during puberty on elbow radiographs using our deep neural network model was similar to that of experts. KEY POINTS: • Bone age during puberty is particularly important for patients with scoliosis or limb-length discrepancy to determine the phase of the disease, which influences the timing and method of surgery. • The commonly used hand radiograph-based methods have limitations in assessing bone age during puberty due to the less prominent morphological changes of the hand and wrist bones in this period. • A deep neural network model trained with elbow radiographs exhibited performance similar to that of human experts in estimating rapidly advancing bone age during puberty.
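
Interobserver agreement here is summarized with intraclass correlation coefficients (ICCs); a minimal sketch using the pingouin package, with toy long-format data and the model treated as one more rater, could read:

```python
import pandas as pd
import pingouin as pg

# Long-format table: one row per (case, rater) bone-age estimate.
# The model is treated as just another rater alongside the reviewers.
df = pd.DataFrame({
    "case":  [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "rater": ["R1", "R2", "model"] * 3,
    "bone_age": [11.0, 11.5, 11.2, 13.0, 12.8, 12.9, 9.5, 9.4, 9.6],
})
icc = pg.intraclass_corr(data=df, targets="case", raters="rater",
                         ratings="bone_age")
print(icc[["Type", "ICC"]])  # ICC2/ICC2k rows: two-way random effects
```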


Subject(s)
Age Determination by Skeleton; Elbow; Adult; Elbow/diagnostic imaging; Humans; Infant; Neural Networks, Computer; Puberty; Radiography
7.
Radiology; 299(2): 450-459, 2021 May.
Article in English | MEDLINE | ID: mdl-33754828

ABSTRACT

Background Previous studies assessing the effects of computer-aided detection on observer performance in the reading of chest radiographs used a sequential reading design that may have biased the results because of reading order or recall bias. Purpose To compare observer performance in detecting and localizing major abnormal findings including nodules, consolidation, interstitial opacity, pleural effusion, and pneumothorax on chest radiographs without versus with deep learning-based detection (DLD) system assistance in a randomized crossover design. Materials and Methods This study included normal and abnormal chest radiographs retrospectively collected between January 2016 and December 2017 (https://cris.nih.go.kr/; registration no. KCT0004147). The radiographs were randomized into two groups, and six observers, including thoracic radiologists, interpreted each radiograph without and with use of a commercially available DLD system by using a crossover design with a washout period. Jackknife alternative free-response receiver operating characteristic (JAFROC) figure of merit (FOM), area under the receiver operating characteristic curve (AUC), sensitivity, specificity, false-positive findings per image, and reading times of observers with and without the DLD system were compared by using McNemar and paired t tests. Results A total of 114 normal (mean patient age ± standard deviation, 51 years ± 11; 58 men) and 114 abnormal (mean patient age, 60 years ± 15; 75 men) chest radiographs were evaluated. The radiographs were randomized to two groups: group A (n = 114) and group B (n = 114). Use of the DLD system improved the observers' JAFROC FOM (from 0.90 to 0.95, P = .002), AUC (from 0.93 to 0.98, P = .002), per-lesion sensitivity (from 83% [822 of 990 lesions] to 89.1% [882 of 990 lesions], P = .009), per-image sensitivity (from 80% [548 of 684 radiographs] to 89% [608 of 684 radiographs], P = .009), and specificity (from 89.3% [611 of 684 radiographs] to 96.6% [661 of 684 radiographs], P = .01) and reduced the reading time (from 10-65 seconds to 6-27 seconds, P < .001). The DLD system alone outperformed the pooled observers (JAFROC FOM: 0.96 vs 0.90, respectively, P = .007; AUC: 0.98 vs 0.93, P = .003). Conclusion Observers including thoracic radiologists showed improved performance in the detection and localization of major abnormal findings on chest radiographs and reduced reading time with use of a deep learning-based detection system. © RSNA, 2021 Online supplemental material is available for this article.
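
A full JAFROC analysis requires dedicated tooling (e.g., the RJafroc package), but the per-image side of such a paired comparison can be sketched with AUCs plus a McNemar test on correct/incorrect calls at a fixed operating point (toy data; not the study's analysis code):

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from statsmodels.stats.contingency_tables import mcnemar

# y: 1 = abnormal radiograph; same readers' scores without/with DLD aid.
y = np.array([1, 1, 1, 0, 0, 0, 1, 0])
score_unaided = np.array([0.7, 0.4, 0.8, 0.3, 0.6, 0.2, 0.5, 0.1])
score_aided   = np.array([0.9, 0.7, 0.9, 0.2, 0.4, 0.1, 0.8, 0.1])
print(roc_auc_score(y, score_unaided), roc_auc_score(y, score_aided))

# McNemar's test on paired correct/incorrect calls at threshold 0.5.
correct_unaided = (score_unaided >= 0.5) == y
correct_aided   = (score_aided   >= 0.5) == y
table = [[np.sum(correct_unaided & correct_aided),   np.sum(correct_unaided & ~correct_aided)],
         [np.sum(~correct_unaided & correct_aided), np.sum(~correct_unaided & ~correct_aided)]]
print(mcnemar(table, exact=True).pvalue)
```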


Subject(s)
Deep Learning; Lung Diseases/diagnostic imaging; Radiography, Thoracic/methods; Cross-Over Studies; Female; Humans; Male; Middle Aged; Observer Variation; Republic of Korea; Retrospective Studies; Sensitivity and Specificity
8.
Sci Rep; 11(1): 2876, 2021 Feb 3.
Article in English | MEDLINE | ID: mdl-33536550

ABSTRACT

There have been substantial efforts in using deep learning (DL) to diagnose cancer from digital images of pathology slides. Existing algorithms typically operate by training deep neural networks either specialized to specific cohorts or on an aggregate of all cohorts when only a few images are available for the target cohort. A trade-off between decreasing the number of models and their cancer detection performance was evident in our experiments with The Cancer Genome Atlas dataset, with cohort-specialized models achieving higher performance at the cost of having to acquire large datasets from the cohort of interest. Constructing annotated datasets for individual cohorts is extremely time-consuming, with the acquisition cost of such datasets growing linearly with the number of cohorts. Another issue associated with developing cohort-specific models is the difficulty of maintenance: all cohort-specific models may need to be adjusted when a new DL algorithm is to be used, where training even a single model may require a non-negligible amount of computation, or when more data is added to some cohorts. To resolve the sub-optimal behavior of a universal cancer detection model trained on an aggregate of cohorts, we investigated how cohorts can be grouped to augment a dataset without increasing the number of models linearly with the number of cohorts. This study introduces several metrics which measure the morphological similarities between cohort pairs and demonstrates how the metrics can be used to control the trade-off between performance and the number of models.
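
One way to operationalize the grouping idea, under the assumption (ours, not the paper's stated recipe) that each cohort is embedded as the centroid of its patch features under a shared backbone, is hierarchical clustering of cohort centroids, cutting the tree to trade model count against within-group similarity:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

# Assumed inputs: one embedding centroid per cohort (hypothetical data).
cohorts = ["LUAD", "LUSC", "STAD", "COAD", "READ"]
centroids = np.random.default_rng(0).normal(size=(5, 128))

# Morphological-similarity proxy: cosine distance between centroids.
dist = pdist(centroids, metric="cosine")
tree = linkage(dist, method="average")

# Cut the dendrogram: a looser cut means fewer models per group.
groups = fcluster(tree, t=0.7, criterion="distance")
for cohort, g in zip(cohorts, groups):
    print(cohort, "-> model group", g)
```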


Subject(s)
Datasets as Topic; Deep Learning; Image Processing, Computer-Assisted/methods; Neoplasms/diagnosis; Cohort Studies; Humans; Neoplasms/pathology
9.
Radiology; 299(1): 211-219, 2021 Apr.
Article in English | MEDLINE | ID: mdl-33560190

ABSTRACT

Background Studies on the optimal CT section thickness for detecting subsolid nodules (SSNs) with computer-aided detection (CAD) are lacking. Purpose To assess the effect of CT section thickness on CAD performance in the detection of SSNs and to investigate whether deep learning-based super-resolution algorithms for reducing CT section thickness can improve performance. Materials and Methods CT images with 1-, 3-, and 5-mm section thicknesses were obtained in patients who underwent surgery between March 2018 and December 2018. Patients with resected synchronous SSNs and those without SSNs (negative controls) were retrospectively evaluated. The SSNs, which ranged from 6 to 30 mm, were labeled as ground-truth lesions. A deep learning-based CAD system was applied to SSN detection on CT images of each section thickness, as well as on images converted from 3- and 5-mm into 1-mm section thickness by using the super-resolution algorithm. The CAD performance on each section thickness was evaluated and compared by using the jackknife alternative free-response receiver operating characteristic figure of merit. Results A total of 308 patients (mean age ± standard deviation, 62 years ± 10; 183 women) with 424 SSNs (310 part-solid and 114 nonsolid nodules) and 182 patients without SSNs (mean age, 65 years ± 10; 97 men) were evaluated. The figures of merit differed across the three section thicknesses (0.92, 0.90, and 0.89 for 1, 3, and 5 mm, respectively; P = .04) and between 1- and 5-mm sections (P = .04). The figures of merit varied for nonsolid nodules (0.78, 0.72, and 0.66 for 1, 3, and 5 mm, respectively; P < .001) but not for part-solid nodules (range, 0.93-0.94; P = .76). The super-resolution algorithm improved CAD sensitivity on 3- and 5-mm-thick sections (P = .02 for 3 mm, P < .001 for 5 mm). Conclusion Computer-aided detection (CAD) of subsolid nodules performed better at 1-mm section thickness CT than at 3- and 5-mm section thickness CT, particularly with nonsolid nodules. Application of a super-resolution algorithm improved the sensitivity of CAD at 3- and 5-mm section thickness CT. © RSNA, 2021 Online supplemental material is available for this article. See also the editorial by Goo in this issue.
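
The SR component can be pictured as a residual network mapping thick-slice inputs to thin-slice outputs. The PyTorch skeleton below is only a schematic stand-in; the layer sizes and the three-in/three-out slice mapping are assumptions, not the paper's architecture:

```python
import torch
import torch.nn as nn

class SliceSRNet(nn.Module):
    """Toy through-plane super-resolution block: maps 3 adjacent thick
    slices to 3 thin slices, learning a residual on top of the input."""
    def __init__(self, n_in=3, n_out=3, width=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(n_in, width, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(width, width, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(width, n_out, 3, padding=1),
        )

    def forward(self, thick):            # thick: (B, 3, H, W)
        return thick + self.body(thick)  # residual learning

sr = SliceSRNet()
print(sr(torch.randn(1, 3, 64, 64)).shape)  # -> torch.Size([1, 3, 64, 64])
```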


Subject(s)
Deep Learning; Diagnosis, Computer-Assisted/methods; Lung Neoplasms/diagnostic imaging; Multiple Pulmonary Nodules/diagnostic imaging; Tomography, X-Ray Computed/methods; Aged; Female; Humans; Male; Middle Aged; Radiographic Image Interpretation, Computer-Assisted/methods; Retrospective Studies
10.
Clin Cancer Res; 27(3): 719-728, 2021 Feb 1.
Article in English | MEDLINE | ID: mdl-33172897

ABSTRACT

PURPOSE: Gastric cancer remains the leading cause of cancer-related deaths in Northeast Asia. Population-based endoscopic screenings in the region have yielded successful results in early detection of gastric tumors. Endoscopic screening rates are continuously increasing, and there is a need for an automatic computerized diagnostic system to reduce the diagnostic burden. In this study, we developed an algorithm to classify gastric epithelial tumors automatically and assessed its performance in a large series of gastric biopsies and its benefits as an assistance tool. EXPERIMENTAL DESIGN: Using 2,434 whole-slide images, we developed an algorithm based on convolutional neural networks to classify a gastric biopsy image into one of three categories: negative for dysplasia (NFD), tubular adenoma, or carcinoma. The performance of the algorithm was evaluated by using 7,440 biopsy specimens collected prospectively. The impact of algorithm-assisted diagnosis was assessed by six pathologists using 150 gastric biopsy cases. RESULTS: Diagnostic performance evaluated by the area under the receiver operating characteristic curve (AUROC) in the prospective study was 0.9790 for two-tier classification: negative (NFD) versus positive (all cases except NFD). When limited to epithelial tumors, the sensitivity and specificity were 1.000 and 0.9749. The algorithm-assisted digital image viewer (DV) resulted in a 47% reduction in review time per image compared with the DV alone and a 58% reduction compared with microscopy. CONCLUSIONS: Our algorithm demonstrated high accuracy in classifying epithelial tumors, as well as benefits as an assistance tool, and can serve as a potential screening aid in diagnosing gastric biopsy specimens.
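
The two-tier evaluation collapses the three-class output into negative versus positive; a minimal sketch with toy softmax outputs:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Softmax outputs over [NFD, tubular adenoma, carcinoma] (toy values).
probs = np.array([[0.90, 0.07, 0.03],
                  [0.20, 0.70, 0.10],
                  [0.05, 0.15, 0.80],
                  [0.85, 0.10, 0.05]])
labels = np.array([0, 1, 2, 0])  # ground-truth class indices

# Two-tier collapse: positive = anything except NFD.
score_positive = probs[:, 1] + probs[:, 2]
y_positive = (labels != 0).astype(int)
print(roc_auc_score(y_positive, score_positive))  # -> 1.0 on this toy set
```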


Subject(s)
Deep Learning; Gastric Mucosa/pathology; Image Interpretation, Computer-Assisted/methods; Pathologists/statistics & numerical data; Stomach Neoplasms/diagnosis; Adult; Aged; Aged, 80 and over; Biopsy/statistics & numerical data; Feasibility Studies; Female; Gastric Mucosa/diagnostic imaging; Gastroscopy/statistics & numerical data; Humans; Image Interpretation, Computer-Assisted/statistics & numerical data; Male; Middle Aged; Observer Variation; Prospective Studies; Retrospective Studies; Sensitivity and Specificity; Stomach Neoplasms/pathology
12.
Transl Vis Sci Technol; 9(6): 28, 2020 Nov.
Article in English | MEDLINE | ID: mdl-33184590

ABSTRACT

Purpose: To evaluate detection of high coronary artery calcium (CAC) accumulation from retinal fundus images with deep learning technologies as an inexpensive and radiation-free screening method. Methods: Individuals who underwent bilateral retinal fundus imaging and CAC score (CACS) evaluation from coronary computed tomography scans on the same day were identified. With this database, the performance of deep learning algorithms (Inception-v3) in distinguishing high CACS from a CACS of 0 was evaluated at various thresholds for high CACS. Vessel-inpainted and fovea-inpainted images were also used as input to investigate the areas of interest in determining CACS. Results: A total of 44,184 images from 20,130 individuals were included. A deep learning algorithm for discriminating no CAC from CACS >100 achieved an area under the receiver operating characteristic curve (AUROC) of 82.3% (79.5%-85.0%) and 83.2% (80.2%-86.3%) using unilateral and bilateral fundus images, respectively, under a 5-fold cross-validation setting. The AUROC increased as the criterion for high CACS was raised, reaching a plateau at 100 and showing no significant improvement thereafter. The AUROC decreased when the fovea was inpainted and decreased further when vessels were inpainted, whereas it increased when bilateral images were used as input. Conclusions: Deep learning algorithms could distinguish the visual patterns of retinal fundus images in subjects with CACS >100 from those in subjects with no CAC. Exploiting bilateral images improves discrimination performance, and ablation studies removing the retinal vasculature or fovea suggest that the recognizable patterns reside mainly in these areas. Translational Relevance: Retinal fundus images can be used by deep learning algorithms to predict high CACS.
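
The vessel-ablation idea can be reproduced with standard tools; here is a hedged sketch using OpenCV inpainting, assuming a vessel mask from a separate segmentation model (the abstract does not specify the paper's actual inpainting procedure):

```python
import cv2
import numpy as np

def ablate_vessels(fundus_bgr: np.ndarray, vessel_mask: np.ndarray) -> np.ndarray:
    """Remove the retinal vasculature before re-running the CAC classifier.
    `vessel_mask` (uint8, 255 = vessel pixel) is assumed to come from a
    separate vessel segmentation model."""
    return cv2.inpaint(fundus_bgr, vessel_mask, inpaintRadius=3,
                       flags=cv2.INPAINT_TELEA)
```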


Subject(s)
Coronary Vessels; Deep Learning; Algorithms; Coronary Vessels/diagnostic imaging; Fundus Oculi; Humans; Tomography, X-Ray Computed
13.
J Korean Med Sci; 35(42): e379, 2020 Nov 2.
Article in English | MEDLINE | ID: mdl-33140591

ABSTRACT

In recent years, artificial intelligence (AI) technologies have advanced greatly and become a reality in many areas of daily life. In the health care field, numerous efforts are being made to apply AI technology to practical medical treatment. With rapid developments in machine learning algorithms and improvements in hardware performance, AI is expected to play an important role in effectively analyzing and utilizing extensive amounts of health and medical data. However, AI technology has various unique characteristics that differ from existing health care technologies, so several areas of the current health care system need to be supplemented for AI to be utilized more effectively and widely. In addition, acceptance of AI in health care among medical practitioners and the public is still low, and there are various concerns regarding the safety and reliability of AI implementations. Therefore, this paper introduces the current research and application status of AI technology in health care and discusses the issues that need to be resolved.


Subject(s)
Artificial Intelligence; Delivery of Health Care; Government Regulation; Health Policy; Humans; Image Processing, Computer-Assisted; Magnetic Resonance Imaging; Safety Management; Tomography, X-Ray Computed
14.
Ophthalmology; 127(1): 85-94, 2020 Jan.
Article in English | MEDLINE | ID: mdl-31281057

ABSTRACT

PURPOSE: To develop and evaluate deep learning models that screen multiple abnormal findings in retinal fundus images. DESIGN: Cross-sectional study. PARTICIPANTS: For the development and testing of the deep learning models, 309,786 readings from 103,262 images were used. Two additional external datasets (the Indian Diabetic Retinopathy Image Dataset and e-ophtha) were used for testing. A third external dataset (Messidor) was used for comparison of the models with human experts. METHODS: Macula-centered retinal fundus images from the Seoul National University Bundang Hospital Retina Image Archive, obtained at the health screening center and ophthalmology outpatient clinic at Seoul National University Bundang Hospital, were assessed for 12 major findings (hemorrhage, hard exudate, cotton-wool patch, drusen, membrane, macular hole, myelinated nerve fiber, chorioretinal atrophy or scar, any vascular abnormality, retinal nerve fiber layer defect, glaucomatous disc change, and nonglaucomatous disc change) with their regional information using deep learning algorithms. MAIN OUTCOME MEASURES: The area under the receiver operating characteristic curve, and the sensitivity and specificity of the deep learning algorithms at the operating point with the highest harmonic mean of the two, were evaluated and compared with the performance of retina specialists, and visualization of the lesions was qualitatively analyzed. RESULTS: Areas under the receiver operating characteristic curves for all findings were high, at 96.2% to 99.9%, when tested on the in-house dataset. Lesion heatmaps highlighted salient regions effectively across findings. Areas under the receiver operating characteristic curves for diabetic retinopathy-related findings tested on the Indian Diabetic Retinopathy Image Dataset and the e-ophtha dataset were 94.7% to 98.0%. The model demonstrated performance that rivaled that of human experts, especially in the detection of hemorrhage, hard exudate, membrane, macular hole, myelinated nerve fiber, and glaucomatous disc change. CONCLUSIONS: Our deep learning algorithms with region guidance showed reliable performance for detection of multiple findings in macula-centered retinal fundus images. These interpretable, as well as reliable, classification outputs open the possibility of clinical use as an automated screening system for retinal fundus images.
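
The operating-point rule named in the abstract, the threshold with the highest harmonic mean of sensitivity and specificity, can be sketched as follows (toy data):

```python
import numpy as np
from sklearn.metrics import roc_curve

def operating_point_harmonic_mean(y_true, y_score):
    """Pick the threshold maximizing the harmonic mean of sensitivity
    and specificity along the ROC curve."""
    fpr, tpr, thresholds = roc_curve(y_true, y_score)
    sens, spec = tpr, 1 - fpr
    hmean = np.where(sens + spec > 0, 2 * sens * spec / (sens + spec), 0)
    best = int(np.argmax(hmean))
    return thresholds[best], sens[best], spec[best]

y = [0, 0, 1, 1, 1, 0]
s = [0.1, 0.4, 0.35, 0.8, 0.9, 0.2]
print(operating_point_harmonic_mean(y, s))
```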


Subject(s)
Algorithms; Deep Learning; Image Interpretation, Computer-Assisted/methods; Retinal Diseases/diagnostic imaging; Adult; Aged; Aged, 80 and over; Area Under Curve; Cross-Sectional Studies; Datasets as Topic; Female; Fundus Oculi; Humans; Machine Learning; Male; Middle Aged; Neural Networks, Computer; ROC Curve; Sensitivity and Specificity
15.
Eur Radiol; 30(3): 1359-1368, 2020 Mar.
Article in English | MEDLINE | ID: mdl-31748854

ABSTRACT

OBJECTIVE: To investigate the feasibility of a deep learning-based detection (DLD) system for multiclass lesions on chest radiographs, in comparison with observers. METHODS: A total of 15,809 chest radiographs were collected from two tertiary hospitals (7,204 normal and 8,605 abnormal with nodule/mass, interstitial opacity, pleural effusion, or pneumothorax). Except for the test set (100 normal and 100 abnormal (nodule/mass, 70; interstitial opacity, 10; pleural effusion, 10; pneumothorax, 10)), the radiographs were used to develop a DLD system for detecting multiclass lesions. The diagnostic performance of the developed model and that of nine observers with varying experience were evaluated and compared using the area under the receiver operating characteristic curve (AUROC) on a per-image basis and the jackknife alternative free-response receiver operating characteristic figure of merit (FOM) on a per-lesion basis. The false-positive fraction was also calculated. RESULTS: Compared with the group-averaged observations, the DLD system demonstrated significantly higher performance in image-wise normal/abnormal classification and lesion-wise detection with pattern classification (AUROC, 0.985 vs. 0.958; p = 0.001; FOM, 0.962 vs. 0.886; p < 0.001). In lesion-wise detection, the DLD system outperformed all nine observers. In the subgroup analysis, the DLD system exhibited consistently better performance for both nodule/mass (FOM, 0.913 vs. 0.847; p < 0.001) and the other three abnormal classes (FOM, 0.995 vs. 0.843; p < 0.001). The false-positive fraction of all abnormalities was 0.11 for the DLD system and 0.19 for the observers. CONCLUSIONS: The DLD system showed potential for detection of lesions and pattern classification on chest radiographs, performing normal/abnormal classification with high diagnostic performance. KEY POINTS: • The DLD system was feasible for detection with pattern classification of multiclass lesions on chest radiographs. • The DLD system had high performance in image-wise classification of normal versus abnormal chest radiographs (AUROC, 0.985) and showed especially high specificity (99.0%). • In lesion-wise detection of multiclass lesions, the DLD system outperformed all 9 observers (FOM, 0.962 vs. 0.886; p < 0.001).


Subject(s)
Deep Learning; Lung Diseases/diagnostic imaging; Pleural Diseases/diagnostic imaging; Radiography, Thoracic/methods; Adult; Aged; Area Under Curve; Female; Humans; Lung Diseases, Interstitial/diagnostic imaging; Lung Neoplasms/diagnostic imaging; Male; Middle Aged; Pleural Effusion/diagnostic imaging; Pneumothorax/diagnostic imaging; ROC Curve; Radiography; Sensitivity and Specificity; Solitary Pulmonary Nodule/diagnostic imaging
16.
Sci Rep; 9(1): 18738, 2019 Dec 10.
Article in English | MEDLINE | ID: mdl-31822774

ABSTRACT

To investigate the reproducibility of computer-aided detection (CAD) of pulmonary nodules and masses on consecutive chest radiographs (CXRs) of the same patient within a short-term period. A total of 944 CXRs (chest PA) with nodules and masses, recorded between January 2010 and November 2016 at the Asan Medical Center, were obtained. In all, 1,092 regions of interest for the nodules and masses were delineated using in-house software. All CXRs were randomly split 6:2:2 into training, development, and validation sets. Furthermore, paired follow-up CXRs (n = 121) acquired within one week in the validation set, in which expert thoracic radiologists confirmed no changes, were used to evaluate the reproducibility of CAD by two radiologists (R1 and R2). A reproducibility comparison of four different convolutional neural network algorithms and two chest radiologists (with 13 and 14 years' experience) was conducted. Model performances were evaluated by figure-of-merit (FOM) analysis of the jackknife free-response receiver operating curve, and reproducibility rates were evaluated in terms of percent positive agreement (PPA) and Chamberlain's percent positive agreement (CPPA). Reproducibility analysis of the four CADs and R1 and R2 showed variations in PPA and CPPA. The YOLO (You Only Look Once) v2-based eDenseYOLO model showed a higher FOM (0.89; 0.85-0.93) than RetinaNet (0.89; 0.85-0.93) and atrous spatial pyramid pooling U-Net (0.85; 0.80-0.89). eDenseYOLO showed higher PPAs (97.87%) and CPPAs (95.80%) than Mask R-CNN, RetinaNet, ASPP U-Net, R1, and R2 (PPA: 96.52%, 94.23%, 95.04%, 96.55%, and 94.98%; CPPA: 93.18%, 89.09%, 90.57%, 93.33%, and 90.43%). There were moderate variations in the reproducibility of CAD with different algorithms, which likely indicates that measurement of reproducibility is necessary for evaluating CAD performance in actual clinical environments.
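
With a = lesions flagged positive on both paired reads and b, c = flagged on only one read, common definitions are PPA = 2a/(2a + b + c) and Chamberlain's CPPA = a/(a + b + c); the sketch below uses these conventions, though the paper's exact definitions may differ slightly:

```python
def positive_agreement(first_read: list[bool], second_read: list[bool]) -> tuple[float, float]:
    """Positive agreement between paired reads of unchanged follow-up CXRs.
    Returns (PPA, CPPA) under the conventions stated in the lead-in."""
    a = sum(p and q for p, q in zip(first_read, second_read))       # both positive
    b = sum(p and not q for p, q in zip(first_read, second_read))   # first only
    c = sum(q and not p for p, q in zip(first_read, second_read))   # second only
    ppa = 2 * a / (2 * a + b + c)
    cppa = a / (a + b + c)
    return ppa, cppa

print(positive_agreement([True, True, True, False],
                         [True, True, False, False]))  # -> (0.8, 0.666...)
```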


Subject(s)
Radiographic Image Interpretation, Computer-Assisted/methods; Radiography, Thoracic/methods; Aged; Algorithms; Computers; Female; Humans; Image Processing, Computer-Assisted/methods; Lung Neoplasms/diagnostic imaging; Male; Middle Aged; Multiple Pulmonary Nodules/diagnostic imaging; Radiography/methods; Radiologists; Reproducibility of Results; Retrospective Studies; Sensitivity and Specificity; Software; Solitary Pulmonary Nodule/diagnostic imaging; Tomography, X-Ray Computed/methods
17.
Sci Rep; 9(1): 17615, 2019 Nov 26.
Article in English | MEDLINE | ID: mdl-31772195

ABSTRACT

In this study, a deep learning-based method for developing an automated diagnostic support system that detects periodontal bone loss in panoramic dental radiographs is proposed. The presented method, called DeNTNet, not only detects lesions but also provides the corresponding tooth numbers of the lesions according to dental federation notation. DeNTNet applies deep convolutional neural networks (CNNs) using transfer learning and clinical prior knowledge to overcome the morphological variation of the lesions and the imbalanced training dataset. With 12,179 panoramic dental radiographs annotated by experienced dental clinicians, DeNTNet was trained, validated, and tested using 11,189, 190, and 800 radiographs, respectively. Each experimental model was subjected to a comparative study to demonstrate the validity of each phase of the proposed method. When compared with the dental clinicians, DeNTNet achieved an F1 score of 0.75 on the test set, whereas the average performance of the dental clinicians was 0.69.


Subject(s)
Alveolar Bone Loss/diagnostic imaging; Deep Learning; Radiographic Image Interpretation, Computer-Assisted/methods; Radiography, Panoramic; Algorithms; Datasets as Topic; Dental Hygienists; Humans; Observer Variation; Retrospective Studies
18.
Korean J Radiol; 20(10): 1431-1440, 2019 Oct.
Article in English | MEDLINE | ID: mdl-31544368

ABSTRACT

OBJECTIVE: To retrospectively assess the effect of CT slice thickness on the reproducibility of radiomic features (RFs) of lung cancer, and to investigate whether convolutional neural network (CNN)-based super-resolution (SR) algorithms can improve the reproducibility of RFs obtained from images with different slice thicknesses. MATERIALS AND METHODS: CT images with 1-, 3-, and 5-mm slice thicknesses obtained from 100 pathologically proven lung cancers between July 2017 and December 2017 were evaluated. CNN-based SR algorithms using residual learning were developed to convert thick-slice images into 1-mm slices. Lung cancers were semi-automatically segmented, and a total of 702 RFs (tumor intensity, texture, and wavelet features) were extracted from the 1-, 3-, and 5-mm slices, as well as from the 1-mm slices generated from the 3- and 5-mm images. The stability of the RFs was evaluated using concordance correlation coefficients (CCCs). RESULTS: The mean CCCs for the comparisons of original 1-mm vs. 3-mm, 1-mm vs. 5-mm, and 3-mm vs. 5-mm images were 0.41, 0.27, and 0.65, respectively (p < 0.001 for all comparisons). Tumor intensity features showed the best reproducibility, while wavelet features showed the lowest. The majority of RFs failed to achieve reproducibility (CCC ≥ 0.85): only 3.6%, 1.0%, and 21.5%, respectively, met the criterion. After applying the CNN-based SR algorithms, reproducibility improved significantly in all three pairings (mean CCCs: 0.58, 0.45, and 0.72; p < 0.001 for all comparisons). The proportion of reproducible RFs also increased (36.3%, 17.4%, and 36.9%, respectively). CONCLUSION: The reproducibility of RFs in lung cancer is significantly influenced by CT slice thickness, and can be improved by CNN-based SR algorithms.
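
The reproducibility criterion is Lin's concordance correlation coefficient; a small sketch (toy feature values) of the standard formula:

```python
import numpy as np

def lin_ccc(x, y) -> float:
    """Lin's concordance correlation coefficient between a radiomic
    feature measured on two slice thicknesses:
    CCC = 2*cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))**2)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    cov = np.mean((x - x.mean()) * (y - y.mean()))  # population covariance
    return 2 * cov / (x.var() + y.var() + (x.mean() - y.mean()) ** 2)

feat_1mm = np.array([0.82, 0.55, 0.91, 0.40, 0.63])
feat_5mm = np.array([0.75, 0.58, 0.85, 0.47, 0.60])
print(lin_ccc(feat_1mm, feat_5mm))  # CCC >= 0.85 counted as reproducible
```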


Subject(s)
Deep Learning; Image Processing, Computer-Assisted/methods; Lung Neoplasms/diagnostic imaging; Neural Networks, Computer; Tomography, X-Ray Computed/methods; Algorithms; Female; Humans; Lung Neoplasms/diagnosis; Male; Middle Aged; Radiometry/methods; Reproducibility of Results; Retrospective Studies
19.
J Digit Imaging; 32(3): 499-512, 2019 Jun.
Article in English | MEDLINE | ID: mdl-30291477

ABSTRACT

Automatic segmentation of the retinal vasculature and the optic disc is a crucial task for accurate geometric analysis and reliable automated diagnosis. In recent years, Convolutional Neural Networks (CNNs) have shown outstanding performance compared with conventional approaches in segmentation tasks. In this paper, we experimentally measure the performance gain of the Generative Adversarial Network (GAN) framework when applied to these segmentation tasks. We show that GANs achieve statistically significant improvements in area under the receiver operating characteristic curve (AU-ROC) and area under the precision-recall curve (AU-PR) on two public datasets (DRIVE, STARE) by segmenting fine vessels. We also found a model that surpassed the current state-of-the-art method by 0.2-1.0% in AU-ROC, 0.8-1.2% in AU-PR, and 0.5-0.7% in Dice coefficient. In contrast, significant improvements were not observed in the optic disc segmentation task on the DRIONS-DB, RIM-ONE (r3), and Drishti-GS datasets in AU-ROC and AU-PR.
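
A schematic of how the GAN framework attaches to a segmentation network: the segmenter plays the generator, and a discriminator judges (image, mask) pairs. `S` and `D` are assumed externally defined networks; this illustrates the training loop, not the paper's implementation:

```python
import torch
import torch.nn as nn

bce = nn.BCELoss()

def train_step(S, D, opt_S, opt_D, image, gt_mask, adv_weight=0.1):
    """One adversarial training step for vessel segmentation."""
    pred = S(image)  # (B, 1, H, W), probabilities in [0, 1]
    # --- discriminator: real (image, mask) pairs -> 1, predicted -> 0 ---
    opt_D.zero_grad()
    d_real = D(torch.cat([image, gt_mask], dim=1))
    d_fake = D(torch.cat([image, pred.detach()], dim=1))
    loss_D = bce(d_real, torch.ones_like(d_real)) + \
             bce(d_fake, torch.zeros_like(d_fake))
    loss_D.backward()
    opt_D.step()
    # --- segmenter: supervised loss plus fooling the discriminator ---
    opt_S.zero_grad()
    d_fake = D(torch.cat([image, pred], dim=1))
    loss_S = bce(pred, gt_mask) + adv_weight * bce(d_fake, torch.ones_like(d_fake))
    loss_S.backward()
    opt_S.step()
    return loss_S.item(), loss_D.item()
```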


Subject(s)
Image Processing, Computer-Assisted/methods; Neural Networks, Computer; Ophthalmoscopy; Optic Disk/diagnostic imaging; Pattern Recognition, Automated/methods; Retinal Vessels/diagnostic imaging; Humans
20.
J Korean Med Sci; 33(43): e239, 2018 Oct 22.
Article in English | MEDLINE | ID: mdl-30344460

ABSTRACT

BACKGROUND: We described a novel multi-step retinal fundus image reading system for providing high-quality, large-scale data for machine learning algorithms, and assessed grader variability in the large-scale dataset generated with this system. METHODS: A 5-step retinal fundus image reading tool was developed that rates image quality, presence of abnormality, findings with location information, diagnoses, and clinical significance. Each image was evaluated by 3 different graders. Agreement among graders for each decision was evaluated. RESULTS: A total of 234,242 readings of 79,458 images were collected from 55 licensed ophthalmologists over 6 months. Of the images, 34,364 were graded as abnormal by at least one rater. Of these, all three raters agreed on abnormality in 46.6%, while 69.9% of the images were rated as abnormal by two or more raters. The agreement rate of at least two raters on a given finding was 26.7%-65.2%, and the complete agreement rate of all three raters was 5.7%-43.3%. For diagnoses, agreement of at least two raters was 35.6%-65.6%, and the complete agreement rate was 11.0%-40.0%. Agreement on findings and diagnoses was higher when restricted to images with prior complete agreement on abnormality. Retinal/glaucoma specialists showed higher agreement on findings and diagnoses within their corresponding subspecialties. CONCLUSION: This novel reading tool for retinal fundus images generated a large-scale dataset with a high level of information, which can be utilized in the future development of machine learning-based algorithms for automated identification of abnormal conditions and clinical decision support systems. These results emphasize the importance of addressing grader variability in algorithm development.
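
The at-least-two and all-three agreement rates reported here can be computed directly from per-image rating triples; a minimal sketch:

```python
from collections import Counter

def agreement_rates(ratings_per_image: list[tuple]) -> tuple[float, float]:
    """Share of images on which all three graders agree, and on which
    at least two of the three agree, given one (r1, r2, r3) label tuple
    per image."""
    n = len(ratings_per_image)
    all_three = sum(len(set(r)) == 1 for r in ratings_per_image) / n
    at_least_two = sum(max(Counter(r).values()) >= 2
                       for r in ratings_per_image) / n
    return all_three, at_least_two

reads = [("abnormal",) * 3,
         ("abnormal", "normal", "abnormal"),
         ("normal", "abnormal", "abnormal")]
print(agreement_rates(reads))  # -> (0.333..., 1.0)
```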


Subject(s)
Databases, Factual; Machine Learning; Retina/diagnostic imaging; Fundus Oculi; Humans; Republic of Korea