1.
J Pediatr Surg ; 2024 Jun 08.
Article in English | MEDLINE | ID: mdl-38955625

ABSTRACT

BACKGROUND: Radiographic diagnosis of necrotizing enterocolitis (NEC) is challenging. Deep learning models may improve accuracy by recognizing subtle imaging patterns. We hypothesized that such a model would perform with accuracy comparable to that of senior surgical residents. METHODS: This cohort study compiled 494 anteroposterior neonatal abdominal radiographs (214 NEC, 280 other) and randomly divided them into training, validation, and test sets. Transfer learning was used to fine-tune a ResNet-50 deep convolutional neural network (DCNN) pre-trained on ImageNet. Gradient-weighted Class Activation Mapping (Grad-CAM) heatmaps visualized the image regions most relevant to the network's predictions. Senior surgery residents at a single institution examined the test set. The ability of residents and the DCNN to identify pneumatosis on radiographic images was measured via area under the receiver operating characteristic curve (AUROC) and compared using DeLong's method. RESULTS: The network achieved an AUROC of 0.918 (95% CI, 0.837-0.978) and an accuracy of 87.8%, with five false-negative predictions and one false-positive prediction. Heatmaps confirmed that the network emphasized appropriate image regions. Senior surgical residents had a median AUROC of 0.896, ranging from 0.778 (95% CI, 0.615-0.941) to 0.991 (95% CI, 0.971-0.999), with zero to five false-negative and one to eleven false-positive predictions. The DCNN performed comparably to each surgical resident (p > 0.05 for all comparisons). CONCLUSIONS: A DCNN trained to recognize pneumatosis can assist clinicians in promptly and accurately identifying NEC in clinical practice. LEVEL OF EVIDENCE: III (study type: Study of Diagnostic Test, study of nonconsecutive patients without a universally applied "gold standard").
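To make the training setup concrete, here is a minimal sketch of transfer learning from an ImageNet-pretrained ResNet-50 for binary classification, assuming PyTorch/torchvision; the learning rate, optimizer, and training loop are illustrative assumptions, not the authors' configuration.

```python
import torch
import torch.nn as nn
from torchvision import models

# Load a ResNet-50 pre-trained on ImageNet, as described in the study.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)

# Replace the final fully connected layer for binary output (NEC vs. other).
model.fc = nn.Linear(model.fc.in_features, 2)

# Illustrative fine-tuning setup; the study's hyperparameters are not stated.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

def train_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    """One optimization step on a batch of radiographs."""
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```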

2.
J Imaging Inform Med ; 2024 Jun 27.
Article in English | MEDLINE | ID: mdl-38937343

ABSTRACT

As the adoption of artificial intelligence (AI) systems in radiology grows, the increase in demand for greater bandwidth and computational resources can lead to greater infrastructural costs for healthcare providers and AI vendors. To that end, we developed ISLE, an intelligent streaming framework to address inefficiencies in current imaging infrastructures. Our framework draws inspiration from video-on-demand platforms to intelligently stream medical images to AI vendors at an optimal resolution for inference from a single high-resolution copy using progressive encoding. We hypothesize that ISLE can dramatically reduce the bandwidth and computational requirements for AI inference, while increasing throughput (i.e., the number of scans processed by the AI system per second). We evaluate our framework by streaming chest X-rays for classification and abdomen CT scans for liver and spleen segmentation and comparing them with the original versions of each dataset. For classification, our results show that ISLE reduced data transmission and decoding time by at least 92% and 88%, respectively, while increasing throughput by more than 3.72×. For both segmentation tasks, ISLE reduced data transmission and decoding time by at least 82% and 88%, respectively, while increasing throughput by more than 2.9×. In all three tasks, the ISLE-streamed data had no impact on the AI system's diagnostic performance (all P > 0.05). Therefore, our results indicate that our framework can address inefficiencies in current imaging infrastructures by improving data and computational efficiency of AI deployments in the clinical environment without impacting clinical decision-making using AI systems.
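The core idea, serving the coarsest representation that suffices for inference, can be sketched as follows. This is a conceptual simulation only: downsampling stands in for progressive decoding, and the model, input shape, resolution levels, and confidence threshold are assumptions rather than the ISLE implementation.

```python
import torch
import torch.nn.functional as F

def infer_at_optimal_resolution(model: torch.nn.Module,
                                image: torch.Tensor,
                                threshold: float = 0.9):
    """Classify at the coarsest resolution whose prediction is confident,
    so the full-resolution copy need not be transferred or decoded.
    Expects a (1, C, H, W) tensor and a model tolerant of varying input
    sizes (e.g., one using adaptive pooling)."""
    model.eval()
    h, w = image.shape[-2:]
    for scale in (0.25, 0.5, 1.0):  # coarse-to-fine "streaming" levels
        size = (max(1, int(h * scale)), max(1, int(w * scale)))
        level = F.interpolate(image, size=size, mode="bilinear",
                              align_corners=False)
        with torch.no_grad():
            probs = model(level).softmax(dim=-1)
        confidence, label = probs.max(dim=-1)
        if confidence.item() >= threshold or scale == 1.0:
            return label.item(), confidence.item(), scale
```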

4.
J Imaging Inform Med ; 2024 May 06.
Article in English | MEDLINE | ID: mdl-38710971

ABSTRACT

Saliency maps are popularly used to "explain" decisions made by modern machine learning models, including deep convolutional neural networks (DCNNs). While the resulting heatmaps purportedly indicate important image features, their "trustworthiness," i.e., utility and robustness, has not been evaluated for musculoskeletal imaging. The purpose of this study was to systematically evaluate the trustworthiness of saliency maps used in disease diagnosis on upper extremity X-ray images. The underlying DCNNs were trained using the Stanford MURA dataset. We studied four trustworthiness criteria-(1) localization accuracy of abnormalities, (2) repeatability, (3) reproducibility, and (4) sensitivity to underlying DCNN weights-across six different gradient-based saliency methods (Grad-CAM (GCAM), gradient explanation (GRAD), integrated gradients (IG), Smoothgrad (SG), smooth IG (SIG), and XRAI). Ground truth was defined by the consensus of three fellowship-trained musculoskeletal radiologists who each placed bounding boxes around abnormalities on a holdout saliency test set. Compared to radiologists, all saliency methods showed inferior localization (AUPRCs: 0.438 (SG)-0.590 (XRAI); average radiologist AUPRC: 0.816), repeatability (IoUs: 0.427 (SG)-0.551 (IG); average radiologist IoU: 0.613), and reproducibility (IoUs: 0.250 (SG)-0.502 (XRAI); average radiologist IoU: 0.613) on abnormalities such as fractures, orthopedic hardware insertions, and arthritis. Five methods (GCAM, GRAD, IG, SG, XRAI) passed the sensitivity test. Ultimately, no saliency method met all four trustworthiness criteria; therefore, we recommend caution and rigorous evaluation of saliency maps prior to their clinical use.
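For illustration, the repeatability and reproducibility criteria reduce to an intersection-over-union between thresholded heatmaps. A minimal sketch follows; the 95th-percentile binarization threshold is an illustrative assumption, not the study's exact protocol.

```python
import numpy as np

def saliency_iou(map_a: np.ndarray, map_b: np.ndarray,
                 percentile: float = 95.0) -> float:
    """IoU of the most salient regions of two heatmaps, binarized at a
    percentile threshold (the 95th percentile here is illustrative)."""
    a = map_a >= np.percentile(map_a, percentile)
    b = map_b >= np.percentile(map_b, percentile)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0  # both maps empty: trivially identical
    return float(np.logical_and(a, b).sum() / union)
```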

5.
AJR Am J Roentgenol ; 2024 05 08.
Article in English | MEDLINE | ID: mdl-38717241

ABSTRACT

The large language model GPT-4 showed limited utility in generating BI-RADS assessment categories for factitious breast imaging reports containing findings and impression sections. Incorrect BI-RADS category assignments were frequent, and assigned categories showed poor reproducibility across independent tests of the same report using the same prompt.
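The reproducibility test implied here can be sketched as repeated submission of the same report with the same prompt, counting distinct BI-RADS categories returned. In this sketch, the model identifier, prompt wording, and regex parsing are assumptions for illustration, not the study's protocol; it assumes the OpenAI Python client with an API key in the environment.

```python
import re
from collections import Counter
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def birads_reproducibility(report: str, n_trials: int = 5) -> Counter:
    """Submit the same report repeatedly; more than one distinct
    category in the result signals poor reproducibility."""
    prompt = ("Assign a single BI-RADS assessment category (0-6) "
              "to the following breast imaging report:\n\n" + report)
    categories = []
    for _ in range(n_trials):
        resp = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": prompt}],
        )
        match = re.search(r"\b([0-6])\b", resp.choices[0].message.content)
        categories.append(match.group(1) if match else "unparsed")
    return Counter(categories)
```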

6.
AJR Am J Roentgenol ; 2024 Apr 10.
Article in English | MEDLINE | ID: mdl-38598354

ABSTRACT

Large language models (LLMs) hold immense potential to revolutionize radiology. However, their integration into practice requires careful consideration. Artificial intelligence (AI) chatbots and general-purpose LLMs have potential pitfalls related to privacy, transparency, and accuracy, limiting their current clinical readiness. Thus, LLM-based tools must be optimized for radiology practice to overcome these limitations. While research and validation for radiology applications remain in their infancy, commercial products incorporating LLMs are becoming available alongside promises of transforming practice. To help radiologists navigate this landscape, this AJR Expert Panel Narrative Review provides a multidimensional perspective on LLMs, encompassing considerations from bench (development and optimization) to bedside (use in practice). At present, LLMs are not autonomous entities that can replace expert decision-making, and radiologists remain responsible for the content of their reports. Patient-facing tools, particularly medical AI chatbots, require additional guardrails to ensure safety and prevent misuse. Still, if responsibly implemented, LLMs are well-positioned to transform efficiency and quality in radiology. Radiologists must be well-informed and proactively involved in guiding the implementation of LLMs in practice to mitigate risks and maximize benefits to patient care.

9.
NPJ Digit Med ; 7(1): 80, 2024 Mar 26.
Article in English | MEDLINE | ID: mdl-38531952

ABSTRACT

As applications of AI in medicine continue to expand, there is an increasing focus on integration into clinical practice. An underappreciated aspect of this clinical translation is where the AI fits into the clinical workflow, and in turn, the outputs generated by the AI to facilitate clinician interaction in this workflow. For instance, in the canonical use case of AI for medical image interpretation, the AI could prioritize cases before clinician review or even autonomously interpret the images without clinician review. A related aspect is explainability: does the AI generate outputs that help explain its predictions to clinicians? While many clinical AI workflows and explainability techniques have been proposed, a summative assessment of the current scope in clinical practice is lacking. Here, we evaluate the current state of FDA-cleared AI devices for medical image interpretation assistance in terms of intended clinical use, outputs generated, and types of explainability offered. We create a curated database focused on these aspects of the clinician-AI interface, where we find a high frequency of "triage" devices, notable variability in output characteristics across products, and often limited explainability of AI predictions. Altogether, we aim to increase transparency of the current landscape of the clinician-AI interface and highlight the need to rigorously assess which strategies ultimately lead to the best clinical outcomes.

10.
Radiol Artif Intell ; 6(3): e230240, 2024 May.
Article in English | MEDLINE | ID: mdl-38477660

ABSTRACT

Purpose To evaluate the robustness of an award-winning bone age deep learning (DL) model to extensive variations in image appearance. Materials and Methods In December 2021, the DL bone age model that won the 2017 RSNA Pediatric Bone Age Challenge was retrospectively evaluated using the RSNA validation set (1425 pediatric hand radiographs; internal test set in this study) and the Digital Hand Atlas (DHA) (1202 pediatric hand radiographs; external test set). Each test image underwent seven types of transformations (rotations, flips, brightness, contrast, inversion, laterality marker, and resolution) to represent a range of image appearances, many of which simulate real-world variations. Computational "stress tests" were performed by comparing the model's predictions on baseline and transformed images. Mean absolute differences (MADs) of predicted bone ages compared with radiologist-determined ground truth on baseline versus transformed images were compared using Wilcoxon signed rank tests. The proportion of clinically significant errors (CSEs) was compared using McNemar tests. Results There was no evidence of a difference in MAD of the model on the two baseline test sets (RSNA = 6.8 months, DHA = 6.9 months; P = .05), indicating good model generalization to external data. Except for the RSNA dataset images with an appended radiologic laterality marker (P = .86), there were significant differences in MAD for both the DHA and RSNA datasets among other transformation groups (rotations, flips, brightness, contrast, inversion, and resolution). There were significant differences in proportion of CSEs for 57% of the image transformations (19 of 33) performed on the DHA dataset. Conclusion Although an award-winning pediatric bone age DL model generalized well to curated external images, it had inconsistent predictions on images that had undergone simple transformations reflective of several real-world variations in image appearance. Keywords: Pediatrics, Hand, Convolutional Neural Network, Radiography. Supplemental material is available for this article. © RSNA, 2024. See also commentary by Faghani and Erickson in this issue.
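One such stress test can be sketched as a paired comparison of absolute errors on baseline versus transformed images using SciPy's Wilcoxon signed rank test. Here, predict(), the image lists, and ground-truth ages in months are placeholders, not the study's actual pipeline.

```python
import numpy as np
from scipy.stats import wilcoxon

def stress_test(predict, baseline_images, transformed_images,
                ground_truth_months):
    """Return baseline MAD, transformed MAD, and the Wilcoxon P value
    for paired absolute errors of predicted bone age (months)."""
    errors_base = np.abs(
        np.array([predict(im) for im in baseline_images])
        - ground_truth_months)
    errors_tf = np.abs(
        np.array([predict(im) for im in transformed_images])
        - ground_truth_months)
    _, p_value = wilcoxon(errors_base, errors_tf)
    return errors_base.mean(), errors_tf.mean(), p_value
```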


Subject(s)
Age Determination by Skeleton , Deep Learning , Child , Humans , Algorithms , Neural Networks, Computer , Radiography , Retrospective Studies , Age Determination by Skeleton/methods
11.
Radiol Imaging Cancer ; 6(2): e230086, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38305716

ABSTRACT

Purpose To evaluate the use of ChatGPT as a tool to simplify answers to common questions about breast cancer prevention and screening. Materials and Methods In this retrospective, exploratory study, ChatGPT was requested to simplify responses to 25 questions about breast cancer to a sixth-grade reading level in March and August 2023. Simplified responses were evaluated for clinical appropriateness. All original and simplified responses were assessed for reading ease on the Flesch Reading Ease Index and for readability on five scales: Flesch-Kincaid Grade Level, Gunning Fog Index, Coleman-Liau Index, Automated Readability Index, and the Simple Measure of Gobbledygook (ie, SMOG) Index. Mean reading ease, readability, and word count were compared between original and simplified responses using paired t tests. McNemar test was used to compare the proportion of responses with adequate reading ease (score of 60 or greater) and readability (sixth-grade level). Results ChatGPT improved mean reading ease (original responses, 46 vs simplified responses, 70; P < .001) and readability (original, grade 13 vs simplified, grade 8.9; P < .001) and decreased word count (original, 193 vs simplified, 173; P < .001). Ninety-two percent (23 of 25) of simplified responses were considered clinically appropriate. All 25 (100%) simplified responses met criteria for adequate reading ease, compared with only two of 25 original responses (P < .001). Two of the 25 simplified responses (8%) met criteria for adequate readability. Conclusion ChatGPT simplified answers to common breast cancer screening and prevention questions by improving the readability by four grade levels, though the potential to produce incorrect information necessitates physician oversight when using this tool. Keywords: Mammography, Screening, Informatics, Breast, Education, Health Policy and Practice, Oncology, Technology Assessment. Supplemental material is available for this article. © RSNA, 2023.
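The readability scoring and paired comparison could be reproduced along these lines. The textstat package is an assumption about tooling (the study does not state its calculator, and scores can differ slightly across implementations); the paired t test uses SciPy.

```python
import textstat
from scipy.stats import ttest_rel

def readability_profile(text: str) -> dict:
    """Reading ease plus the five grade-level scales used in the study."""
    return {
        "flesch_reading_ease": textstat.flesch_reading_ease(text),
        "flesch_kincaid_grade": textstat.flesch_kincaid_grade(text),
        "gunning_fog": textstat.gunning_fog(text),
        "coleman_liau": textstat.coleman_liau_index(text),
        "ari": textstat.automated_readability_index(text),
        "smog": textstat.smog_index(text),
    }

def compare_reading_ease(originals: list[str], simplified: list[str]):
    """Paired t test on Flesch Reading Ease before vs. after simplification."""
    before = [textstat.flesch_reading_ease(t) for t in originals]
    after = [textstat.flesch_reading_ease(t) for t in simplified]
    return ttest_rel(before, after)
```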


Subject(s)
Breast Neoplasms , Health Literacy , Humans , Female , Breast Neoplasms/diagnosis , Breast Neoplasms/prevention & control , Early Detection of Cancer , Retrospective Studies , Patient-Centered Care
12.
Skeletal Radiol ; 53(8): 1621-1624, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38270616

ABSTRACT

OBJECTIVE: To assess the feasibility of using large language models (LLMs), specifically ChatGPT-4, to generate concise and accurate layperson summaries of musculoskeletal radiology reports. METHODS: Sixty radiology reports, comprising 20 MR shoulder, 20 MR knee, and 20 MR lumbar spine reports, were obtained via PACS. The reports were deidentified and then submitted to ChatGPT-4, with the prompt "Produce an organized and concise layperson summary of the findings of the following radiology report. Target a reading level of 8-9th grade and word count <300 words." Three independent readers (two primary readers and a third added later for validation) evaluated the summaries for completeness and accuracy compared to the original reports. Summaries were rated on a scale of 1 to 3: 1) summaries that were incorrect or incomplete, potentially providing harmful or confusing information; 2) summaries that were mostly correct and complete, unlikely to cause confusion or harm; and 3) summaries that were entirely correct and complete. RESULTS: All 60 responses met the criteria for word count and readability. Mean ratings for accuracy were 2.58 for reader 1, 2.71 for reader 2, and 2.77 for reader 3. Mean ratings for completeness were 2.87 for reader 1, 2.73 for reader 2, and 2.87 for reader 3. For accuracy, reader 1 rated three summaries as a 1, reader 2 rated one, and reader 3 rated none. For the two primary readers, inter-reader agreement was low for both accuracy (kappa 0.33) and completeness (kappa 0.29). Including the third reader's ratings produced no statistically significant change in inter-reader agreement. CONCLUSION: Overall ratings for accuracy and completeness of the AI-generated layperson summaries were high, with only a small minority likely to be confusing or inaccurate. This study illustrates the potential for leveraging generative AI, such as ChatGPT-4, to automate the production of patient-friendly summaries of musculoskeletal MR imaging reports.
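The inter-reader agreement analysis can be sketched with Cohen's kappa from scikit-learn; the ratings arrays below are hypothetical, not study data.

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical 1-3 ratings from two readers for ten summaries.
reader1 = [3, 3, 2, 3, 1, 3, 2, 3, 3, 2]
reader2 = [3, 2, 2, 3, 2, 3, 3, 3, 2, 2]

kappa = cohen_kappa_score(reader1, reader2)
print(f"Inter-reader kappa: {kappa:.2f}")  # ~0.3 indicates fair agreement
```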


Subject(s)
Radiology Information Systems , Humans , Musculoskeletal Diseases/diagnostic imaging , Feasibility Studies , Translating , Comprehension
13.
JAMA ; 331(8): 637-638, 2024 02 27.
Article in English | MEDLINE | ID: mdl-38285439

ABSTRACT

This Viewpoint discusses AI-generated clinical summaries and the necessity of transparent development of standards for their safe rollout.


Subject(s)
Artificial Intelligence , Medical Records , Patient Discharge , Humans , Data Accuracy
14.
Radiol Artif Intell ; 6(1): e230159, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38294324

ABSTRACT

Purpose To compare the effectiveness of weak supervision (ie, with examination-level labels only) and strong supervision (ie, with image-level labels) in training deep learning models for detection of intracranial hemorrhage (ICH) on head CT scans. Materials and Methods In this retrospective study, an attention-based convolutional neural network was trained with either local (ie, image level) or global (ie, examination level) binary labels on the Radiological Society of North America (RSNA) 2019 Brain CT Hemorrhage Challenge dataset of 21 736 examinations (8876 [40.8%] ICH) and 752 422 images (107 784 [14.3%] ICH). The CQ500 (436 examinations; 212 [48.6%] ICH) and CT-ICH (75 examinations; 36 [48.0%] ICH) datasets were employed for external testing. Performance in detecting ICH was compared between weak (examination-level labels) and strong (image-level labels) learners as a function of the number of labels available during training. Results On examination-level binary classification, strong and weak learners did not have different area under the receiver operating characteristic curve values on the internal validation split (0.96 vs 0.96; P = .64) and the CQ500 dataset (0.90 vs 0.92; P = .15). Weak learners outperformed strong ones on the CT-ICH dataset (0.95 vs 0.92; P = .03). Weak learners had better section-level ICH detection performance when more than 10 000 labels were available for training (average F1 = 0.73 vs 0.65; P < .001). Weakly supervised models trained on the entire RSNA dataset required 35 times fewer labels than equivalent strong learners. Conclusion Strongly supervised models did not achieve better performance than weakly supervised ones, which could reduce radiologist labor requirements for prospective dataset curation. Keywords: CT, Head/Neck, Brain/Brain Stem, Hemorrhage. Supplemental material is available for this article. © RSNA, 2023. See also commentary by Wahid and Fuentes in this issue.
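Examination-level (weak) supervision of this kind is often implemented with attention pooling over per-section features, so only one label per study is needed. The following is a minimal sketch under that assumption; the layer sizes and backbone are placeholders, not the challenge model.

```python
import torch
import torch.nn as nn

class ExamLevelICHClassifier(nn.Module):
    """Pools per-section CNN features with learned attention to produce
    a single examination-level logit, trainable from exam-level labels."""

    def __init__(self, backbone: nn.Module, feat_dim: int = 512):
        super().__init__()
        self.backbone = backbone            # per-section feature extractor
        self.attention = nn.Linear(feat_dim, 1)
        self.classifier = nn.Linear(feat_dim, 1)

    def forward(self, sections: torch.Tensor) -> torch.Tensor:
        # sections: (num_sections, C, H, W) for one examination;
        # backbone is assumed to return (num_sections, feat_dim).
        feats = self.backbone(sections)
        weights = self.attention(feats).softmax(dim=0)  # attention over sections
        exam_feat = (weights * feats).sum(dim=0)        # weighted pooling
        return self.classifier(exam_feat)               # exam-level logit
```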


Subject(s)
Deep Learning , Humans , Prospective Studies , Retrospective Studies , Intracranial Hemorrhages/diagnostic imaging , Tomography, X-Ray Computed
15.
AJR Am J Roentgenol ; 222(3): e2330548, 2024 03.
Article in English | MEDLINE | ID: mdl-38170831

ABSTRACT

A multidisciplinary physician team rated information provided by ChatGPT regarding breast pathologic diagnoses. ChatGPT responses were mostly appropriate regarding accuracy, consistency, definitions provided, and clinical significance conveyed. Responses were scored lower in terms of management recommendations provided, primarily related to low agreement with recommendations for high-risk lesions.

16.
AJR Am J Roentgenol ; 222(4): e2330573, 2024 04.
Article in English | MEDLINE | ID: mdl-38230901

ABSTRACT

GPT-4 outperformed a radiology domain-specific natural language processing model in classifying imaging findings from chest radiograph reports, both with and without predefined labels. Prompt engineering for context further improved performance. The findings indicate a role for large language models to accelerate artificial intelligence model development in radiology by automating data annotation.
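As an illustration of annotation with predefined labels, here is a hedged sketch using the OpenAI Python client; the label set, prompt wording, and model identifier are assumptions for illustration, not the study's protocol.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical predefined label set for chest radiograph reports.
LABELS = ["pneumonia", "pleural effusion", "pneumothorax",
          "cardiomegaly", "no acute finding"]

def annotate_report(report_text: str) -> str:
    """Ask the model to classify a report into one predefined label."""
    prompt = (
        "Classify the findings in this chest radiograph report using "
        f"exactly one of these labels: {', '.join(LABELS)}.\n\n"
        f"Report:\n{report_text}\n\nLabel:"
    )
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content.strip()
```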


Subject(s)
Natural Language Processing , Radiography, Thoracic , Humans , Radiography, Thoracic/methods , Radiology Information Systems
17.
Pain Pract ; 24(1): 177-185, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37638532

ABSTRACT

INTRODUCTION: Neuromodulation has emerged as a promising therapy for the management of chronic pain, movement disorders, and other neurological conditions. Spinal cord stimulation (SCS) is a widely used form of neuromodulation that involves the delivery of electrical impulses to the spinal cord to modulate the transmission of pain signals to the brain. In recent years, there has been increasing interest in the use of automation systems to improve the efficacy and safety of SCS. This narrative review summarizes the status of Food and Drug Administration-approved autonomous neuromodulation devices, including closed-loop, feedforward, and feedback systems. The review discusses the advantages and disadvantages of each system and focuses specifically on the use of these systems for SCS. It is important for clinicians to understand the expanding role of automation in neuromodulation so that they can match automation-based therapies to the specific needs of the patient and the underlying condition. CONCLUSION: The review also provides insights into the current state of the art in neuromodulation automation systems and discusses potential future directions for research in this field.


Subject(s)
Chronic Pain , Spinal Cord Stimulation , Humans , Chronic Pain/therapy , Pain Management , Brain , Spinal Cord/physiology
18.
AJR Am J Roentgenol ; 222(3): e2329530, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37436032

ABSTRACT

Artificial intelligence (AI) is increasingly used in clinical practice for musculoskeletal imaging tasks, such as disease diagnosis and image reconstruction. AI applications in musculoskeletal imaging have focused primarily on radiography, CT, and MRI. Although musculoskeletal ultrasound stands to benefit from AI in similar ways, such applications have been relatively underdeveloped. In comparison with other modalities, ultrasound has unique advantages and disadvantages that must be considered in AI algorithm development and clinical translation. Challenges in developing AI for musculoskeletal ultrasound involve both clinical aspects of image acquisition and practical limitations in image processing and annotation. Solutions from other radiology subspecialties (e.g., crowdsourced annotations coordinated by professional societies), along with use cases (most commonly rotator cuff tendon tears and palpable soft-tissue masses), can be applied to musculoskeletal ultrasound to help develop AI. To facilitate creation of high-quality imaging datasets for AI model development, technologists and radiologists should focus on increasing uniformity in musculoskeletal ultrasound performance and increasing annotations of images for specific anatomic regions. This Expert Panel Narrative Review summarizes available evidence regarding AI's potential utility in musculoskeletal ultrasound and challenges facing its development. Recommendations for future AI advancement and clinical translation in musculoskeletal ultrasound are discussed.


Subject(s)
Artificial Intelligence , Tendons , Humans , Ultrasonography , Algorithms , Head
19.
Skeletal Radiol ; 53(3): 445-454, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37584757

ABSTRACT

OBJECTIVE: The purpose of this systematic review was to summarize the results of original research studies evaluating the characteristics and performance of deep learning models for detection of knee ligament and meniscus tears on MRI. MATERIALS AND METHODS: We searched PubMed for original studies, published as of February 2, 2022, that developed and evaluated deep learning models for MRI diagnosis of knee ligament or meniscus tears. We summarized study details according to multiple criteria, including baseline article details, model creation, deep learning details, and model evaluation. RESULTS: Nineteen studies were included, with radiology departments leading publications on deep learning development and implementation for detecting knee injuries on MRI. Across the included studies, reporting was not standardized and development details were inconsistently described. However, all included studies reported consistently high model performance that significantly supplemented human reader performance. CONCLUSION: Our review found that radiology departments have been leading deep learning development for injury detection on knee MRI. Although studies inconsistently described DL model development details, all reported high model performance, indicating great promise for DL in knee MRI analysis.


Subject(s)
Anterior Cruciate Ligament Injuries , Artificial Intelligence , Ligaments, Articular , Meniscus , Humans , Anterior Cruciate Ligament Injuries/diagnostic imaging , Ligaments, Articular/diagnostic imaging , Ligaments, Articular/injuries , Magnetic Resonance Imaging/methods , Meniscus/diagnostic imaging , Meniscus/injuries
20.
J Am Coll Radiol ; 21(2): 248-256, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38072221

ABSTRACT

Radiology is on the verge of a technological revolution driven by artificial intelligence (including large language models), which requires robust computing and storage capabilities, often beyond the capacity of current non-cloud-based informatics systems. The cloud presents a potential solution for radiology, and we should weigh its economic and environmental implications. Recently, cloud technologies have become a cost-effective strategy by providing necessary infrastructure while reducing expenditures associated with hardware ownership, maintenance, and upgrades. Simultaneously, given the optimized energy consumption in modern cloud data centers, this transition is expected to reduce the environmental footprint of radiologic operations. The path to cloud integration comes with its own challenges, and radiology informatics leaders must consider elements such as cloud architectural choices, pricing, data security, uptime service agreements, user training and support, and broader interoperability. With the increasing importance of data-driven tools in radiology, understanding and navigating the cloud landscape will be essential for the future of radiology and its various stakeholders.


Subject(s)
Artificial Intelligence , Radiology , Cloud Computing , Costs and Cost Analysis , Diagnostic Imaging