Pesquisa | Portal Regional da BVS (teste)

1.

ChatGPT's diagnostic performance based on textual vs. visual information compared to radiologists' diagnostic performance in musculoskeletal radiology.

Horiuchi, Daisuke; Tatekawa, Hiroyuki; Oura, Tatsushi; Shimono, Taro; Walston, Shannon L; Takita, Hirotaka; Matsushita, Shu; Mitsuyama, Yasuhito; Miki, Yukio; Ueda, Daiju.

Eur Radiol ; 2024 Jul 12.

Artigo em Inglês | MEDLINE | ID: mdl-38995378

RESUMO

OBJECTIVES: To compare the diagnostic accuracy of Generative Pre-trained Transformer (GPT)-4-based ChatGPT, GPT-4 with vision (GPT-4V) based ChatGPT, and radiologists in musculoskeletal radiology. MATERIALS AND METHODS: We included 106 "Test Yourself" cases from Skeletal Radiology between January 2014 and September 2023. We input the medical history and imaging findings into GPT-4-based ChatGPT and the medical history and images into GPT-4V-based ChatGPT, then both generated a diagnosis for each case. Two radiologists (a radiology resident and a board-certified radiologist) independently provided diagnoses for all cases. The diagnostic accuracy rates were determined based on the published ground truth. Chi-square tests were performed to compare the diagnostic accuracy of GPT-4-based ChatGPT, GPT-4V-based ChatGPT, and radiologists. RESULTS: GPT-4-based ChatGPT significantly outperformed GPT-4V-based ChatGPT (p < 0.001) with accuracy rates of 43% (46/106) and 8% (9/106), respectively. The radiology resident and the board-certified radiologist achieved accuracy rates of 41% (43/106) and 53% (56/106). The diagnostic accuracy of GPT-4-based ChatGPT was comparable to that of the radiology resident, but was lower than that of the board-certified radiologist although the differences were not significant (p = 0.78 and 0.22, respectively). The diagnostic accuracy of GPT-4V-based ChatGPT was significantly lower than those of both radiologists (p < 0.001 and < 0.001, respectively). CONCLUSION: GPT-4-based ChatGPT demonstrated significantly higher diagnostic accuracy than GPT-4V-based ChatGPT. While GPT-4-based ChatGPT's diagnostic performance was comparable to radiology residents, it did not reach the performance level of board-certified radiologists in musculoskeletal radiology. CLINICAL RELEVANCE STATEMENT: GPT-4-based ChatGPT outperformed GPT-4V-based ChatGPT and was comparable to radiology residents, but it did not reach the level of board-certified radiologists in musculoskeletal radiology. Radiologists should comprehend ChatGPT's current performance as a diagnostic tool for optimal utilization. KEY POINTS: This study compared the diagnostic performance of GPT-4-based ChatGPT, GPT-4V-based ChatGPT, and radiologists in musculoskeletal radiology. GPT-4-based ChatGPT was comparable to radiology residents, but did not reach the level of board-certified radiologists. When utilizing ChatGPT, it is crucial to input appropriate descriptions of imaging findings rather than the images.

2.

A deep learning-based model to estimate pulmonary function from chest x-rays: multi-institutional model development and validation study in Japan.

Ueda, Daiju; Matsumoto, Toshimasa; Yamamoto, Akira; Walston, Shannon L; Mitsuyama, Yasuhito; Takita, Hirotaka; Asai, Kazuhisa; Watanabe, Tetsuya; Abo, Koji; Kimura, Tatsuo; Fukumoto, Shinya; Watanabe, Toshio; Takeshita, Tohru; Miki, Yukio.

Lancet Digit Health ; 2024 Jul 08.

Artigo em Inglês | MEDLINE | ID: mdl-38981834

RESUMO

BACKGROUND: Chest x-ray is a basic, cost-effective, and widely available imaging method that is used for static assessments of organic diseases and anatomical abnormalities, but its ability to estimate dynamic measurements such as pulmonary function is unknown. We aimed to estimate two major pulmonary functions from chest x-rays. METHODS: In this retrospective model development and validation study, we trained, validated, and externally tested a deep learning-based artificial intelligence (AI) model to estimate forced vital capacity (FVC) and forced expiratory volume in 1 s (FEV1) from chest x-rays. We included consecutively collected results of spirometry and any associated chest x-rays that had been obtained between July 1, 2003, and Dec 31, 2021, from five institutions in Japan (labelled institutions A-E). Eligible x-rays had been acquired within 14 days of spirometry and were labelled with the FVC and FEV1. X-rays from three institutions (A-C) were used for training, validation, and internal testing, with the testing dataset being independent of the training and validation datasets, and then x-rays from the two other institutions (D and E) were used for independent external testing. Performance for estimating FVC and FEV1 was evaluated by calculating the Pearson's correlation coefficient (r), intraclass correlation coefficient (ICC), mean square error (MSE), root mean square error (RMSE), and mean absolute error (MAE) compared with the results of spirometry. FINDINGS: We included 141 734 x-ray and spirometry pairs from 81 902 patients from the five institutions. The training, validation, and internal test datasets included 134 307 x-rays from 75 768 patients (37 718 [50%] female, 38 050 [50%] male; mean age 56 years [SD 18]), and the external test datasets included 2137 x-rays from 1861 patients (742 [40%] female, 1119 [60%] male; mean age 65 years [SD 17]) from institution D and 5290 x-rays from 4273 patients (1972 [46%] female, 2301 [54%] male; mean age 63 years [SD 17]) from institution E. External testing for FVC yielded r values of 0·91 (99% CI 0·90-0·92) for institution D and 0·90 (0·89-0·91) for institution E, ICC of 0·91 (99% CI 0·90-0·92) and 0·89 (0·88-0·90), MSE of 0·17 L2 (99% CI 0·15-0·19) and 0·17 L2 (0·16-0·19), RMSE of 0·41 L (99% CI 0·39-0·43) and 0·41 L (0·39-0·43), and MAE of 0·31 L (99% CI 0·29-0·32) and 0·31 L (0·30-0·32). External testing for FEV1 yielded r values of 0·91 (99% CI 0·90-0·92) for institution D and 0·91 (0·90-0·91) for institution E, ICC of 0·90 (99% CI 0·89-0·91) and 0·90 (0·90-0·91), MSE of 0·13 L2 (99% CI 0·12-0·15) and 0·11 L2 (0·10-0·12), RMSE of 0·37 L (99% CI 0·35-0·38) and 0·33 L (0·32-0·35), and MAE of 0·28 L (99% CI 0·27-0·29) and 0·25 L (0·25-0·26). INTERPRETATION: This deep learning model allowed estimation of FVC and FEV1 from chest x-rays, showing high agreement with spirometry. The model offers an alternative to spirometry for assessing pulmonary function, which is especially useful for patients who are unable to undergo spirometry, and might enhance the customisation of CT imaging protocols based on insights gained from chest x-rays, improving the diagnosis and management of lung diseases. Future studies should investigate the performance of this AI model in combination with clinical information to enable more appropriate and targeted use. FUNDING: None.

3.

Data set terminology of deep learning in medicine: a historical review and recommendation.

Walston, Shannon L; Seki, Hiroshi; Takita, Hirotaka; Mitsuyama, Yasuhito; Sato, Shingo; Hagiwara, Akifumi; Ito, Rintaro; Hanaoka, Shouhei; Miki, Yukio; Ueda, Daiju.

Jpn J Radiol ; 2024 Jun 10.

Artigo em Inglês | MEDLINE | ID: mdl-38856878

RESUMO

Medicine and deep learning-based artificial intelligence (AI) engineering represent two distinct fields each with decades of published history. The current rapid convergence of deep learning and medicine has led to significant advancements, yet it has also introduced ambiguity regarding data set terms common to both fields, potentially leading to miscommunication and methodological discrepancies. This narrative review aims to give historical context for these terms, accentuate the importance of clarity when these terms are used in medical deep learning contexts, and offer solutions to mitigate misunderstandings by readers from either field. Through an examination of historical documents, including articles, writing guidelines, and textbooks, this review traces the divergent evolution of terms for data sets and their impact. Initially, the discordant interpretations of the word 'validation' in medical and AI contexts are explored. We then show that in the medical field as well, terms traditionally used in the deep learning domain are becoming more common, with the data for creating models referred to as the 'training set', the data for tuning of parameters referred to as the 'validation (or tuning) set', and the data for the evaluation of models as the 'test set'. Additionally, the test sets used for model evaluation are classified into internal (random splitting, cross-validation, and leave-one-out) sets and external (temporal and geographic) sets. This review then identifies often misunderstood terms and proposes pragmatic solutions to mitigate terminological confusion in the field of deep learning in medicine. We support the accurate and standardized description of these data sets and the explicit definition of data set splitting terminologies in each publication. These are crucial methods for demonstrating the robustness and generalizability of deep learning applications in medicine. This review aspires to enhance the precision of communication, thereby fostering more effective and transparent research methodologies in this interdisciplinary field.

4.

Seeking Details on Artificial Intelligence Diagnosis for Acute Otitis Media.

Walston, Shannon L; Sato, Shingo; Ueda, Daiju.

JAMA Pediatr ; 2024 Jun 24.

Artigo em Inglês | MEDLINE | ID: mdl-38913356

5.

Enhancing Self-Supervised Learning for Rare Diseases in OCT.

Walston, Shannon L; Sato, Shingo; Ueda, Daiju.

JAMA Ophthalmol ; 142(7): 688, 2024 Jul 01.

Artigo em Inglês | MEDLINE | ID: mdl-38842901

Assuntos

Doenças Raras , Tomografia de Coerência Óptica , Humanos , Tomografia de Coerência Óptica/métodos , Aprendizado de Máquina Supervisionado

6.

Climate change and artificial intelligence in healthcare: Review and recommendations towards a sustainable future.

Ueda, Daiju; Walston, Shannon L; Fujita, Shohei; Fushimi, Yasutaka; Tsuboyama, Takahiro; Kamagata, Koji; Yamada, Akira; Yanagawa, Masahiro; Ito, Rintaro; Fujima, Noriyuki; Kawamura, Mariko; Nakaura, Takeshi; Matsui, Yusuke; Tatsugami, Fuminari; Fujioka, Tomoyuki; Nozaki, Taiki; Hirata, Kenji; Naganawa, Shinji.

Diagn Interv Imaging ; 2024 Jun 24.

Artigo em Inglês | MEDLINE | ID: mdl-38918123

RESUMO

The rapid advancement of artificial intelligence (AI) in healthcare has revolutionized the industry, offering significant improvements in diagnostic accuracy, efficiency, and patient outcomes. However, the increasing adoption of AI systems also raises concerns about their environmental impact, particularly in the context of climate change. This review explores the intersection of climate change and AI in healthcare, examining the challenges posed by the energy consumption and carbon footprint of AI systems, as well as the potential solutions to mitigate their environmental impact. The review highlights the energy-intensive nature of AI model training and deployment, the contribution of data centers to greenhouse gas emissions, and the generation of electronic waste. To address these challenges, the development of energy-efficient AI models, the adoption of green computing practices, and the integration of renewable energy sources are discussed as potential solutions. The review also emphasizes the role of AI in optimizing healthcare workflows, reducing resource waste, and facilitating sustainable practices such as telemedicine. Furthermore, the importance of policy and governance frameworks, global initiatives, and collaborative efforts in promoting sustainable AI practices in healthcare is explored. The review concludes by outlining best practices for sustainable AI deployment, including eco-design, lifecycle assessment, responsible data management, and continuous monitoring and improvement. As the healthcare industry continues to embrace AI technologies, prioritizing sustainability and environmental responsibility is crucial to ensure that the benefits of AI are realized while actively contributing to the preservation of our planet.

7.

Comparing the Diagnostic Performance of GPT-4-based ChatGPT, GPT-4V-based ChatGPT, and Radiologists in Challenging Neuroradiology Cases.

Horiuchi, Daisuke; Tatekawa, Hiroyuki; Oura, Tatsushi; Oue, Satoshi; Walston, Shannon L; Takita, Hirotaka; Matsushita, Shu; Mitsuyama, Yasuhito; Shimono, Taro; Miki, Yukio; Ueda, Daiju.

Clin Neuroradiol ; 2024 May 28.

Artigo em Inglês | MEDLINE | ID: mdl-38806794

RESUMO

PURPOSE: To compare the diagnostic performance among Generative Pre-trained Transformer (GPT)-4-based ChatGPT, GPT4 with vision (GPT-4V) based ChatGPT, and radiologists in challenging neuroradiology cases. METHODS: We collected 32 consecutive "Freiburg Neuropathology Case Conference" cases from the journal Clinical Neuroradiology between March 2016 and December 2023. We input the medical history and imaging findings into GPT-4-based ChatGPT and the medical history and images into GPT-4V-based ChatGPT, then both generated a diagnosis for each case. Six radiologists (three radiology residents and three board-certified radiologists) independently reviewed all cases and provided diagnoses. ChatGPT and radiologists' diagnostic accuracy rates were evaluated based on the published ground truth. Chi-square tests were performed to compare the diagnostic accuracy of GPT-4-based ChatGPT, GPT-4V-based ChatGPT, and radiologists. RESULTS: GPT4 and GPT-4V-based ChatGPTs achieved accuracy rates of 22% (7/32) and 16% (5/32), respectively. Radiologists achieved the following accuracy rates: three radiology residents 28% (9/32), 31% (10/32), and 28% (9/32); and three board-certified radiologists 38% (12/32), 47% (15/32), and 44% (14/32). GPT-4-based ChatGPT's diagnostic accuracy was lower than each radiologist, although not significantly (all pâ¯> 0.07). GPT-4V-based ChatGPT's diagnostic accuracy was also lower than each radiologist and significantly lower than two board-certified radiologists (pâ¯= 0.02 and 0.03) (not significant for radiology residents and one board-certified radiologist [all pâ¯> 0.09]). CONCLUSION: While GPT-4-based ChatGPT demonstrated relatively higher diagnostic performance than GPT-4V-based ChatGPT, the diagnostic performance of GPT4 and GPT-4V-based ChatGPTs did not reach the performance level of either radiology residents or board-certified radiologists in challenging neuroradiology cases.

8.

Enhancing AI-assisted Interpretation of Chest Radiographs: A Critical Analysis of Methods and Applicability.

Walston, Shannon L; Ueda, Daiju.

Radiology ; 311(2): e233428, 2024 May.

Artigo em Inglês | MEDLINE | ID: mdl-38713026

Assuntos

Inteligência Artificial , Interpretação de Imagem Radiográfica Assistida por Computador , Radiografia Torácica , Humanos , Radiografia Torácica/métodos , Interpretação de Imagem Radiográfica Assistida por Computador/métodos

9.

Evaluating Biases and Quality Issues in Intermodality Image Translation Studies for Neuroradiology: A Systematic Review.

Walston, Shannon L; Tatekawa, Hiroyuki; Takita, Hirotaka; Miki, Yukio; Ueda, Daiju.

AJNR Am J Neuroradiol ; 45(6): 826-832, 2024 06 07.

Artigo em Inglês | MEDLINE | ID: mdl-38663993

RESUMO

BACKGROUND: Intermodality image-to-image translation is an artificial intelligence technique for generating one technique from another. PURPOSE: This review was designed to systematically identify and quantify biases and quality issues preventing validation and clinical application of artificial intelligence models for intermodality image-to-image translation of brain imaging. DATA SOURCES: PubMed, Scopus, and IEEE Xplore were searched through August 2, 2023, for artificial intelligence-based image translation models of radiologic brain images. STUDY SELECTION: This review collected 102 works published between April 2017 and August 2023. DATA ANALYSIS: Eligible studies were evaluated for quality using the Checklist for Artificial Intelligence in Medical Imaging (CLAIM) and for bias using the Prediction model Risk Of Bias ASsessment Tool (PROBAST). Medically-focused article adherence was compared with that of engineering-focused articles overall with the Mann-Whitney U test and for each criterion using the Fisher exact test. DATA SYNTHESIS: Median adherence to the relevant CLAIM criteria was 69% and 38% for PROBAST questions. CLAIM adherence was lower for engineering-focused articles compared with medically-focused articles (65% versus 73%, P < .001). Engineering-focused studies had higher adherence for model description criteria, and medically-focused studies had higher adherence for data set and evaluation descriptions. LIMITATIONS: Our review is limited by the study design and model heterogeneity. CONCLUSIONS: Nearly all studies revealed critical issues preventing clinical application, with engineering-focused studies showing higher adherence for the technical model description but significantly lower overall adherence than medically-focused studies. The pursuit of clinical application requires collaboration from both fields to improve reporting.

Assuntos

Neuroimagem , Humanos , Neuroimagem/métodos , Neuroimagem/normas , Viés , Inteligência Artificial

10.

Deep learning-based diffusion tensor image generation model: a proof-of-concept study.

Tatekawa, Hiroyuki; Ueda, Daiju; Takita, Hirotaka; Matsumoto, Toshimasa; Walston, Shannon L; Mitsuyama, Yasuhito; Horiuchi, Daisuke; Matsushita, Shu; Oura, Tatsushi; Tomita, Yuichiro; Tsukamoto, Taro; Shimono, Taro; Miki, Yukio.

Sci Rep ; 14(1): 2911, 2024 02 05.

Artigo em Inglês | MEDLINE | ID: mdl-38316892

RESUMO

This study created an image-to-image translation model that synthesizes diffusion tensor images (DTI) from conventional diffusion weighted images, and validated the similarities between the original and synthetic DTI. Thirty-two healthy volunteers were prospectively recruited. DTI and DWI were obtained with six and three directions of the motion probing gradient (MPG), respectively. The identical imaging plane was paired for the image-to-image translation model that synthesized one direction of the MPG from DWI. This process was repeated six times in the respective MPG directions. Regions of interest (ROIs) in the lentiform nucleus, thalamus, posterior limb of the internal capsule, posterior thalamic radiation, and splenium of the corpus callosum were created and applied to maps derived from the original and synthetic DTI. The mean values and signal-to-noise ratio (SNR) of the original and synthetic maps for each ROI were compared. The Bland-Altman plot between the original and synthetic data was evaluated. Although the test dataset showed a larger standard deviation of all values and lower SNR in the synthetic data than in the original data, the Bland-Altman plots showed each plot localizing in a similar distribution. Synthetic DTI could be generated from conventional DWI with an image-to-image translation model.

Assuntos

Aprendizado Profundo , Substância Branca , Humanos , Corpo Caloso/diagnóstico por imagem , Razão Sinal-Ruído , Cápsula Interna , Imagem de Difusão por Ressonância Magnética/métodos

11.

Accuracy of ChatGPT generated diagnosis from patient's medical history and imaging findings in neuroradiology cases.

Horiuchi, Daisuke; Tatekawa, Hiroyuki; Shimono, Taro; Walston, Shannon L; Takita, Hirotaka; Matsushita, Shu; Oura, Tatsushi; Mitsuyama, Yasuhito; Miki, Yukio; Ueda, Daiju.

Neuroradiology ; 66(1): 73-79, 2024 Jan.

Artigo em Inglês | MEDLINE | ID: mdl-37994939

RESUMO

PURPOSE: The noteworthy performance of Chat Generative Pre-trained Transformer (ChatGPT), an artificial intelligence text generation model based on the GPT-4 architecture, has been demonstrated in various fields; however, its potential applications in neuroradiology remain unexplored. This study aimed to evaluate the diagnostic performance of GPT-4 based ChatGPT in neuroradiology. METHODS: We collected 100 consecutive "Case of the Week" cases from the American Journal of Neuroradiology between October 2021 and September 2023. ChatGPT generated a diagnosis from patient's medical history and imaging findings for each case. Then the diagnostic accuracy rate was determined using the published ground truth. Each case was categorized by anatomical location (brain, spine, and head & neck), and brain cases were further divided into central nervous system (CNS) tumor and non-CNS tumor groups. Fisher's exact test was conducted to compare the accuracy rates among the three anatomical locations, as well as between the CNS tumor and non-CNS tumor groups. RESULTS: ChatGPT achieved a diagnostic accuracy rate of 50% (50/100 cases). There were no significant differences between the accuracy rates of the three anatomical locations (p = 0.89). The accuracy rate was significantly lower for the CNS tumor group compared to the non-CNS tumor group in the brain cases (16% [3/19] vs. 62% [36/58], p < 0.001). CONCLUSION: This study demonstrated the diagnostic performance of ChatGPT in neuroradiology. ChatGPT's diagnostic accuracy varied depending on disease etiologies, and its diagnostic accuracy was significantly lower in CNS tumors compared to non-CNS tumors.

Assuntos

Inteligência Artificial , Neoplasias , Humanos , Cabeça , Encéfalo , Pescoço

12.

Challenges of using artificial intelligence to detect valvular heart disease from chest radiography - Authors' reply.

Ueda, Daiju; Ehara, Shoichi; Yamamoto, Akira; Walston, Shannon L; Shimono, Taro; Miki, Yukio.

Lancet Digit Health ; 6(1): e10, 2024 Jan.

Artigo em Inglês | MEDLINE | ID: mdl-38123250

Assuntos

Inteligência Artificial , Doenças das Valvas Cardíacas , Humanos , Radiografia , Doenças das Valvas Cardíacas/diagnóstico por imagem

13.

Chest radiography as a biomarker of ageing: artificial intelligence-based, multi-institutional model development and validation in Japan.

Mitsuyama, Yasuhito; Matsumoto, Toshimasa; Tatekawa, Hiroyuki; Walston, Shannon L; Kimura, Tatsuo; Yamamoto, Akira; Watanabe, Toshio; Miki, Yukio; Ueda, Daiju.

Lancet Healthy Longev ; 4(9): e478-e486, 2023 09.

Artigo em Inglês | MEDLINE | ID: mdl-37597530

RESUMO

BACKGROUND: Chest radiographs are widely available and cost-effective; however, their usefulness as a biomarker of ageing using multi-institutional data remains underexplored. The aim of this study was to develop a biomarker of ageing from chest radiography and examine the correlation between the biomarker and diseases. METHODS: In this retrospective, multi-institutional study, we trained, tuned, and externally tested an artificial intelligence (AI) model to estimate the age of healthy individuals using chest radiographs as a biomarker. For the biomarker modelling phase of the study, we used healthy chest radiographs consecutively collected between May 22, 2008, and Dec 28, 2021, from three institutions in Japan. Data from two institutions were used for training, tuning, and internal testing, and data from the third institution were used for external testing. To evaluate the performance of the AI model in estimating ages, we calculated the correlation coefficient, mean square error, root mean square error, and mean absolute error. The correlation investigation phase of the study included chest radiographs from individuals with a known disease that were consecutively collected between Jan 1, 2018, and Dec 31, 2021, from an additional two institutions in Japan. We investigated the odds ratios (ORs) for various diseases given the difference between the AI-estimated age and chronological age (ie, the difference-age). FINDINGS: We included 101â296 chest radiographs from 70â248 participants across five institutions. In the biomarker modelling phase, the external test dataset from 3467 healthy participants included 8046 radiographs. Between the AI-estimated age and chronological age, the correlation coefficient was 0·95 (99% CI 0·95-0·95), the mean square error was 15·0 years (99% CI 14·0-15·0), the root mean square error was 3·8 years (99% CI 3·8-3·9), and the mean absolute error was 3·0 years (99% CI 3·0-3·1). In the correlation investigation phase, the external test datasets from 34â197 participants with a known disease included 34â197 radiographs. The ORs for difference-age were as follows: 1·04 (99% CI 1·04-1·05) for hypertension; 1·02 (1·01-1·03) for hyperuricaemia; 1·05 (1·03-1·06) for chronic obstructive pulmonary disease; 1·08 (1·06-1·09) for interstitial lung disease; 1·05 (1·03-1·06) for chronic renal failure; 1·04 (1·03-1·06) for atrial fibrillation; 1·03 (1·02-1·04) for osteoporosis; and 1·05 (1·03-1·06) for liver cirrhosis. INTERPRETATION: The AI-estimated age using chest radiographs showed a strong correlation with chronological age in the healthy cohorts. Furthermore, in cohorts of individuals with known diseases, the difference between estimated age and chronological age correlated with various chronic diseases. The use of this biomarker might pave the way for enhanced risk stratification methodologies, individualised therapeutic interventions, and innovative early diagnostic and preventive approaches towards age-associated pathologies. FUNDING: None. TRANSLATION: For the Japanese translation of the abstract see Supplementary Materials section.

Assuntos

Envelhecimento , Inteligência Artificial , Humanos , Japão , Estudos Retrospectivos , Biomarcadores

14.

AI-based Virtual Synthesis of Methionine PET from Contrast-enhanced MRI: Development and External Validation Study.

Takita, Hirotaka; Matsumoto, Toshimasa; Tatekawa, Hiroyuki; Katayama, Yutaka; Nakajo, Kosuke; Uda, Takehiro; Mitsuyama, Yasuhito; Walston, Shannon L; Miki, Yukio; Ueda, Daiju.

Radiology ; 308(2): e223016, 2023 08.

Artigo em Inglês | MEDLINE | ID: mdl-37526545

RESUMO

Background Carbon 11 (11C)-methionine is a useful PET radiotracer for the management of patients with glioma, but radiation exposure and lack of molecular imaging facilities limit its use. Purpose To generate synthetic methionine PET images from contrast-enhanced (CE) MRI through an artificial intelligence (AI)-based image-to-image translation model and to compare its performance for grading and prognosis of gliomas with that of real PET. Materials and Methods An AI-based model to generate synthetic methionine PET images from CE MRI was developed and validated from patients who underwent both methionine PET and CE MRI at a university hospital from January 2007 to December 2018 (institutional data set). Pearson correlation coefficients for the maximum and mean tumor to background ratio (TBRmax and TBRmean, respectively) of methionine uptake and the lesion volume between synthetic and real PET were calculated. Two additional open-source glioma databases of preoperative CE MRI without methionine PET were used as the external test set. Using the TBRs, the area under the receiver operating characteristic curve (AUC) for classifying high-grade and low-grade gliomas and overall survival were evaluated. Results The institutional data set included 362 patients (mean age, 49 years ± 19 [SD]; 195 female, 167 male; training, n = 294; validation, n = 34; test, n = 34). In the internal test set, Pearson correlation coefficients were 0.68 (95% CI: 0.47, 0.81), 0.76 (95% CI: 0.59, 0.86), and 0.92 (95% CI: 0.85, 0.95) for TBRmax, TBRmean, and lesion volume, respectively. The external test set included 344 patients with gliomas (mean age, 53 years ± 15; 192 male, 152 female; high grade, n = 269). The AUC for TBRmax was 0.81 (95% CI: 0.75, 0.86) and the overall survival analysis showed a significant difference between the high (2-year survival rate, 27%) and low (2-year survival rate, 71%; P < .001) TBRmax groups. Conclusion The AI-based model-generated synthetic methionine PET images strongly correlated with real PET images and showed good performance for glioma grading and prognostication. Published under a CC BY 4.0 license. Supplemental material is available for this article.

Assuntos

Neoplasias Encefálicas , Glioma , Humanos , Masculino , Feminino , Pessoa de Meia-Idade , Metionina , Neoplasias Encefálicas/diagnóstico por imagem , Neoplasias Encefálicas/patologia , Inteligência Artificial , Tomografia por Emissão de Pósitrons/métodos , Gradação de Tumores , Glioma/diagnóstico por imagem , Glioma/patologia , Imageamento por Ressonância Magnética/métodos , Racemetionina

15.

Artificial intelligence-based model to classify cardiac functions from chest radiographs: a multi-institutional, retrospective model development and validation study.

Ueda, Daiju; Matsumoto, Toshimasa; Ehara, Shoichi; Yamamoto, Akira; Walston, Shannon L; Ito, Asahiro; Shimono, Taro; Shiba, Masatsugu; Takeshita, Tohru; Fukuda, Daiju; Miki, Yukio.

Lancet Digit Health ; 5(8): e525-e533, 2023 08.

Artigo em Inglês | MEDLINE | ID: mdl-37422342

RESUMO

BACKGROUND: Chest radiography is a common and widely available examination. Although cardiovascular structures-such as cardiac shadows and vessels-are visible on chest radiographs, the ability of these radiographs to estimate cardiac function and valvular disease is poorly understood. Using datasets from multiple institutions, we aimed to develop and validate a deep-learning model to simultaneously detect valvular disease and cardiac functions from chest radiographs. METHODS: In this model development and validation study, we trained, validated, and externally tested a deep learning-based model to classify left ventricular ejection fraction, tricuspid regurgitant velocity, mitral regurgitation, aortic stenosis, aortic regurgitation, mitral stenosis, tricuspid regurgitation, pulmonary regurgitation, and inferior vena cava dilation from chest radiographs. The chest radiographs and associated echocardiograms were collected from four institutions between April 1, 2013, and Dec 31, 2021: we used data from three sites (Osaka Metropolitan University Hospital, Osaka, Japan; Habikino Medical Center, Habikino, Japan; and Morimoto Hospital, Osaka, Japan) for training, validation, and internal testing, and data from one site (Kashiwara Municipal Hospital, Kashiwara, Japan) for external testing. We evaluated the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and accuracy. FINDINGS: We included 22 551 radiographs associated with 22 551 echocardiograms obtained from 16 946 patients. The external test dataset featured 3311 radiographs from 2617 patients with a mean age of 72 years [SD 15], of whom 49·8% were male and 50·2% were female. The AUCs, accuracy, sensitivity, and specificity for this dataset were 0·92 (95% CI 0·90-0·95), 86% (85-87), 82% (75-87), and 86% (85-88) for classifying the left ventricular ejection fraction at a 40% cutoff, 0·85 (0·83-0·87), 75% (73-76), 83% (80-87), and 73% (71-75) for classifying the tricuspid regurgitant velocity at a 2·8 m/s cutoff, 0·89 (0·86-0·92), 85% (84-86), 82% (76-87), and 85% (84-86) for classifying mitral regurgitation at the none-mild versus moderate-severe cutoff, 0·83 (0·78-0·88), 73% (71-74), 79% (69-87), and 72% (71-74) for classifying aortic stenosis, 0·83 (0·79-0·87), 68% (67-70), 88% (81-92), and 67% (66-69) for classifying aortic regurgitation, 0·86 (0·67-1·00), 90% (89-91), 83% (36-100), and 90% (89-91) for classifying mitral stenosis, 0·92 (0·89-0·94), 83% (82-85), 87% (83-91), and 83% (82-84) for classifying tricuspid regurgitation, 0·86 (0·82-0·90), 69% (68-71), 91% (84-95), and 68% (67-70) for classifying pulmonary regurgitation, and 0·85 (0·81-0·89), 86% (85-88), 73% (65-81), and 87% (86-88) for classifying inferior vena cava dilation. INTERPRETATION: The deep learning-based model can accurately classify cardiac functions and valvular heart diseases using information from digital chest radiographs. This model can classify values typically obtained from echocardiography in a fraction of the time, with low system requirements and the potential to be continuously available in areas where echocardiography specialists are scarce or absent. FUNDING: None.

Assuntos

Doenças das Valvas Cardíacas , Insuficiência da Valva Mitral , Humanos , Masculino , Feminino , Idoso , Estudos Retrospectivos , Inteligência Artificial , Volume Sistólico , Função Ventricular Esquerda , Doenças das Valvas Cardíacas/complicações , Doenças das Valvas Cardíacas/diagnóstico , Insuficiência da Valva Mitral/complicações , Insuficiência da Valva Mitral/diagnóstico por imagem , Radiografia

16.

ChatGPT's Diagnostic Performance from Patient History and Imaging Findings on the Diagnosis Please Quizzes.

Ueda, Daiju; Mitsuyama, Yasuhito; Takita, Hirotaka; Horiuchi, Daisuke; Walston, Shannon L; Tatekawa, Hiroyuki; Miki, Yukio.

Radiology ; 308(1): e231040, 2023 07.

Artigo em Inglês | MEDLINE | ID: mdl-37462501

Assuntos

Currículo , Avaliação Educacional , Humanos , Avaliação Educacional/métodos

17.

Deep learning for pneumothorax diagnosis: a systematic review and meta-analysis.

Sugibayashi, Takahiro; Walston, Shannon L; Matsumoto, Toshimasa; Mitsuyama, Yasuhito; Miki, Yukio; Ueda, Daiju.

Eur Respir Rev ; 32(168)2023 Jun 30.

Artigo em Inglês | MEDLINE | ID: mdl-37286217

RESUMO

BACKGROUND: Deep learning (DL), a subset of artificial intelligence (AI), has been applied to pneumothorax diagnosis to aid physician diagnosis, but no meta-analysis has been performed. METHODS: A search of multiple electronic databases through September 2022 was performed to identify studies that applied DL for pneumothorax diagnosis using imaging. Meta-analysis via a hierarchical model to calculate the summary area under the curve (AUC) and pooled sensitivity and specificity for both DL and physicians was performed. Risk of bias was assessed using a modified Prediction Model Study Risk of Bias Assessment Tool. RESULTS: In 56 of the 63 primary studies, pneumothorax was identified from chest radiography. The total AUC was 0.97 (95% CI 0.96-0.98) for both DL and physicians. The total pooled sensitivity was 84% (95% CI 79-89%) for DL and 85% (95% CI 73-92%) for physicians and the pooled specificity was 96% (95% CI 94-98%) for DL and 98% (95% CI 95-99%) for physicians. More than half of the original studies (57%) had a high risk of bias. CONCLUSIONS: Our review found the diagnostic performance of DL models was similar to that of physicians, although the majority of studies had a high risk of bias. Further pneumothorax AI research is needed.

Assuntos

Aprendizado Profundo , Pneumotórax , Humanos , Pneumotórax/diagnóstico por imagem , Inteligência Artificial , Sensibilidade e Especificidade , Diagnóstico por Imagem

18.

Artificial intelligence-based model for COVID-19 prognosis incorporating chest radiographs and clinical data; a retrospective model development and validation study.

Walston, Shannon L; Matsumoto, Toshimasa; Miki, Yukio; Ueda, Daiju.

Br J Radiol ; 95(1140): 20220058, 2022 Dec 01.

Artigo em Inglês | MEDLINE | ID: mdl-36193755

RESUMO

OBJECTIVES: The purpose of this study was to develop an artificial intelligence-based model to prognosticate COVID-19 patients at admission by combining clinical data and chest radiographs. METHODS: This retrospective study used the Stony Brook University COVID-19 dataset of 1384 inpatients. After exclusions, 1356 patients were randomly divided into training (1083) and test datasets (273). We implemented three artificial intelligence models, which classified mortality, ICU admission, or ventilation risk. Each model had three submodels with different inputs: clinical data, chest radiographs, and both. We showed the importance of the variables using SHapley Additive exPlanations (SHAP) values. RESULTS: The mortality prediction model was best overall with area under the curve, sensitivity, specificity, and accuracy of 0.79 (0.72-0.86), 0.74 (0.68-0.79), 0.77 (0.61-0.88), and 0.74 (0.69-0.79) for the clinical data-based model; 0.77 (0.69-0.85), 0.67 (0.61-0.73), 0.81 (0.67-0.92), 0.70 (0.64-0.75) for the image-based model, and 0.86 (0.81-0.91), 0.76 (0.70-0.81), 0.77 (0.61-0.88), 0.76 (0.70-0.81) for the mixed model. The mixed model had the best performance (p value < 0.05). The radiographs ranked fourth for prognostication overall, and first of the inpatient tests assessed. CONCLUSIONS: These results suggest that prognosis models become more accurate if AI-derived chest radiograph features and clinical data are used together. ADVANCES IN KNOWLEDGE: This AI model evaluates chest radiographs together with clinical data in order to classify patients as having high or low mortality risk. This work shows that chest radiographs taken at admission have significant COVID-19 prognostic information compared to clinical data other than age and sex.

Assuntos

COVID-19 , Humanos , COVID-19/diagnóstico por imagem , Inteligência Artificial , Estudos Retrospectivos , Radiografia , Prognóstico

19.

Development and Validation of Artificial Intelligence-based Method for Diagnosis of Mitral Regurgitation from Chest Radiographs.

Ueda, Daiju; Ehara, Shoichi; Yamamoto, Akira; Iwata, Shinichi; Abo, Koji; Walston, Shannon L; Matsumoto, Toshimasa; Shimazaki, Akitoshi; Yoshiyama, Minoru; Miki, Yukio.

Radiol Artif Intell ; 4(2): e210221, 2022 Mar.

Artigo em Inglês | MEDLINE | ID: mdl-35391769

RESUMO

Purpose: To develop an artificial intelligence-based model to detect mitral regurgitation on chest radiographs. Materials and Methods: This retrospective study included echocardiographs and associated chest radiographs consecutively collected at a single institution between July 2016 and May 2019. Associated radiographs were those obtained within 30 days of echocardiography. These radiographs were labeled as positive or negative for mitral regurgitation on the basis of the echocardiographic reports and were divided into training, validation, and test datasets. An artificial intelligence model was developed by using the training dataset and was tuned by using the validation dataset. To evaluate the model, the area under the curve, sensitivity, specificity, accuracy, positive predictive value, and negative predictive value were assessed by using the test dataset. Results: This study included a total of 10 367 images from 5270 patients. The training dataset included 8240 images (4216 patients), the validation dataset included 1073 images (527 patients), and the test dataset included 1054 images (527 patients). The area under the curve, sensitivity, specificity, accuracy, positive predictive value, and negative predictive value in the test dataset were 0.80 (95% CI: 0.77, 0.82), 71% (95% CI: 67, 75), 74% (95% CI: 70, 77), 73% (95% CI: 70, 75), 68% (95% CI: 64, 72), and 77% (95% CI: 73, 80), respectively. Conclusion: The developed deep learning-based artificial intelligence model may possibly differentiate patients with and without mitral regurgitation by using chest radiographs.Keywords: Computer-aided Diagnosis (CAD), Cardiac, Heart, Valves, Supervised Learning, Convolutional Neural Network (CNN), Deep Learning Algorithms, Machine Learning Algorithms Supplemental material is available for this article. © RSNA, 2022.

20.

Artificial intelligence-based detection of atrial fibrillation from chest radiographs.

Matsumoto, Toshimasa; Ehara, Shoichi; Walston, Shannon L; Mitsuyama, Yasuhito; Miki, Yukio; Ueda, Daiju.

Eur Radiol ; 32(9): 5890-5897, 2022 Sep.

Artigo em Inglês | MEDLINE | ID: mdl-35357542

RESUMO

OBJECTIVE: The purpose of this study was to develop an artificial intelligence (AI)-based model to detect features of atrial fibrillation (AF) on chest radiographs. METHODS: This retrospective study included consecutively collected chest radiographs of patients who had echocardiography at our institution from July 2016 to May 2019. Eligible radiographs had been acquired within 30 days of the echocardiography. These radiographs were labeled as AF-positive or AF-negative based on the associated electronic medical records; then, each patient was randomly divided into training, validation, and test datasets in an 8:1:1 ratio. A deep learning-based model to classify radiographs as with or without AF was trained on the training dataset, tuned with the validation dataset, and evaluated with the test dataset. RESULTS: The training dataset included 11,105 images (5637 patients; 3145 male, mean age ± standard deviation, 68 ± 14 years), the validation dataset included 1388 images (704 patients, 397 male, 67 ± 14 years), and the test dataset included 1375 images (706 patients, 395 male, 68 ± 15 years). Applying the model to the validation and test datasets gave a respective area under the curve of 0.81 (95% confidence interval, 0.78-0.85) and 0.80 (0.76-0.84), sensitivity of 0.76 (0.70-0.81) and 0.70 (0.64-0.76), specificity of 0.75 (0.72-0.77) and 0.74 (0.72-0.77), and accuracy of 0.75 (0.72-0.77) and 0.74 (0.71-0.76). CONCLUSION: Our AI can identify AF on chest radiographs, which provides a new way for radiologists to infer AF. KEY POINTS: â¢ A deep learning-based model was trained to detect atrial fibrillation in chest radiographs, showing that there are indicators of atrial fibrillation visible even on static images. â¢ The validation and test datasets each gave a solid performance with area under the curve, sensitivity, and specificity of 0.81, 0.76, and 0.75, respectively, for the validation dataset, and 0.80, 0.70, and 0.74, respectively, for the test dataset. â¢ The saliency maps highlighted anatomical areas consistent with those reported for atrial fibrillation on chest radiographs, such as the atria.

Assuntos

Inteligência Artificial , Fibrilação Atrial , Aprendizado Profundo , Idoso , Idoso de 80 Anos ou mais , Fibrilação Atrial/diagnóstico por imagem , Feminino , Humanos , Masculino , Pessoa de Meia-Idade , Radiografia , Radiografia Torácica/métodos , Estudos Retrospectivos

RESUMO

RESUMO

RESUMO

Assuntos

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

Assuntos

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

DETALHE DA PESQUISA