Results 1 - 14 of 14
1.
BMJ Open Qual ; 13(2), 2024 Jun 03.
Article in English | MEDLINE | ID: mdl-38830730

ABSTRACT

BACKGROUND: Manual chart review using validated assessment tools is a standardised methodology for detecting diagnostic errors. However, it requires considerable human resources and time. ChatGPT, a recently developed artificial intelligence chatbot based on a large language model, can effectively classify text given suitable prompts and could therefore assist manual chart review in detecting diagnostic errors. OBJECTIVE: This study aimed to clarify whether ChatGPT could correctly detect diagnostic errors, and possible factors contributing to them, based on case presentations. METHODS: We analysed 545 published case reports that included diagnostic errors. We entered the texts of the case presentations and the final diagnoses, together with original prompts, into ChatGPT (GPT-4) to generate responses comprising a judgement on whether a diagnostic error had occurred and the factors contributing to it. Contributing factors were coded according to three taxonomies: Diagnosis Error Evaluation and Research (DEER), Reliable Diagnosis Challenges (RDC) and Generic Diagnostic Pitfalls (GDP). ChatGPT's responses on the contributing factors were compared with those of physicians. RESULTS: ChatGPT correctly detected diagnostic errors in 519/545 cases (95%) and coded significantly more contributing factors per case than physicians: DEER (median 5 vs 1, p<0.001), RDC (median 4 vs 2, p<0.001) and GDP (median 4 vs 1, p<0.001). The contributing factors most frequently coded by ChatGPT were 'failure/delay in considering the diagnosis' (315, 57.8%) in DEER and 'atypical presentation' in both RDC (365, 67.0%) and GDP (264, 48.4%). CONCLUSION: ChatGPT accurately detects diagnostic errors from case presentations. ChatGPT may be more sensitive than manual review in detecting contributing factors, especially 'atypical presentation'.
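The workflow described above (case presentation plus final diagnosis, sent to GPT-4 with an original prompt asking for an error judgement and DEER/RDC/GDP codes) can be sketched as follows. This is a minimal illustration only, not the authors' published code; the prompt wording, the JSON output format, and the use of the `openai` Python client are assumptions.

```python
# Minimal sketch (not the authors' code) of the prompting workflow described above:
# a case presentation and its final diagnosis are sent to GPT-4 with a prompt asking
# for a diagnostic-error judgement and DEER/RDC/GDP factor codes.
# The prompt wording and output format are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

PROMPT_TEMPLATE = """You are reviewing a published case report.
Case presentation:
{presentation}

Final diagnosis: {final_diagnosis}

1. Judge whether a diagnostic error occurred (yes/no).
2. List the contributing factors, coded with the DEER, RDC, and GDP taxonomies.
Return your answer as JSON with keys: error, deer, rdc, gdp."""


def review_case(presentation: str, final_diagnosis: str) -> str:
    """Send one case to GPT-4 and return its raw judgement text."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "user",
             "content": PROMPT_TEMPLATE.format(
                 presentation=presentation,
                 final_diagnosis=final_diagnosis)},
        ],
        temperature=0,  # keep the coding task as deterministic as possible
    )
    return response.choices[0].message.content
```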


Subjects
Diagnostic Errors, Humans, Diagnostic Errors/statistics & numerical data, Artificial Intelligence/standards
2.
JMIR Res Protoc ; 13: e56933, 2024 Mar 25.
Article in English | MEDLINE | ID: mdl-38526541

ABSTRACT

BACKGROUND: Atypical presentations have been increasingly recognized as a significant contributing factor to diagnostic errors in internal medicine. However, research addressing the association between atypical presentations and diagnostic errors has been limited by the lack of widely applicable definitions and criteria for what constitutes an atypical presentation. OBJECTIVE: The aim of the study is to describe how atypical presentations are defined and measured in studies of diagnostic errors in internal medicine and to use this information to develop new criteria for identifying atypical presentations at high risk for diagnostic errors. METHODS: This study will follow an established framework for conducting scoping reviews. Inclusion criteria are developed according to the participants, concept, and context framework. This review will consider studies that fulfill all of the following criteria: include adult patients (participants); explore the association between atypical presentations and diagnostic errors using any definition, criteria, or measurement to identify atypical presentations and diagnostic errors (concept); and focus on internal medicine (context). Regarding the type of sources, this scoping review will consider quantitative, qualitative, and mixed methods study designs; systematic reviews; and opinion papers for inclusion. Case reports, case series, and conference abstracts will be excluded. The data will be extracted from studies identified through MEDLINE, Web of Science, CINAHL, Embase, Cochrane Library, and Google Scholar searches. No limits will be applied to language, and papers indexed from database inception to December 31, 2023, will be included. Two independent reviewers (YH and RK) will conduct study selection and data extraction. The data extracted will include specific details about patient characteristics (eg, age, sex, and disease), the definitions and measurement methods for atypical presentations and diagnostic errors, clinical settings (eg, department and outpatient or inpatient), type of evidence source, and the association between atypical presentations and diagnostic errors relevant to the review question. The extracted data will be presented in tabular format with descriptive statistics, allowing us to identify the key components or types of atypical presentations and to develop new criteria for identifying atypical presentations in future studies of diagnostic errors. Development of the new criteria will follow guidance for basic qualitative content analysis with an inductive approach. RESULTS: As of January 2024, a literature search through multiple databases is ongoing. We will complete this study by December 2024. CONCLUSIONS: This scoping review aims to provide rigorous evidence for developing new criteria to identify atypical presentations at high risk for diagnostic errors in internal medicine. Such criteria could facilitate the development of a comprehensive conceptual model for understanding the associations between atypical presentations and diagnostic errors in internal medicine. TRIAL REGISTRATION: Open Science Framework; www.osf.io/27d5m. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): DERR1-10.2196/56933.

3.
JMIR Med Inform ; 11: e48808, 2023 Oct 09.
Article in English | MEDLINE | ID: mdl-37812468

ABSTRACT

BACKGROUND: The diagnostic accuracy of differential diagnoses generated by artificial intelligence chatbots, including ChatGPT models, for complex clinical vignettes derived from general internal medicine (GIM) department case reports is unknown. OBJECTIVE: This study aims to evaluate the accuracy of the differential diagnosis lists generated by both third-generation ChatGPT (ChatGPT-3.5) and fourth-generation ChatGPT (ChatGPT-4) using case vignettes from case reports published by the Department of GIM of Dokkyo Medical University Hospital, Japan. METHODS: We searched PubMed for case reports. From the identified reports, physicians selected the cases, determined the final diagnosis, and summarized them as clinical vignettes. Physicians entered the clinical vignettes into the ChatGPT-3.5 and ChatGPT-4 prompts to generate the top 10 differential diagnoses. The ChatGPT models were not specially trained or further reinforced for this task. Three GIM physicians from other medical institutions created differential diagnosis lists by reading the same clinical vignettes. We measured the rate of correct diagnosis within the top 10 differential diagnosis lists, within the top 5 differential diagnosis lists, and for the top diagnosis. RESULTS: In total, 52 case reports were analyzed. The rates of correct diagnosis by ChatGPT-4 within the top 10 differential diagnosis lists, top 5 differential diagnosis lists, and top diagnosis were 83% (43/52), 81% (42/52), and 60% (31/52), respectively. The rates of correct diagnosis by ChatGPT-3.5 within the top 10 differential diagnosis lists, top 5 differential diagnosis lists, and top diagnosis were 73% (38/52), 65% (34/52), and 42% (22/52), respectively. The rates of correct diagnosis by ChatGPT-4 were comparable to those by physicians within the top 10 differential diagnosis lists (43/52, 83% vs 39/52, 75%; P=.47), within the top 5 differential diagnosis lists (42/52, 81% vs 35/52, 67%; P=.18), and for the top diagnosis (31/52, 60% vs 26/52, 50%; P=.43), with none of the differences reaching significance. The ChatGPT models' diagnostic accuracy did not significantly vary based on open access status or the publication date (before 2011 vs 2022). CONCLUSIONS: This study demonstrates the potential diagnostic accuracy of differential diagnosis lists generated using ChatGPT-3.5 and ChatGPT-4 for complex clinical vignettes from case reports published by the GIM department. The rate of correct diagnoses within the top 10 and top 5 differential diagnosis lists generated by ChatGPT-4 exceeds 80%. Although derived from a limited data set of case reports from a single department, our findings highlight the potential utility of ChatGPT-4 as a supplementary tool for physicians, particularly for those affiliated with the GIM department. Further investigations should explore the diagnostic accuracy of ChatGPT using distinct case materials beyond its training data. Such efforts will provide a comprehensive insight into the role of artificial intelligence in enhancing clinical decision-making.
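The top 10, top 5, and top diagnosis accuracy figures reported above are simple proportions over the 52 cases. A minimal sketch of that computation follows; the toy data and the exact string-matching rule are illustrative assumptions, whereas in the study physicians judged whether a listed diagnosis matched the final diagnosis.

```python
# Illustrative sketch (not the study's analysis code) of how top-10 / top-5 / top-1
# accuracies can be computed from differential-diagnosis lists. Exact string matching
# is a simplification of the physicians' judgement used in the study.
from typing import List


def top_k_accuracy(ddx_lists: List[List[str]], final_dx: List[str], k: int) -> float:
    """Fraction of cases whose final diagnosis appears in the top k of its list."""
    hits = sum(final in ddx[:k] for ddx, final in zip(ddx_lists, final_dx))
    return hits / len(final_dx)


# Toy example with two cases:
lists = [["pulmonary embolism", "pneumonia", "pleuritis"],
         ["giant cell arteritis", "migraine", "tension headache"]]
finals = ["pneumonia", "polymyalgia rheumatica"]
print(top_k_accuracy(lists, finals, k=1))   # 0.0 -> top diagnosis
print(top_k_accuracy(lists, finals, k=10))  # 0.5 -> within the top 10
```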

4.
JMIR Form Res ; 7: e49034, 2023 Aug 02.
Article in English | MEDLINE | ID: mdl-37531164

ABSTRACT

BACKGROUND: Low diagnostic accuracy is a major concern in automated medical history-taking systems with differential diagnosis (DDx) generators. Extending the concept of collective intelligence, in which an integrated judgment from multiple sources is more accurate than the judgment of a single source, to DDx generators may be a possible solution. OBJECTIVE: The purpose of this study is to assess whether the combined use of several DDx generators improves the diagnostic accuracy of DDx lists. METHODS: We used medical history data and the top 10 DDx lists (index DDx lists) generated by an artificial intelligence (AI)-driven automated medical history-taking system from 103 patients with confirmed diagnoses. Two research physicians independently created additional top 10 DDx lists (second and third DDx lists) per case by entering key information into 2 other DDx generators, based on the medical history generated by the automated medical history-taking system and without reading the index lists. We used the McNemar test to assess the improvement in diagnostic accuracy from the index DDx lists to three types of combined DDx lists: (1) simply combining the index, second, and third DDx lists; (2) creating a new top 10 DDx list using a 1/n weighting rule; and (3) keeping only the diagnoses shared among the index, second, and third DDx lists. We treated the data generated by the 2 research physicians from the same patient as independent cases; therefore, analyses using the 2 additional lists included 206 cases (103 cases × 2 physicians' input). RESULTS: The diagnostic accuracy of the index lists was 46% (47/103). Diagnostic accuracy was improved by simply combining the other 2 DDx lists (133/206, 65%, P<.001), whereas the other 2 combination approaches did not improve diagnostic accuracy (106/206, 52%, P=.05 for the collective list with the 1/n weighting rule, and 29/206, 14%, P<.001 for the list of diagnoses shared among the 3 DDx lists). CONCLUSIONS: Simply adding the top 10 DDx lists from additional DDx generators increased the diagnostic accuracy of the DDx list by approximately 20%, suggesting that the combined use of DDx generators early in the diagnostic process is beneficial.
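A brief sketch of the three list-combination strategies described above follows. The abstract does not spell out the 1/n weighting rule in detail; the sketch assumes each diagnosis receives a score of 1/rank in each list and the scores are summed, which is one plausible reading. It also assumes diagnosis strings are already normalized so that identical diagnoses match exactly.

```python
# Illustrative sketch of the three combination strategies described above.
# The 1/n weighting rule is interpreted as reciprocal-rank scoring; this is an
# assumption, not the study's published implementation.
from collections import defaultdict
from typing import List


def simple_union(lists: List[List[str]]) -> set:
    """Strategy 1: simple combination -- the union of all diagnoses across lists."""
    return {dx for ddx in lists for dx in ddx}


def weighted_top10(lists: List[List[str]]) -> List[str]:
    """Strategy 2: 1/n weighting -- score each diagnosis by the sum of 1/rank."""
    scores = defaultdict(float)
    for ddx in lists:
        for rank, dx in enumerate(ddx, start=1):
            scores[dx] += 1.0 / rank
    return [dx for dx, _ in sorted(scores.items(), key=lambda kv: -kv[1])][:10]


def shared_only(lists: List[List[str]]) -> set:
    """Strategy 3: keep only diagnoses that appear in every list."""
    shared = set(lists[0])
    for ddx in lists[1:]:
        shared &= set(ddx)
    return shared
```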

5.
Int J Gen Med ; 16: 1295-1302, 2023.
Article in English | MEDLINE | ID: mdl-37081930

ABSTRACT

Purpose: The general internal medicine (GIM) department can be an effective diagnostic coordinator for undiagnosed outpatients. We investigated the contribution of GIM consultations to the diagnosis of patients admitted to specialty departments in hospitals in Japan that have not yet adopted a hospitalist system. Patients and Methods: This single-center, retrospective observational study was conducted at a university hospital in Japan. GIM consultations from other departments regarding inpatients aged ≥20 years, from April 2016 to March 2021, were included. Data were extracted from electronic medical records, and consultation purposes were categorized into diagnosis, treatment, and diagnosis and treatment. The primary outcome was a new diagnosis during hospitalization for patients whose consultation purpose was diagnosis or diagnosis and treatment. The secondary outcome was the purpose of the consultations with the Diagnostic and Generalist Medicine department. Results: In total, 342 patients were included in the analysis. The purpose of the consultation was diagnosis for 253 patients (74%), treatment for 60 (17.5%), and diagnosis and treatment for 29 (8.5%). In the 282 consultations requested for diagnosis or for diagnosis and treatment, 179 new diagnoses were established for 162 patients (57.5%; 95% confidence interval [CI] 51.5-63.3). Conclusion: The GIM department can function as a diagnostic consultant for inpatients with diagnostic problems admitted to other specialty departments in hospitals where a hospitalist or other similar system has not been adopted.
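The reported 57.5% (95% CI 51.5-63.3) is a binomial proportion with its confidence interval. A quick illustrative check follows; the CI method used by the authors is not stated, and Wilson's method is assumed here because it reproduces an interval close to the reported one.

```python
# Illustrative check (not the study's code) of the reported proportion and 95% CI:
# 162 of 282 diagnostic consultations yielded a new diagnosis.
from statsmodels.stats.proportion import proportion_confint

new_dx, consults = 162, 282
print(new_dx / consults)  # ~0.574
low, high = proportion_confint(new_dx, consults, alpha=0.05, method="wilson")
print(low, high)  # roughly 0.516 to 0.631, close to the reported 51.5-63.3%
```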

6.
Eur J Case Rep Intern Med ; 10(3): 003823, 2023.
Article in English | MEDLINE | ID: mdl-36969525

ABSTRACT

Introduction: Epipericardial fat necrosis (EFN) is a relatively rare benign disease that causes acute chest pain. Case Description: A woman in her 20s presented with acute left shoulder and epigastric pain. One day before presentation, she had visited a cardiologist, and an acute coronary syndrome had been ruled out. The pain worsened with deep inspiration. Chest computed tomography (CT) showed a soft-tissue attenuation lesion containing a fatty centre located in the epipericardial fat at the left cardiophrenic angle. Hence, EFN was diagnosed, and the pain resolved with loxoprofen. The lesion had disappeared on a follow-up chest CT scan. Discussion: EFN is a rare benign disease that causes acute chest pain. Approximately 70-90% of EFN cases are misdiagnosed by clinicians as other diseases. Conclusion: In patients with acute chest pain, the correct diagnosis of EFN avoids unnecessary invasive investigations and reassures patients. LEARNING POINTS: Patients with epipericardial fat necrosis typically present with acute pleuritic chest pain without any associated symptoms. Characteristic CT findings of an encapsulated fatty pericardial lesion with a surrounding inflammatory reaction are key to the diagnosis of epipericardial fat necrosis. The correct diagnosis of epipericardial fat necrosis in patients with acute chest pain reassures them and avoids unnecessary invasive investigation.

7.
Article in English | MEDLINE | ID: mdl-36834073

ABSTRACT

The diagnostic accuracy of differential diagnoses generated by artificial intelligence (AI) chatbots, including the generative pretrained transformer 3 (GPT-3) chatbot (ChatGPT-3), is unknown. This study evaluated the accuracy of differential-diagnosis lists generated by ChatGPT-3 for clinical vignettes with common chief complaints. General internal medicine physicians created clinical cases, correct diagnoses, and five differential diagnoses for ten common chief complaints. The rate of correct diagnosis by ChatGPT-3 within the ten differential-diagnosis lists was 28/30 (93.3%). The rate of correct diagnosis by physicians was still superior to that by ChatGPT-3 within the five differential-diagnosis lists (98.3% vs. 83.3%, p = 0.03). The rate of correct diagnosis by physicians was also superior to that by ChatGPT-3 for the top diagnosis (93.3% vs. 53.3%, p < 0.001). The rate of consistent differential diagnoses among physicians within the ten differential-diagnosis lists generated by ChatGPT-3 was 62/88 (70.5%). In summary, this study demonstrates the high diagnostic accuracy of the differential-diagnosis lists generated by ChatGPT-3 for clinical cases with common chief complaints. This suggests that AI chatbots such as ChatGPT-3 can generate well-differentiated diagnosis lists for common chief complaints. However, the ordering of these lists can be improved in the future.


Subjects
Artificial Intelligence, General Practitioners, Humans, Differential Diagnosis, Pilot Projects, Software
8.
JMIR Med Inform ; 10(1): e35225, 2022 Jan 27.
Article in English | MEDLINE | ID: mdl-35084347

ABSTRACT

BACKGROUND: Automated medical history-taking systems that generate differential diagnosis lists have been suggested to contribute to improved diagnostic accuracy. However, the effect of these systems on diagnostic errors in clinical practice remains unknown. OBJECTIVE: This study aimed to assess the incidence of diagnostic errors in an outpatient department where an artificial intelligence (AI)-driven automated medical history-taking system that generates differential diagnosis lists was implemented in clinical practice. METHODS: We conducted a retrospective observational study using data from a community hospital in Japan. We included patients aged 20 years and older who used the AI-driven automated medical history-taking system in the outpatient department of internal medicine, whose index visit was between July 1, 2019, and June 30, 2020, and who had an unplanned hospitalization within 14 days. The primary endpoint was the incidence of diagnostic errors, which were detected using the Revised Safer Dx Instrument by at least two independent reviewers. To evaluate the effect of the differential diagnosis lists from the AI system on the incidence of diagnostic errors, we compared the incidence of these errors between cases in which the final diagnosis appeared in the AI-generated differential diagnosis list and cases in which it did not; the Fisher exact test was used for this comparison. For cases with confirmed diagnostic errors, further review was conducted to identify the contributing factors via discussion among three reviewers, using the Safer Dx Process Breakdown Supplement as a reference. RESULTS: A total of 146 patients were analyzed. A final diagnosis was confirmed for 138 patients and was present in the differential diagnosis list from the AI system for 69 patients. Diagnostic errors occurred in 16 of 146 patients (11.0%, 95% CI 6.4%-17.2%). Although the difference was not statistically significant, the incidence of diagnostic errors was lower in cases where the final diagnosis was included in the differential diagnosis list from the AI system than in cases where it was not (7.2% vs 15.9%, P=.18). CONCLUSIONS: The incidence of diagnostic errors among patients in the outpatient department of internal medicine who used an automated medical history-taking system that generates differential diagnosis lists seemed lower than the previously reported incidence of diagnostic errors. This result suggests that implementing such a system could be beneficial for diagnostic safety in the outpatient department of internal medicine.
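The Fisher exact test comparison described above can be illustrated as follows. The 2x2 table is reconstructed from the reported percentages (5/69 vs 11/69 diagnostic errors) and is an assumption rather than published data.

```python
# Illustrative sketch of the group comparison described above; the table counts are
# reconstructed from the reported percentages and are assumptions.
from scipy.stats import fisher_exact

#               error   no error
table = [[5, 64],   # final diagnosis in the AI-generated list
         [11, 58]]  # final diagnosis not in the list
odds_ratio, p_value = fisher_exact(table)
print(odds_ratio, p_value)  # the abstract reports P=.18 for this comparison
```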

9.
Healthcare (Basel) ; 9(9), 2021 Sep 03.
Article in English | MEDLINE | ID: mdl-34574924

ABSTRACT

This study aimed to investigate the outcomes of consultations from gastroenterologists to generalist physicians for the diagnostic workup of undiagnosed chronic abdominal pain. This was a single-center, retrospective, descriptive study. Patients aged ≥15 years who were referred from the Department of Gastroenterology to the Department of Diagnostic Medicine at Dokkyo Medical University Hospital to establish a diagnosis for chronic abdominal pain between 1 April 2016 and 31 August 2020 were included. We retrospectively reviewed the patients' medical charts and extracted the data. A total of 12 cases were included. Eight patients (66.7%) had been diagnosed with and treated for functional gastrointestinal disorders (FGID) at the Department of Gastroenterology; their lack of improvement under treatment for FGID was the reason for their referral to the Department of Diagnostic Medicine for further examination. After this consultation, new possible diagnoses were generated for eight patients (66.7%). Six of these eight patients (75.0%) were diagnosed with abdominal wall pain (anterior cutaneous nerve entrapment syndrome, n = 3; myofascial pain, n = 1; falciform pain, n = 1; and herpes zoster non-herpeticus, n = 1). Consultation referral from gastroenterologists to generalists could generate new possible diagnoses in approximately 70% of patients with undiagnosed chronic abdominal pain.

10.
Article in English | MEDLINE | ID: mdl-34070958

ABSTRACT

A diagnostic decision support system (DDSS) is expected to reduce diagnostic errors. However, its effect on physicians' diagnostic decisions remains unclear. Our study aimed to assess the prevalence of artificial intelligence (AI)-suggested diagnoses in physicians' differential diagnoses when using an AI-driven DDSS that generates a differential-diagnosis list from information entered by the patient before the clinical encounter. In this randomized controlled study, an exploratory analysis was performed. Twenty-two physicians were required to generate up to three differential diagnoses per case by reading 16 clinical vignettes. The participants were divided into two groups, an intervention group and a control group, with and without the AI-generated differential-diagnosis list, respectively. The prevalence of physician diagnoses identical to a diagnosis in the AI-generated list (primary outcome) was significantly higher in the intervention group than in the control group (70.2% vs. 55.1%, p < 0.001). The primary outcome was significantly more than 10% higher in the intervention group than in the control group in all subgroups except attending physicians and physicians who did not trust AI. This study suggests that at least 15% of physicians' differential diagnoses were affected by the differential-diagnosis list provided by the AI-driven DDSS.


Subjects
Artificial Intelligence, Physicians, Differential Diagnosis, Diagnostic Errors, Humans, Trust
11.
J Med Case Rep ; 15(1): 256, 2021 May 21.
Article in English | MEDLINE | ID: mdl-34016176

ABSTRACT

BACKGROUND: The incidence of colorectal cancer in persons aged < 50 years has been increasing. The diagnosis of colorectal cancer is not difficult if the patient has typical symptoms; however, diagnosis may be difficult in cases with atypical symptoms and signs. We present an atypical case of metastatic colorectal cancer with fever and sudden-onset paraplegia as the sole manifestations. The patient had multiple osteolytic lesions without gastrointestinal symptoms or signs, which resulted in a delayed diagnosis of colorectal cancer. CASE PRESENTATION: A 46-year-old Japanese man was transferred to our hospital for evaluation of fever. He had developed fever 8 weeks earlier and had first been admitted to another hospital 5 weeks before the transfer. The patient was initially placed on antibiotics based on the suspicion of a bacterial infection. During the hospital stay, the patient experienced a sudden onset of paralysis and numbness in both legs. Magnetic resonance imaging showed an epidural mass at the level of Th11, and the patient underwent a laminectomy. Epidural abscess and vertebral osteomyelitis were suspected, and antimicrobial treatment was continued. However, his fever persisted, and he was transferred to our hospital. Chest, abdominal, and pelvic computed tomography (CT) with contrast showed diffusely distributed osteolytic lesions. Fluorodeoxyglucose-positron emission tomography showed high fluorodeoxyglucose accumulation in multiple discrete bone structures; however, no significant accumulation was observed in the solid organs or lymph nodes. A CT-guided biopsy of the left iliac bone confirmed metastatic adenocarcinoma on immunohistochemistry. A subsequent colonoscopy showed a Borrmann type II tumor in the sigmoid colon, which was confirmed to be a poorly differentiated adenocarcinoma. As a result of shared decision-making, the patient chose palliative care. CONCLUSIONS: Although rare, osteolytic bone metastases can occur as the sole manifestation in patients with colorectal cancer. In patients whose conditions are difficult to diagnose, physicians should prioritize the necessary tests based on differential diagnoses generated by analytical clinical reasoning, taking into consideration the patient's clinical manifestations and disease epidemiology. Bone biopsy is usually needed in patients whose only finding is osteolytic bone lesions; however, rapid non-invasive diagnostic tests can also be useful for narrowing the differential diagnosis.


Subjects
Adenocarcinoma, Bone Neoplasms, Colorectal Neoplasms, Bone Neoplasms/diagnostic imaging, Colorectal Neoplasms/complications, Colorectal Neoplasms/diagnosis, Delayed Diagnosis, Fluorodeoxyglucose F18, Humans, Male, Middle Aged
12.
Article in English | MEDLINE | ID: mdl-33669930

ABSTRACT

BACKGROUND: The efficacy of artificial intelligence (AI)-driven automated medical-history-taking systems with AI-driven differential-diagnosis lists on physicians' diagnostic accuracy has previously been shown. However, considering the negative effects of AI-driven differential-diagnosis lists, such as omission errors (physicians reject a correct diagnosis suggested by the AI) and commission errors (physicians accept an incorrect diagnosis suggested by the AI), the efficacy of AI-driven automated medical-history-taking systems without AI-driven differential-diagnosis lists on physicians' diagnostic accuracy should also be evaluated. OBJECTIVE: The present study was conducted to evaluate the efficacy of AI-driven automated medical-history-taking systems with or without AI-driven differential-diagnosis lists on physicians' diagnostic accuracy. METHODS: This randomized controlled study was conducted in January 2021 and included 22 physicians working at a university hospital. Participants were required to read 16 clinical vignettes based on the AI-driven medical histories of real patients and to generate up to three differential diagnoses per case. Participants were divided into two groups: with and without an AI-driven differential-diagnosis list. RESULTS: There was no significant difference in diagnostic accuracy between the two groups (57.4% vs. 56.3%, respectively; p = 0.91). Vignettes that included a correct diagnosis in the AI-generated list showed the greatest positive effect on physicians' diagnostic accuracy (adjusted odds ratio 7.68; 95% CI 4.68-12.58; p < 0.001). In the group with AI-driven differential-diagnosis lists, 15.9% of diagnoses were omission errors and 14.8% were commission errors. CONCLUSIONS: Physicians' diagnostic accuracy using the AI-driven automated medical history did not differ between the groups with and without AI-driven differential-diagnosis lists.
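An adjusted odds ratio such as the 7.68 reported above is typically obtained from a multivariable logistic regression. The following sketch shows one way to estimate it; the simulated data, the column names, and the single adjustment covariate are illustrative assumptions, and the study's actual model (covariates, and any handling of clustering by physician) may differ.

```python
# Illustrative sketch of estimating an adjusted odds ratio with logistic regression.
# The data are simulated; column names and covariates are assumptions, not the
# study's actual analysis.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "correct": rng.binomial(1, 0.55, 352),          # 22 physicians x 16 vignettes
    "ai_list_correct": rng.binomial(1, 0.7, 352),   # AI list contained the diagnosis
    "group": rng.binomial(1, 0.5, 352),             # shown the AI list or not
})

model = smf.logit("correct ~ ai_list_correct + group", data=df).fit()
print(np.exp(model.params))      # adjusted odds ratios
print(np.exp(model.conf_int()))  # 95% CIs for the odds ratios
```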


Subjects
Artificial Intelligence, Physicians, Differential Diagnosis, Humans, Intelligence, Medical History Taking
13.
Eur J Case Rep Intern Med ; 8(1): 002207, 2021.
Article in English | MEDLINE | ID: mdl-33585340

ABSTRACT

We report a case of delayed diagnosis of cholangiocarcinoma. A 62-year-old man developed acute abdominal pain in multiple sites. As the distribution pattern of the abdominal pain was not correctly interpreted based on the mechanisms of visceral and referred pain, the patient was not investigated with the best diagnostic test at first presentation. Moreover, miscommunication between physicians in a clinic and a separate hospital delayed the diagnosis. For prompt diagnosis, physicians should practice careful reasoning and focus on good communication with physicians outside their hospital. LEARNING POINTS: Abdominal pain without jaundice can be an initial symptom in patients with cholangiocarcinoma. Cholangiocarcinoma in the lower common bile duct can present as lower abdominal pain referred through the 7th-11th thoracic nerves. Physicians can determine the origin of abdominal pain through correct interpretation of its distribution pattern based on knowledge of pathophysiology.
