Results 1 - 3 of 3
1.
Cureus ; 16(6): e61680, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38841294

ABSTRACT

Background: ChatGPT is a language model that has gained widespread popularity for its fine-tuned conversational abilities. However, a known drawback of the artificial intelligence (AI) chatbot is its tendency to confidently present users with inaccurate information. We evaluated the quality of ChatGPT responses to questions pertaining to atrial fibrillation for patient education. Our analysis included the accuracy and estimated grade level of answers and whether references were provided.

Methodology: ChatGPT was prompted in four different ways, and 16 frequently asked questions on atrial fibrillation from the American Heart Association were asked under each prompt. Prompts included Form 1 (no prompt), Form 2 (patient-friendly prompt), Form 3 (physician-level prompt), and Form 4 (prompting for statistics/references). Responses were scored as incorrect, partially correct, correct, or correct with references (perfect). Flesch-Kincaid grade level, unique word count, and response length were recorded for each answer. Proportions of responses at each score were compared using chi-square analysis, and the relationship between prompt form and grade level was assessed using analysis of variance.

Results: Across all forms, scoring frequencies were one (1.6%) incorrect, five (7.8%) partially correct, 55 (85.9%) correct, and three (4.7%) perfect. Proportions of responses that were at least correct did not differ by form (p = 0.350), but perfect responses did (p = 0.001). Form 2 answers had a lower mean grade level (12.80 ± 3.38) than Forms 1 (14.23 ± 2.34), 3 (16.73 ± 2.65), and 4 (14.85 ± 2.76) (p < 0.05). Across all forms, references were provided in only three (4.7%) answers; notably, even when additionally prompted for sources or references, ChatGPT provided sources in only three of 16 responses (18.8%).

Conclusions: ChatGPT holds significant potential for enhancing patient education through accurate, adaptive responses. Its ability to alter response complexity based on user input, combined with high accuracy rates, supports its use as an informational resource in healthcare settings. Future advancements and continuous monitoring of AI capabilities will be crucial in maximizing the benefits while mitigating the risks associated with AI-driven patient education.
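As a rough illustration of the analysis pipeline this abstract describes, the sketch below computes Flesch-Kincaid grade levels with the textstat package and runs the chi-square and ANOVA comparisons with scipy. The response text, contingency counts, and per-form grade-level lists are placeholder assumptions, not the study's data.

```python
# Minimal sketch of the scoring analysis, assuming the 4 prompt forms x 16
# questions grid of responses has already been collected and hand-scored.
import textstat
from scipy.stats import chi2_contingency, f_oneway

# Hypothetical example: one response text per (form, question) pair.
responses = {
    ("Form 1", 1): "Atrial fibrillation is an irregular and often rapid heart rhythm ...",
}

# Readability: Flesch-Kincaid grade level per response.
grade_levels = {key: textstat.flesch_kincaid_grade(text) for key, text in responses.items()}

# Accuracy: at-least-correct vs. not-correct counts per form (placeholder values).
contingency = [
    [15, 1],  # Form 1: at least correct, not correct
    [16, 0],  # Form 2
    [14, 2],  # Form 3
    [13, 3],  # Form 4
]
chi2, p_accuracy, dof, _ = chi2_contingency(contingency)

# Grade level vs. prompt form: one-way ANOVA across the four forms (placeholder lists).
p_grade = f_oneway([12.1, 13.5, 12.8], [10.2, 11.0, 10.9],
                   [16.4, 17.1, 16.8], [14.0, 15.2, 14.6]).pvalue

print(f"chi-square p = {p_accuracy:.3f}, ANOVA p = {p_grade:.4f}")
```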

2.
Cureus ; 16(5): e59898, 2024 May.
Article in English | MEDLINE | ID: mdl-38721479

ABSTRACT

Background: Google Gemini (Google, Mountain View, CA) represents the latest advances in the realm of artificial intelligence (AI) and has garnered attention for capabilities similar to those of the increasingly popular ChatGPT (OpenAI, San Francisco, CA). Accurate dissemination of information on common conditions such as hypertension is critical for patient comprehension and management. Despite the ubiquity of AI, direct comparisons between ChatGPT and Gemini remain largely unexplored.

Methods: ChatGPT and Gemini were each asked 52 questions derived from the American College of Cardiology's (ACC) frequently asked questions on hypertension, once under each prompt form. Prompt forms included no prompting (Form 1), patient-friendly prompting (Form 2), physician-level prompting (Form 3), and prompting for statistics/references (Form 4). Responses were scored as incorrect, partially correct, or correct. Flesch-Kincaid (FK) grade level and word count were recorded for each response.

Results: Across all forms, scoring frequencies were 23 (5.5%) incorrect, 162 (38.9%) partially correct, and 231 (55.5%) correct. ChatGPT showed a higher rate of partially correct answers than Gemini (p = 0.0346). Physician-level prompts resulted in a higher word count across both platforms (p < 0.001), and ChatGPT showed a higher FK grade level with physician-level prompting (p = 0.033). Gemini exhibited a significantly higher mean word count (p < 0.001); however, ChatGPT had a higher FK grade level across all forms (p < 0.001).

Conclusion: To our knowledge, this study is the first to compare cardiology-related responses from ChatGPT and Gemini, two of the most popular AI chatbots. The grade level of most responses was collegiate, above the National Institutes of Health (NIH) recommended reading level but on par with most online medical information. Both chatbots responded with a high degree of accuracy, and inaccuracies were rare. It is therefore reasonable for cardiologists to suggest either chatbot as a source of supplementary patient education.
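The prompting scheme in the Methods can be sketched as below, assuming API access to both chatbots. The prompt wording, model names, and example question are illustrative assumptions rather than the study's exact materials.

```python
# Sketch of asking the same FAQ under each prompt form to ChatGPT and Gemini.
import os
from openai import OpenAI                # pip install openai
import google.generativeai as genai      # pip install google-generativeai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

PROMPT_FORMS = {
    "Form 1": "{q}",                                            # no prompting
    "Form 2": "Explain for a patient in plain language: {q}",   # patient-friendly
    "Form 3": "Answer as if addressing a physician: {q}",       # physician-level
    "Form 4": "Answer with statistics and references: {q}",     # statistics/references
}

def ask_chatgpt(prompt: str) -> str:
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def ask_gemini(prompt: str) -> str:
    model = genai.GenerativeModel("gemini-pro")
    return model.generate_content(prompt).text

question = "What is a normal blood pressure?"  # hypothetical ACC-style FAQ
for form, template in PROMPT_FORMS.items():
    prompt = template.format(q=question)
    for bot, answer in {"ChatGPT": ask_chatgpt(prompt), "Gemini": ask_gemini(prompt)}.items():
        print(form, bot, len(answer.split()), "words")
```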

3.
Cureus ; 16(5): e61067, 2024 May.
Article in English | MEDLINE | ID: mdl-38803402

ABSTRACT

Introduction: Hyperlipidemia is prevalent worldwide and affects a significant number of US adults. It is a major contributor to ischemic heart disease and to millions of deaths annually. With the increasing use of the internet for health information, tools like ChatGPT (OpenAI, San Francisco, CA, USA) have gained traction. ChatGPT version 4.0, launched in March 2023, offers enhanced features over its predecessor but requires a monthly fee. This study compares the accuracy, comprehensibility, and response length of the free and paid versions of ChatGPT for patient education on hyperlipidemia.

Materials and methods: ChatGPT versions 3.5 and 4.0 were prompted in three different ways and asked 25 questions from the Cleveland Clinic's frequently asked questions (FAQs) on hyperlipidemia. Prompts included no prompting (Form 1), patient-friendly prompting (Form 2), and physician-level prompting (Form 3). Responses were categorized as incorrect, partially correct, or correct. The grade level and word count of each response were also recorded for analysis.

Results: Overall, scoring frequencies for ChatGPT version 3.5 were five (6.67%) incorrect, 18 (24.00%) partially correct, and 52 (69.33%) correct. Scoring frequencies for ChatGPT version 4.0 were one (1.33%) incorrect, 18 (24.00%) partially correct, and 56 (74.67%) correct. Correct answers did not significantly differ between ChatGPT version 3.5 and ChatGPT version 4.0 (p = 0.586). ChatGPT version 3.5 had a significantly higher grade reading level than version 4.0 (p = 0.0002) and a significantly higher word count than version 4.0 (p = 0.0073).

Discussion: There was no significant difference in accuracy between the free and paid versions on hyperlipidemia FAQs. Both versions provided accurate but sometimes only partially complete responses. Version 4.0 offered more concise and readable information, aligning with the readability of most online medical resources despite exceeding the National Institutes of Health's (NIH's) recommended eighth-grade reading level. The paid version also demonstrated superior adaptability in tailoring responses to the prompt.

Conclusion: Both versions of ChatGPT provide reliable medical information, with the paid version offering more adaptable and readable responses. Healthcare providers can recommend ChatGPT as a source of patient education regardless of the version used. Future research should explore diverse question formulations and ChatGPT's handling of incorrect information.
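For a sense of how the version comparison could be reproduced, the sketch below recomputes the reported score percentages and runs between-version tests with scipy. The score counts come from the abstract; the per-response grade-level and word-count lists are placeholder assumptions, not the study's measurements.

```python
# Quick check of the ChatGPT 3.5 vs. 4.0 comparison described above.
from scipy.stats import chi2_contingency, ttest_ind

# 25 questions x 3 prompt forms = 75 responses per model version.
scores = {"3.5": [5, 18, 52], "4.0": [1, 18, 56]}  # incorrect, partial, correct
for version, (inc, part, corr) in scores.items():
    total = inc + part + corr
    print(version, f"correct: {corr / total:.2%}")  # 69.33% and 74.67%

# Do the score distributions differ between versions? (chi-square on the 2x3 table)
_, p_scores, _, _ = chi2_contingency([scores["3.5"], scores["4.0"]])

# Hypothetical per-response readability and length values for the two versions.
fk_35, fk_40 = [14.2, 13.8, 15.0], [11.9, 12.4, 12.1]
wc_35, wc_40 = [310, 295, 330], [240, 250, 230]
p_grade = ttest_ind(fk_35, fk_40).pvalue
p_words = ttest_ind(wc_35, wc_40).pvalue

print(f"scores p = {p_scores:.3f}, grade level p = {p_grade:.4f}, word count p = {p_words:.4f}")
```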
