Results 1 - 20 of 309
1.
J Med Internet Res ; 26: e58831, 2024 Oct 01.
Article in English | MEDLINE | ID: mdl-39352738

ABSTRACT

BACKGROUND: Artificial intelligence and the language models derived from it, such as ChatGPT, offer immense possibilities, particularly in medicine. It is already evident that ChatGPT can provide adequate and, in some cases, expert-level responses to health-related queries and advice for patients. However, it is currently unknown how patients perceive these capabilities, whether they can benefit from them, and whether they detect potential risks such as harmful suggestions. OBJECTIVE: This study aims to clarify whether patients can obtain useful and safe health care advice from an artificial intelligence chatbot assistant. METHODS: This cross-sectional study used 100 publicly available health-related questions from 5 medical specialties (trauma, general surgery, otolaryngology, pediatrics, and internal medicine) posted on a web-based platform for patients. Responses generated by ChatGPT-4.0 and by an expert panel (EP) of experienced physicians from the same platform were compiled into 10 sets of 10 questions each. Patients evaluated the blinded responses for empathy and usefulness (assessed through the question "Would this answer have helped you?") on a scale from 1 to 5. As a control, 3 physicians in each respective specialty also performed the evaluation and were additionally asked about the potential harm and correctness of each response. RESULTS: In total, 200 question sets were submitted by 64 patients (mean age 45.7, SD 15.9 years; 29/64, 45.3% male), yielding 2000 evaluated answers each for ChatGPT and the EP. ChatGPT scored higher in empathy (4.18 vs 2.7; P<.001) and usefulness (4.04 vs 2.98; P<.001). Subanalysis revealed a small bias: women gave slightly higher empathy ratings than men (4.46 vs 4.14; P=.049). Ratings of ChatGPT were high regardless of participant age. The specialist physicians' evaluations showed the same highly significant pattern, and ChatGPT significantly outperformed the EP in correctness (4.51 vs 3.55; P<.001). Specialists rated the usefulness (3.93 vs 4.59) and correctness (4.62 vs 3.84) significantly lower for potentially harmful ChatGPT responses (P<.001); patients did not. CONCLUSIONS: The results indicate that ChatGPT can support patients with health-related queries better than physicians, at least for written advice through a web-based platform. In this study, ChatGPT's responses contained a lower percentage of potentially harmful advice than those of the web-based EP. However, it is crucial to note that this finding rests on a specific study design and may not generalize to all health care settings. Alarmingly, patients were not able to independently recognize these potential dangers.
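As a sketch of the kind of rating comparison this study reports, the snippet below runs a paired test on two sets of simulated 1-5 ratings. The abstract does not name the statistical test used, so a Wilcoxon signed-rank test, a common choice for paired ordinal data, is assumed here; the ratings are placeholders, not the study's data.

```python
# Hypothetical sketch, not the study's analysis code: paired comparison of
# 1-5 Likert ratings for ChatGPT vs the expert panel (EP).
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(42)
chatgpt = rng.integers(3, 6, size=200)  # simulated ChatGPT ratings (3-5)
expert = rng.integers(1, 5, size=200)   # simulated EP ratings (1-4)

# Assumes each question set was rated for both response sources.
stat, p = wilcoxon(chatgpt, expert)
print(f"ChatGPT mean {chatgpt.mean():.2f} vs EP mean {expert.mean():.2f}, P={p:.3g}")
```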


Subjects
Physician-Patient Relations, Humans, Cross-Sectional Studies, Male, Female, Adult, Middle Aged, Artificial Intelligence, Physicians/psychology, Internet, Empathy, Surveys and Questionnaires
3.
Indian J Palliat Care ; 30(3): 284-287, 2024.
Article in English | MEDLINE | ID: mdl-39371498

ABSTRACT

Palliative care plays a crucial role in comprehensive healthcare, yet misconceptions among patients and caregivers hinder access to services. Artificial Intelligence (AI) chatbots offer potential solutions for debunking myths and providing accurate information. This study aims to evaluate the effectiveness of AI chatbots, ChatGPT and Google Gemini, in debunking palliative care myths. Thirty statements reflecting common palliative care misconceptions were compiled. ChatGPT and Google Gemini generated responses to each statement, which were evaluated by a palliative care expert for accuracy. Sensitivity, positive predictive value, accuracy, and precision were calculated to assess chatbot performance. ChatGPT accurately classified 28 out of 30 statements, achieving a true-positive rate of 93.3% and a true-negative rate of 3.3%. Google Gemini achieved perfect accuracy, correctly classifying all 30 statements. Statistical tests showed no significant difference between chatbots' classifications. Both ChatGPT and Google Gemini demonstrated high accuracy in debunking palliative care myths. These findings suggest that AI chatbots have the potential to effectively dispel misconceptions and improve patient education and awareness in palliative care.
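A brief note on the metrics named above: sensitivity, positive predictive value (precision), and accuracy all derive from the 2x2 confusion matrix. A minimal sketch, using placeholder counts rather than the study's raw classifications:

```python
# Illustrative only: binary classification metrics from confusion-matrix counts.
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    return {
        "sensitivity": tp / (tp + fn),        # true-positive rate
        "ppv_precision": tp / (tp + fp),      # positive predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# Placeholder counts loosely echoing a 28-of-30 result; not the study's data.
print(binary_metrics(tp=27, fp=1, tn=1, fn=1))
```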

4.
JMIR Med Educ ; 10: e57157, 2024 Oct 10.
Article in English | MEDLINE | ID: mdl-39388702

ABSTRACT

Background: The 2024 nephrology fellowship match data show declining interest in nephrology in the United States, with an 11% drop in candidates and only 66% (321/488) of positions filled. Objective: The study aims to discern the factors influencing this trend using ChatGPT, a leading chatbot model, for insights into the comparative appeal of nephrology versus other internal medicine specialties. Methods: Using the GPT-4 model, the study compared nephrology with 13 other internal medicine specialties, evaluating each on 7 criteria: intellectual complexity, work-life balance, procedural involvement, research opportunities, patient relationships, career demand, and financial compensation. Each criterion was scored from 1 to 10, with the cumulative score determining the ranking. To counteract potential bias, GPT-4 was also instructed to favor other specialties over nephrology in reverse scenarios. Results: GPT-4 ranked nephrology above only sleep medicine. While nephrology scored higher than hospice and palliative medicine, it fell short in key criteria such as work-life balance, patient relationships, and career demand. In the 2024 appointment year match, nephrology's fill rate of 66% was higher only than geriatric medicine's 45% (155/348). Nephrology's score decreased by 4%-14% on 5 criteria: intellectual challenge and complexity, procedural involvement, career opportunity and demand, research and academic opportunities, and financial compensation. Conclusions: ChatGPT does not favor nephrology over most internal medicine specialties, highlighting its diminishing appeal as a career choice. This trend raises significant concerns, especially given the overall physician shortage, and prompts a reevaluation of the factors affecting specialty choice among medical residents.
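The cumulative-score ranking described in the Methods can be sketched in a few lines; the code below is a hypothetical illustration (the criterion scores are invented placeholders, not GPT-4's actual output).

```python
# The 7 criteria follow the study's list; scores are made up for illustration.
criteria = ["intellectual complexity", "work-life balance", "procedural involvement",
            "research opportunities", "patient relationships", "career demand",
            "financial compensation"]
scores = {
    "cardiology":     [9, 5, 9, 9, 7, 9, 9],
    "nephrology":     [9, 5, 4, 8, 6, 5, 5],
    "sleep medicine": [6, 8, 3, 5, 5, 4, 6],
}

# Rank specialties by cumulative score, highest first.
for specialty in sorted(scores, key=lambda s: sum(scores[s]), reverse=True):
    print(f"{specialty}: cumulative score {sum(scores[specialty])}")
```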


Assuntos
Escolha da Profissão , Medicina Interna , Nefrologia , Pesquisa Qualitativa , Estados Unidos , Humanos , Nefrologia/educação , Medicina Interna/educação , Internato e Residência/estatística & dados numéricos
5.
Article in English | MEDLINE | ID: mdl-39277830

ABSTRACT

INTRODUCTION: The rapid advancement of artificial intelligence (AI), particularly in large language models like ChatGPT and Google's Gemini AI, marks a transformative era in technological innovation. This study explores the potential of AI in ophthalmology, focusing on the capabilities of ChatGPT and Gemini AI. While these models hold promise for medical education and clinical support, their integration requires comprehensive evaluation. This research aims to bridge a gap in the literature by comparing Gemini AI and ChatGPT, assessing their performance against ophthalmology residents using a dataset derived from ophthalmology board exams. METHODS: A dataset of 600 questions across 12 subspecialties was curated from Israeli ophthalmology residency exams, encompassing text- and image-based formats. Four AI models (ChatGPT-3.5, ChatGPT-4, Gemini, and Gemini Advanced) were tested on this dataset. The study includes a comparative analysis with Israeli ophthalmology residents, employing specific metrics for performance assessment. RESULTS: Gemini Advanced demonstrated the best performance, with a 66% accuracy rate. ChatGPT-4 showed improvement at 62%, Gemini reached 58%, and ChatGPT-3.5 served as the reference at 46%. Comparative analysis with residents offered insights into the AI models' performance relative to human-level medical knowledge. Further analysis examined yearly performance trends, topic-specific variations, and the impact of images on chatbot accuracy. CONCLUSION: The study reveals nuanced AI model capabilities in ophthalmology, emphasizing domain-specific variations. Gemini Advanced's superior performance indicates significant advancement, and ChatGPT-4's improvement is noteworthy; both Gemini and ChatGPT-3.5 also performed commendably. The comparative analysis underscores AI's evolving role as a supplementary tool in medical education. This research contributes vital insights into AI effectiveness in ophthalmology and highlights areas for refinement. As AI models evolve, targeted improvements can enhance adaptability across subspecialties, making them valuable tools for medical professionals and enriching patient care. KEY MESSAGES: What is known: AI breakthroughs, like ChatGPT and Google's Gemini AI, are reshaping healthcare; in ophthalmology, AI integration has overhauled clinical workflows, particularly in analyzing images for diseases like diabetic retinopathy and glaucoma. What is new: This study presents a pioneering comparison of Gemini AI and ChatGPT against ophthalmology residents using a meticulously curated dataset derived from real-world ophthalmology board exams. Gemini Advanced demonstrated superior performance, showcasing substantial advancement, and the evolution of ChatGPT-4 also merits attention; both models exhibit commendable capabilities. These findings offer crucial insights into the efficacy of AI in ophthalmology and shed light on areas ripe for further enhancement and optimization.

6.
Adv Exp Med Biol ; 1456: 307-331, 2024.
Article in English | MEDLINE | ID: mdl-39261436

ABSTRACT

The chapter provides an in-depth analysis of digital therapeutics (DTx) as a revolutionary approach to managing major depressive disorder (MDD). It discusses the evolution and definition of DTx, their application across various medical fields, regulatory considerations, and their benefits and limitations. This chapter extensively covers DTx for MDD, including smartphone applications, virtual reality interventions, cognitive-behavioral therapy (CBT) platforms, artificial intelligence (AI) and chatbot therapies, biofeedback, wearable technologies, and serious games. It evaluates the effectiveness of these digital interventions, comparing them with traditional treatments and examining patient perspectives, compliance, and engagement. The integration of DTx into clinical practice is also explored, along with the challenges and barriers to their adoption, such as technological limitations, data privacy concerns, ethical considerations, reimbursement issues, and the need for improved digital literacy. This chapter concludes by looking at the future direction of DTx in mental healthcare, emphasizing the need for personalized treatment plans, integration with emerging modalities, and the expansion of access to these innovative solutions globally.


Assuntos
Inteligência Artificial , Terapia Cognitivo-Comportamental , Transtorno Depressivo Maior , Humanos , Transtorno Depressivo Maior/terapia , Terapia Cognitivo-Comportamental/métodos , Telemedicina/tendências , Aplicativos Móveis , Biorretroalimentação Psicológica/métodos , Smartphone , Dispositivos Eletrônicos Vestíveis , Jogos de Vídeo
8.
Cureus ; 16(8): e66155, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39233945

ABSTRACT

ChatGPT dependency is characterized by excessive reliance on AI-driven conversational agents, such as ChatGPT, within the healthcare sector. This article explores the consequences of overreliance on AI chatbots like ChatGPT in healthcare settings. It discusses the increasing use of AI chatbots for patient consultations, information dissemination, and decision support, highlighting their potential to improve healthcare delivery efficiency and patient outcomes. The editorial examines the factors contributing to ChatGPT Dependency Disorder among healthcare professionals, such as convenience, lack of training, and time constraints, along with the challenges and benefits of integrating AI chatbots into clinical workflows. It emphasizes the importance of maintaining a human-centered approach alongside AI technologies to optimize patient care outcomes, and it underscores the need for responsible integration of AI chatbots in healthcare settings to uphold ethical standards and patient safety. The article concludes by calling for further research and strategies to address ChatGPT Dependency Disorder and promote a balanced approach to leveraging AI technology in healthcare practice.

9.
Inn Med (Heidelb) ; 2024 Sep 26.
Article in German | MEDLINE | ID: mdl-39327285

ABSTRACT

The coronavirus disease 2019 (COVID-19) pandemic emphasized the importance of vaccination for preventing life-threatening diseases and avoiding the overburdening of the healthcare system. Despite the clear benefits of vaccination, increasing vaccine hesitancy has been observed worldwide, especially among young people who are potential future parents. Vaccine hesitancy describes the delayed acceptance of, or lacking willingness to receive, recommended vaccinations and represents a substantial challenge for public health. This article analyzes the causes of vaccine hesitancy in the postpandemic period and discusses factors that can make communication successful. The roles of artificial intelligence and of structured, evidence-based discussion techniques, such as the empathetic refutational interview, are emphasized. The aim is to offer practice-oriented recommendations that give physicians tools for counseling uncertain patients and promoting the acceptance of vaccinations.

10.
Eur Geriatr Med ; 2024 Sep 25.
Article in English | MEDLINE | ID: mdl-39320544

ABSTRACT

INTRODUCTION: Generative artificial intelligence (AI) is a technological innovation with wide applicability in daily life that could help older people. However, it raises potential concerns, such as biases, omissions, and errors. METHODS: A descriptive study was conducted by administering the negative stereotypes toward aging questionnaire (CENVE) to the chatbots ChatGPT, Gemini, Perplexity, YOUChat, and Copilot. RESULTS: Three of the chatbots studied produced negatively stereotyped responses more than 50% of the time, with Copilot showing a high level of ageism, followed by Perplexity. In the health section, Copilot gave the responses with the most negative connotations regarding old age (13 out of 20 points). In the personality section, Copilot scored 14 out of 20, followed by YOUChat. CONCLUSION: Copilot responded to the statements more ageistically than the other platforms. These results highlight the importance of addressing potential biases in AI to ensure that the responses provided are fair and respectful for all potential users.

11.
JMIR Ment Health ; 11: e58493, 2024 Sep 19.
Article in English | MEDLINE | ID: mdl-39298759

ABSTRACT

This article contends that the responsible artificial intelligence (AI) approach, the dominant ethics approach underlying most regulatory and ethical guidance, falls short because it overlooks the impact of AI on human relationships. Focusing only on responsible AI principles reinforces a narrow concept of the accountability and responsibility of companies developing AI. This article proposes that applying the ethics of care approach to AI regulation can offer a more comprehensive regulatory and ethical framework that addresses AI's impact on human relationships. This dual approach is essential for the effective regulation of AI in the domain of mental health care. The article delves into the emergence of a new "therapeutic" space facilitated by AI-based bots that operate without a therapist. It highlights the difficulties involved, chiefly the absence of a defined duty of care toward users, and shows how implementing the ethics of care can establish clear responsibilities for developers. It also sheds light on the potential for emotional manipulation and the risks involved. In conclusion, the article proposes a series of considerations, grounded in the ethics of care, for the development process of AI-powered therapeutic tools.


Subjects
Artificial Intelligence, Artificial Intelligence/ethics, Humans, Mental Health Services/ethics, Mental Health Services/legislation & jurisprudence, Mental Health/ethics
12.
J Med Internet Res ; 26: e55164, 2024 Sep 30.
Article in English | MEDLINE | ID: mdl-39348188

ABSTRACT

BACKGROUND: Family health history (FHx) is an important predictor of a person's genetic risk but is not collected by many adults in the United States. OBJECTIVE: This study aims to test and compare the usability, engagement, and report usefulness of 2 web-based methods to collect FHx. METHODS: This mixed methods study compared FHx data collection using a flow-based chatbot (KIT; the curious interactive test) and a form-based method. KIT's design was optimized to reduce user burden. We recruited individuals from 2 crowdsourced platforms and randomized them to 1 of the 2 FHx methods. All participants were asked to complete a questionnaire assessing the method's usability, the usefulness of a report summarizing their experience, user-desired chatbot enhancements, and general user experience. Engagement was studied using log data collected by the methods. We used qualitative findings from analyzing free-text comments to supplement the primary quantitative results. RESULTS: Participants randomized to KIT reported higher usability than those randomized to the form, with a mean System Usability Scale score of 80.2 versus 61.9 (P<.001). The engagement analysis reflected design differences in the onboarding process: KIT users spent less time entering FHx information and reported fewer conditions than form users (mean 5.90 vs 7.97 min; P=.04; and mean 7.8 vs 10.1 conditions; P=.04). Both KIT and form users somewhat agreed that the report was useful (Likert scale ratings of 4.08 and 4.29, respectively). Among desired enhancements, personalization was the highest-rated feature (188/205, 91.7% rated it medium- to high-priority). Qualitative analyses revealed positive and negative characteristics of both KIT and the form-based method. Most respondents randomized to KIT indicated it was easy to use and navigate and that they could respond to and understand user prompts. Negative comments addressed KIT's personality, conversational pace, and error handling. For both KIT and form respondents, qualitative results revealed common themes, including a desire for more information about conditions and an appreciation for the multiple-choice button response format. Respondents also wanted to report health information beyond KIT's prompts (eg, personal health history) and wanted KIT to provide more personalized responses. CONCLUSIONS: We showed that KIT provided a usable way to collect FHx. We also identified design considerations to improve chatbot-based FHx data collection: First, the final report summarizing the FHx collection experience should be enhanced to provide more value for patients. Second, the onboarding chatbot prompt may impact data quality and should be carefully considered. Finally, we highlighted several areas that could be improved by moving from a flow-based chatbot to a large language model implementation strategy.
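The usability figures above are System Usability Scale (SUS) scores, which map ten 1-5 Likert responses onto a 0-100 scale: odd items contribute (response - 1), even items contribute (5 - response), and the sum is multiplied by 2.5. A minimal sketch with made-up responses:

```python
# Standard SUS scoring; the responses below are placeholders for illustration.
def sus_score(responses: list[int]) -> float:
    """responses: ten 1-5 Likert answers, in questionnaire order (items 1-10)."""
    assert len(responses) == 10
    total = sum(r - 1 if i % 2 == 0 else 5 - r  # even index = odd-numbered item
                for i, r in enumerate(responses))
    return total * 2.5  # scales the 0-40 raw sum to 0-100

print(sus_score([5, 2, 4, 1, 5, 2, 5, 1, 4, 2]))  # -> 87.5
```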


Assuntos
Anamnese , Humanos , Feminino , Masculino , Anamnese/métodos , Anamnese/estatística & dados numéricos , Adulto , Saúde da Família , Inquéritos e Questionários , Pessoa de Meia-Idade , Coleta de Dados/métodos , Internet
13.
Heliyon ; 10(18): e37238, 2024 Sep 30.
Article in English | MEDLINE | ID: mdl-39309898

ABSTRACT

The integration of AI-based chatbots in language education has garnered significant attention, yet the interplay between chatbots and positive psychology remains underexplored. Filling this gap through a critical analysis of existing theories, measurement scales, and empirical evidence, this paper evaluates the potential benefits and drawbacks of incorporating AI chatbots in language learning environments and how they may positively or negatively affect the emotional dimensions of language acquisition. The findings reveal that the primary advantages of AI chatbots are personalized instruction with rapid feedback, reduced anxiety and increased motivation, greater learner independence and self-directed learning, and the fostering of metacognitive abilities. Conversely, the identified obstacles encompass restricted emotional awareness, a lack of genuine human interaction, ethical dilemmas and privacy issues, and the potential reinforcement of biases and stereotypes. By highlighting the importance of learner emotions in the language learning process, this conceptual review underscores the need for a nuanced understanding of how AI chatbots can support or hinder emotional engagement and motivation. The paper discusses the factors shaping the impact of AI-based chatbots in language education and strategies for addressing challenges and optimizing chatbot-learner interactions, such as incorporating affective computing techniques and designing culturally sensitive chatbots. Finally, the article outlines future research directions, emphasizing the need for validated emotion scales in chatbot-assisted language learning contexts, longitudinal studies, mixed methods research, comparative analyses, and investigations into the role of chatbots in fostering emotional intelligence and intercultural competence.

14.
Cureus ; 16(8): e66857, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39280487

ABSTRACT

Depression is a prevalent mental health disorder that significantly impacts primary care settings. This editorial explores the potential of artificial intelligence (AI)-powered chatbots in managing depression within primary care environments. AI chatbots offer innovative solutions to challenges faced by healthcare providers, including limited appointment times, delayed access to specialists, and stigma associated with mental health issues. These digital tools provide continuous support, personalized interactions, and early symptom detection, potentially improving accessibility and outcomes in depression management. The integration of AI chatbots in primary care presents opportunities for round-the-clock patient support, personalized interventions, and the reduction of mental health stigma. However, challenges persist, including concerns about assessment accuracy, data privacy, and integration with existing healthcare systems. Successful implementation requires systematic approaches, stakeholder engagement, and comprehensive training for healthcare providers. Ethical considerations, such as ensuring informed consent, managing algorithmic biases, and maintaining the human element in care, are crucial for responsible deployment. As AI technology evolves, future directions may include enhanced natural language processing, multimodal integration, and AI-augmented clinical decision support. This editorial emphasizes the need for a balanced approach that leverages the potential of AI while acknowledging its limitations and the irreplaceable value of human clinical judgment in depression management within primary care settings.

15.
JMIR Ment Health ; 11: e58974, 2024 Sep 09.
Article in English | MEDLINE | ID: mdl-39250799

ABSTRACT

BACKGROUND: The demand for mental health (MH) services in the community continues to exceed supply. At the same time, technological developments make the use of artificial intelligence-empowered conversational agents (CAs) a real possibility to help fill this gap. OBJECTIVE: The objective of this review was to identify existing empathic CA design architectures within the MH care sector and to assess their technical performance in detecting and responding to user emotions in terms of classification accuracy. In addition, the approaches used to evaluate empathic CAs within the MH care sector in terms of their acceptability to users were considered. Finally, this review aimed to identify limitations and future directions for empathic CAs in MH care. METHODS: A systematic literature search was conducted across 6 academic databases to identify journal articles and conference proceedings using search terms covering 3 topics: "conversational agents," "mental health," and "empathy." Only studies discussing CA interventions for the MH care domain were eligible for this review, with both textual and vocal characteristics considered as possible data inputs. Quality was assessed using appropriate risk of bias and quality tools. RESULTS: A total of 19 articles met all inclusion criteria. Most (12/19, 63%) of these empathic CA designs in MH care were machine learning (ML) based, with 26% (5/19) hybrid engines and 11% (2/19) rule-based systems. Among the ML-based CAs, 47% (9/19) used neural networks, with transformer-based architectures being well represented (7/19, 37%). The remaining 16% (3/19) of the ML models were unspecified. Technical assessments of these CAs focused on response accuracies and their ability to recognize, predict, and classify user emotions. While single-engine CAs demonstrated good accuracy, the hybrid engines achieved higher accuracy and provided more nuanced responses. Of the 19 studies, human evaluations were conducted in 16 (84%), with only 5 (26%) focusing directly on the CA's empathic features. All these papers used self-reports for measuring empathy, including single or multiple (scale) ratings or qualitative feedback from in-depth interviews. Only 1 (5%) paper included evaluations by both CA users and experts, adding more value to the process. CONCLUSIONS: The integration of CA design and its evaluation is crucial to produce empathic CAs. Future studies should focus on using a clear definition of empathy and standardized scales for empathy measurement, ideally including expert assessment. In addition, the diversity in measures used for technical assessment and evaluation poses a challenge for comparing CA performances, which future research should also address. However, CAs with good technical and empathic performance are already available to users of MH care services, showing promise for new applications, such as helpline services.


Subjects
Empathy, Mental Health Services, Humans, Artificial Intelligence
16.
J Med Internet Res ; 26: e49387, 2024 Sep 25.
Article in English | MEDLINE | ID: mdl-39320936

ABSTRACT

BACKGROUND: In recent years, there has been an increase in the use of conversational agents for health promotion and service delivery. To date, health professionals' views on the use of this technology have received limited attention in the literature. OBJECTIVE: The purpose of this study was to gain a better understanding of how health professionals view the use of conversational agents for health care. METHODS: Physicians, nurses, and regulated mental health professionals were recruited using various web-based methods. Participants were interviewed individually using the Zoom (Zoom Video Communications, Inc) videoconferencing platform. Interview questions focused on the potential benefits and risks of using conversational agents for health care, as well as the best way to integrate conversational agents into the health care system. Interviews were transcribed verbatim and uploaded to NVivo (version 12; QSR International, Inc) for thematic analysis. RESULTS: A total of 24 health professionals participated in the study (19 women, 5 men; mean age 42.75, SD 10.71 years). Participants said that the use of conversational agents for health care could have certain benefits, such as greater access to care for patients or clients and workload support for health professionals. They also discussed potential drawbacks, such as an added burden on health professionals (eg, program familiarization) and the limited capabilities of these programs. Participants said that conversational agents could be used for routine or basic tasks, such as screening and assessment, providing information and education, and supporting individuals between appointments. They also said that health professionals should have some oversight in terms of the development and implementation of these programs. CONCLUSIONS: The results of this study provide insight into health professionals' views on the use of conversational agents for health care, particularly in terms of the benefits and drawbacks of these programs and how they should be integrated into the health care system. These collective findings offer useful information and guidance to stakeholders who have an interest in the development and implementation of this technology.


Assuntos
Pessoal de Saúde , Pesquisa Qualitativa , Humanos , Feminino , Masculino , Adulto , Pessoal de Saúde/psicologia , Pessoa de Meia-Idade , Comunicação , Atitude do Pessoal de Saúde , Comunicação por Videoconferência , Atenção à Saúde
17.
JMIR Med Educ ; 10: e59213, 2024 Aug 16.
Article in English | MEDLINE | ID: mdl-39150749

ABSTRACT

BACKGROUND: Although history taking is fundamental for diagnosing medical conditions, teaching and providing feedback on the skill can be challenging due to resource constraints. Virtual simulated patients and web-based chatbots have thus emerged as educational tools, with recent advancements in artificial intelligence (AI) such as large language models (LLMs) enhancing their realism and potential to provide feedback. OBJECTIVE: We aimed to evaluate the effectiveness of a Generative Pretrained Transformer (GPT) 4 model in providing structured feedback on medical students' performance in history taking with a simulated patient. METHODS: We conducted a prospective study in which medical students performed history taking with a GPT-powered chatbot. We designed the chatbot to simulate patients' responses and provide immediate feedback on the comprehensiveness of the students' history taking. Students' interactions with the chatbot were analyzed, and the chatbot's feedback was compared with that of a human rater. We measured interrater reliability and performed a descriptive analysis to assess the quality of feedback. RESULTS: Most of the study's participants were in their third year of medical school. A total of 1894 question-answer pairs from 106 conversations were included in our analysis. GPT-4's role-play and responses were medically plausible in more than 99% of cases. Interrater reliability between GPT-4 and the human rater showed "almost perfect" agreement (Cohen κ=0.832). Lower agreement (κ<0.6), detected for 8 of 45 feedback categories, highlighted topics on which the model's assessments were overly specific or diverged from human judgment. CONCLUSIONS: The GPT model was effective in providing structured feedback on history-taking dialogs produced by medical students. Although we identified some limitations in the specificity of feedback for certain categories, the overall high agreement with human raters suggests that LLMs can be a valuable tool for medical education. Our findings thus support the careful integration of AI-driven feedback mechanisms in medical training and highlight important considerations for using LLMs in that context.
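The agreement statistic cited above is Cohen's kappa, which corrects raw interrater agreement for chance agreement; values above 0.8 are conventionally labeled "almost perfect." A minimal sketch with placeholder labels, not the study's actual feedback categories:

```python
# Illustrative only: Cohen's kappa between two raters over the same items.
from sklearn.metrics import cohen_kappa_score

# Placeholder per-item judgments (e.g., was a history topic covered?).
gpt4_labels  = ["covered", "covered", "missed", "covered", "missed", "covered"]
human_labels = ["covered", "covered", "missed", "covered", "covered", "covered"]

print(f"Cohen kappa = {cohen_kappa_score(gpt4_labels, human_labels):.3f}")
```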


Assuntos
Anamnese , Simulação de Paciente , Estudantes de Medicina , Humanos , Estudos Prospectivos , Anamnese/métodos , Anamnese/normas , Estudantes de Medicina/psicologia , Feminino , Masculino , Competência Clínica/normas , Inteligência Artificial , Retroalimentação , Reprodutibilidade dos Testes , Educação de Graduação em Medicina/métodos
18.
J Pers Med ; 14(8)2024 Aug 19.
Article in English | MEDLINE | ID: mdl-39202068

ABSTRACT

The emergence of digitalization and artificial intelligence has had a profound impact on society, especially in the field of medicine. Digital health is now a reality, with an increasing number of people using chatbots for prognostic or diagnostic purposes, therapeutic planning, and monitoring, as well as for nutritional and mental health support. Initially designed for various purposes, chatbots have demonstrated significant advantages in the medical field, as indicated by multiple sources. However, there are conflicting views in the current literature, with some sources highlighting their drawbacks and limitations, particularly in their use in oncology. This state-of-the-art review article seeks to present both the benefits and the drawbacks of chatbots in the context of medicine and cancer, while also addressing the challenges in their implementation, offering expert insights on the subject.

19.
Ann Hepatol ; 30(1): 101537, 2024 Aug 13.
Article in English | MEDLINE | ID: mdl-39147133

ABSTRACT

INTRODUCTION AND OBJECTIVES: Autoimmune liver diseases (AILDs) are rare and require precise evaluation, which is often challenging for medical providers. Chatbots are innovative solutions that may assist healthcare professionals in clinical management. In our study, ten liver specialists systematically evaluated four chatbots to determine their utility as clinical decision support tools in the field of AILDs. MATERIALS AND METHODS: We constructed a 56-question questionnaire focusing on AILD evaluation, diagnosis, and management of autoimmune hepatitis (AIH), primary biliary cholangitis (PBC), and primary sclerosing cholangitis (PSC). Four chatbots (ChatGPT 3.5, Claude, Microsoft Copilot, and Google Bard) were presented with the questions in their free tiers in December 2023. Responses underwent critical evaluation by ten liver specialists using a standardized 1-10 Likert scale. The analysis included mean scores, the number of highest-rated replies, and the identification of common shortcomings in chatbot performance. RESULTS: Among the assessed chatbots, specialists rated Claude highest, with a mean score of 7.37 (SD = 1.91), followed by ChatGPT (7.17, SD = 1.89), Microsoft Copilot (6.63, SD = 2.10), and Google Bard (6.52, SD = 2.27). Claude also excelled with 27 best-rated replies, outperforming ChatGPT (20), while Microsoft Copilot and Google Bard lagged with only 6 and 9, respectively. Common deficiencies included listing details instead of giving specific advice, limited dosing options, inaccuracies for pregnant patients, insufficient recent data, over-reliance on CT and MRI imaging, and inadequate discussion of off-label use and of fibrates in PBC treatment. Notably, internet access for Microsoft Copilot and Google Bard did not enhance precision compared with the pre-trained models. CONCLUSIONS: Chatbots hold promise for AILD support, but our study underscores key areas for improvement. Refinement is needed in providing specific advice, accuracy, and focused, up-to-date information. Addressing these shortcomings is essential for enhancing the utility of chatbots in AILD management, guiding future development, and ensuring their effectiveness as clinical decision support tools.
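As an illustration of the analysis described above (mean Likert scores per chatbot and counts of best-rated replies), the sketch below uses invented ratings for two chatbots and two questions; it is not the study's code or data.

```python
# Placeholder ratings: rows are questions, columns are specialist raters (1-10 scale).
import numpy as np

ratings = {
    "Claude":  np.array([[8, 7, 9], [7, 8, 7]]),
    "ChatGPT": np.array([[7, 7, 8], [8, 6, 7]]),
}

# Overall mean and SD per chatbot.
for bot, r in ratings.items():
    print(f"{bot}: mean {r.mean():.2f}, SD {r.std(ddof=1):.2f}")

# "Best-rated replies": for each question, which chatbot's mean rating is highest.
per_question = {bot: r.mean(axis=1) for bot, r in ratings.items()}
best = [max(per_question, key=lambda b: per_question[b][q]) for q in range(2)]
print("best-rated reply per question:", best)
```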

20.
JMIR Med Inform ; 12: e56426, 2024 Aug 08.
Article in English | MEDLINE | ID: mdl-39115930

ABSTRACT

BACKGROUND: Chronic hepatitis B (CHB) imposes substantial economic and social burdens globally. The management of CHB involves intricate monitoring and adherence challenges, particularly in regions like China, where a high prevalence of CHB intersects with health care resource limitations. This study explores the potential of ChatGPT-3.5, an emerging artificial intelligence (AI) assistant, to address these complexities. Given its notable capabilities in medical education and practice, ChatGPT-3.5's role in managing CHB is examined, particularly in regions with distinct health care landscapes. OBJECTIVE: This study aimed to uncover insights into ChatGPT-3.5's potential and limitations in delivering personalized medical consultation assistance for CHB patients across diverse linguistic contexts. METHODS: Questions sourced from published guidelines, online CHB communities, and search engines in English and Chinese were refined, translated, and compiled into 96 inquiries. These questions were then presented to both ChatGPT-3.5 and ChatGPT-4.0 in independent dialogues. The responses were evaluated by senior physicians, focusing on informativeness, emotional management, consistency across repeated inquiries, and cautionary statements regarding medical advice. Additionally, a true-or-false questionnaire was employed to further discern the difference in information accuracy for closed questions between ChatGPT-3.5 and ChatGPT-4.0. RESULTS: Over half of the responses (228/370, 61.6%) from ChatGPT-3.5 were considered comprehensive; ChatGPT-4.0 exhibited a higher percentage at 74.5% (172/222; P<.001). Notably, performance was superior in English, particularly in informativeness and consistency across repeated queries. However, deficiencies were identified in emotional management guidance, which appeared in only 3.2% (6/186) of ChatGPT-3.5 responses and 8.1% (15/154) of ChatGPT-4.0 responses (P=.04). ChatGPT-3.5 included a disclaimer in 10.8% (24/222) of responses, while ChatGPT-4.0 did so in 13.1% (29/222) (P=.46). On true-or-false questions, ChatGPT-4.0 achieved an accuracy rate of 93.3% (168/180), significantly surpassing ChatGPT-3.5's 65.0% (117/180; P<.001). CONCLUSIONS: In this study, ChatGPT demonstrated basic capabilities as a medical consultation assistant for CHB management. The working language was considered a potential factor influencing ChatGPT-3.5's performance, particularly in its use of terminology and colloquial language, which potentially affects its applicability within specific target populations. As an updated model, ChatGPT-4.0 exhibited improved information processing capabilities, overcoming the effect of language on information accuracy. This suggests that the implications of model advancement should be considered when selecting large language models as medical consultation assistants. Given that both models performed inadequately in emotional guidance, this study highlights the importance of providing specific language training and emotional management strategies when deploying ChatGPT for medical purposes. Furthermore, the tendency of these models to append disclaimers in conversations should be further investigated to understand its impact on patient experience in practical applications.
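The abstract does not state which statistical test produced the P values above; one common choice for comparing two accuracy proportions is a chi-square test on the 2x2 contingency table, sketched below with the reported true-or-false counts.

```python
# Assumed test, not confirmed by the abstract: chi-square on correct/incorrect counts.
from scipy.stats import chi2_contingency

table = [[168, 180 - 168],   # ChatGPT-4.0: correct, incorrect
         [117, 180 - 117]]   # ChatGPT-3.5: correct, incorrect

chi2, p, dof, _ = chi2_contingency(table)
print(f"chi2={chi2:.2f}, dof={dof}, P={p:.2g}")
```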
