ABSTRACT
Background: The deployment of OpenAI's ChatGPT-3.5 and its subsequent versions, ChatGPT-4 and ChatGPT-4 With Vision (4V; also known as "GPT-4 Turbo With Vision"), has notably influenced the medical field. Having demonstrated remarkable performance in medical examinations globally, these models show potential for educational applications. However, their effectiveness in non-English contexts, particularly in Chile's medical licensing examinations, a critical step for medical practitioners in Chile, is less explored. This gap highlights the need to evaluate ChatGPT's adaptability to diverse linguistic and cultural contexts. Objective: This study aims to evaluate the performance of ChatGPT versions 3.5, 4, and 4V in the EUNACOM (Examen Único Nacional de Conocimientos de Medicina), a major medical examination in Chile. Methods: Three official practice drills (540 questions) from the University of Chile, mirroring the EUNACOM's structure and difficulty, were used to test ChatGPT versions 3.5, 4, and 4V. Each of the 3 ChatGPT versions was given 3 attempts at each drill. Responses to questions during each attempt were systematically categorized and analyzed to assess their accuracy rate. Results: All versions of ChatGPT passed the EUNACOM drills. Specifically, versions 4 and 4V outperformed version 3.5, achieving average accuracy rates of 79.32% and 78.83%, respectively, compared to 57.53% for version 3.5 (P<.001). Version 4V, however, did not outperform version 4 (P=.73), despite the additional visual capabilities. We also evaluated ChatGPT's performance in different medical areas of the EUNACOM and found that versions 4 and 4V consistently outperformed version 3.5. Across the different medical areas, version 3.5 displayed the highest accuracy in psychiatry (69.84%), while versions 4 and 4V achieved the highest accuracy in surgery (90.00% and 86.11%, respectively). Versions 3.5 and 4 had the lowest performance in internal medicine (52.74% and 75.62%, respectively), while version 4V had the lowest performance in public health (74.07%). Conclusions: This study reveals ChatGPT's ability to pass the EUNACOM, with distinct proficiencies across versions 3.5, 4, and 4V. Notably, advancements in artificial intelligence (AI) have not led to significant improvements in performance on image-based questions. The variations in proficiency across medical fields suggest the need for more nuanced AI training. Additionally, the study underscores the importance of exploring innovative approaches to using AI to augment human cognition and enhance the learning process. Such advancements have the potential to significantly influence medical education, fostering not only knowledge acquisition but also the development of critical thinking and problem-solving skills among health care professionals.
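Illustrative sketch (not part of the original abstract): the reported accuracy rates and version comparisons can be approximated with a pooled two-proportion z-test. The abstract does not state which statistical test the authors used, so the test here is an assumption. The total of 1620 graded items per version (540 questions × 3 attempts) and the counts of correct answers are also assumptions, back-calculated from the reported percentages rather than taken from the study data. The names `TOTAL`, `CORRECT`, `accuracy_pct`, and `two_proportion_p_value` are hypothetical.

```python
import math

# ASSUMPTION: each version was graded on 540 questions x 3 attempts = 1620 items.
TOTAL = 540 * 3

# HYPOTHETICAL counts of correct answers, back-calculated from the reported
# accuracy rates (57.53%, 79.32%, 78.83%); they are not taken from the study data.
CORRECT = {"3.5": 932, "4": 1285, "4V": 1277}


def accuracy_pct(correct: int, total: int) -> float:
    """Accuracy rate as a percentage of correctly answered items."""
    return 100.0 * correct / total


def two_proportion_p_value(c1: int, n1: int, c2: int, n2: int) -> float:
    """Two-sided P value of a pooled two-proportion z-test."""
    p1, p2 = c1 / n1, c2 / n2
    pooled = (c1 + c2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


for version, correct in CORRECT.items():
    print(f"Version {version}: {accuracy_pct(correct, TOTAL):.2f}% correct")

p_4_vs_4v = two_proportion_p_value(CORRECT["4"], TOTAL, CORRECT["4V"], TOTAL)
p_4_vs_35 = two_proportion_p_value(CORRECT["4"], TOTAL, CORRECT["3.5"], TOTAL)
print(f"4 vs 4V:  P = {p_4_vs_4v:.2f}")
print(f"4 vs 3.5: P = {p_4_vs_35:.3g}")
```

Under these assumptions the sketch gives a P value of about .73 for version 4 versus 4V and a value far below .001 for version 4 versus 3.5, in line with the figures reported above. That agreement does not establish which test the study actually applied.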
Subjects
Educational Measurement; Licensure, Medical; Female; Humans; Male; Chile; Clinical Competence/standards; Educational Measurement/methods; Educational Measurement/standards
ABSTRACT
Background: Medical students are reluctant to access mental health services, despite having high rates of anxiety and depression. This reluctance persists through residency and into practice. Physicians and trainees who are unwell deliver lower-quality patient care, behave less professionally, communicate less effectively, and are at an increased risk for burnout and suicide. Little is known about whether students would disclose a mental health diagnosis on a state board medical license application. Objectives: The objectives of this study were to determine whether University of New Mexico School of Medicine (UNM SOM) students would be willing to disclose a mental health diagnosis on a medical licensing application if prompted to do so, and, if not, to identify the reasons for their unwillingness to do so. Design: We electronically invited all UNM SOM students enrolled in the Classes of 2019, 2020, 2021, and 2022 to participate in a confidential REDCap survey about mental health diagnoses and treatment. Four e-mail invitations and reminders were sent to students over a one-month period. Results: The response rate was 50.1%. Thirty-six percent of all respondents considered themselves to have had a mental health condition prior to medical school, and 47% of all respondents perceived a decline in mental health during medical school. The majority of respondents who perceived they had a mental health diagnosis (51%) stated they would not disclose this information on a New Mexico Medical Board (NMMB) license application. Fear of stigmatization, fear of repercussions, and a belief that such disclosure was irrelevant were the top reasons for non-disclosure. Conclusion: Students who perceive themselves to have mental health diagnoses are unlikely to disclose their mental health status on state medical board licensing applications when asked to do so. Addressing barriers to disclosure of mental health diagnoses is necessary for building a healthier physician workforce.