Comparing Vision-Capable Models, GPT-4 and Gemini, With GPT-3.5 on Taiwan's Pulmonologist Exam.
Chen, Chih-Hsiung; Hsieh, Kuang-Yu; Huang, Kuo-En; Lai, Hsien-Yun.
Affiliations
  • Chen CH; Department of Critical Care Medicine, Mennonite Christian Hospital, Hualien City, TWN.
  • Hsieh KY; Department of Critical Care Medicine, Mennonite Christian Hospital, Hualien City, TWN.
  • Huang KE; Department of Critical Care Medicine, Mennonite Christian Hospital, Hualien City, TWN.
  • Lai HY; Department of Education and Research, Mennonite Christian Hospital, Hualien City, TWN.
Cureus ; 16(8): e67641, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39185287
ABSTRACT
Introduction: The latest generation of large language models (LLMs) features multimodal capabilities, allowing them to interpret graphics, images, and videos, which are crucial in medical fields. This study investigates the vision capabilities of the next-generation Generative Pre-trained Transformer 4 (GPT-4) and Google's Gemini.

Methods: To establish a comparative baseline, we used GPT-3.5, a model limited to text processing, and evaluated the performance of both GPT-4 and Gemini on questions from the Taiwan Specialist Board Exams in Pulmonary and Critical Care Medicine. Our dataset included 1,100 questions from 2013 to 2023, with 100 questions per year. Of these, 1,059 were text-only and 41 combined text with images. Most questions were written in a non-English language; only six were entirely in English.

Results: For each annual exam consisting of 100 questions from 2013 to 2023, GPT-4 achieved scores of 66, 69, 51, 64, 72, 64, 66, 64, 63, 68, and 67, respectively. Gemini scored 45, 48, 45, 45, 46, 59, 54, 41, 53, 45, and 45, while GPT-3.5 scored 39, 33, 35, 36, 32, 33, 43, 28, 32, 33, and 36.

Conclusions: These results demonstrate that the newer LLMs with vision capabilities significantly outperform the text-only model. When a passing score of 60 was set, GPT-4 passed 10 of the 11 exams and approached human performance, whereas neither Gemini nor GPT-3.5 passed any.
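
The per-year scores reported for the 2013 to 2023 exams can be summarized with a few lines of Python. This is a minimal sketch for checking the arithmetic, not the authors' analysis: the names `SCORES`, `PASSING_SCORE`, and `summarize` are illustrative, and treating a score of exactly 60 as a pass is an assumption (no reported score equals 60, so the choice does not affect the counts).

```python
from statistics import mean

# Annual scores (out of 100) for the 2013-2023 exams, copied from the abstract.
SCORES = {
    "GPT-4":   [66, 69, 51, 64, 72, 64, 66, 64, 63, 68, 67],
    "Gemini":  [45, 48, 45, 45, 46, 59, 54, 41, 53, 45, 45],
    "GPT-3.5": [39, 33, 35, 36, 32, 33, 43, 28, 32, 33, 36],
}

# Passing score stated in the abstract; ">=" is an assumption of this sketch.
PASSING_SCORE = 60


def summarize(scores, passing=PASSING_SCORE):
    """Return the mean score and number of passed exams for each model."""
    return {
        model: {
            "mean": round(mean(values), 1),
            "passed": sum(v >= passing for v in values),
            "exams": len(values),
        }
        for model, values in scores.items()
    }


summary = summarize(SCORES)
for model, stats in summary.items():
    print(f"{model}: mean {stats['mean']}, "
          f"passed {stats['passed']} of {stats['exams']} exams")
```

Under these assumptions, GPT-4 averages about 64.9 and clears the threshold in 10 of 11 years (the exception is the score of 51), while Gemini (about 47.8) and GPT-3.5 (about 34.5) never reach 60.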

Full text: 1 Collections: 01-international Database: MEDLINE Language: En Journal: Cureus Year of publication: 2024 Document type: Article Country of publication: United States
