Performance of large language artificial intelligence models on solving restorative dentistry and endodontics student assessments.

Künzle, Paul; Paris, Sebastian

Affiliation

Künzle P; Department of Operative, Preventive and Pediatric Dentistry, Charité - Universitätsmedizin Berlin, Aßmannshauser Str. 4-6, Berlin, 14197, Germany. paul.kuenzle@charite.de.
Paris S; Department of Operative, Preventive and Pediatric Dentistry, Charité - Universitätsmedizin Berlin, Aßmannshauser Str. 4-6, Berlin, 14197, Germany.

Clin Oral Investig ; 28(11): 575, 2024 Oct 07.

Article in English | MEDLINE | ID: mdl-39373739

ABSTRACT

OBJECTIVES:

The advent of artificial intelligence (AI) and large language model (LLM)-based AI applications (LLMAs) has tremendous implications for our society. This study analyzed the performance of LLMAs on solving restorative dentistry and endodontics (RDE) student assessment questions.

MATERIALS AND METHODS:

A total of 151 questions from an RDE question pool were prepared for prompting LLMAs from OpenAI (ChatGPT-3.5, -4.0 and -4.0o) and Google (Gemini 1.0). Multiple-choice questions were sorted into four question subcategories and entered into the LLMAs, and the answers were recorded for analysis. P-value and chi-square statistical analyses were performed using Python 3.9.16.
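The abstract does not say how the chi-square comparison was set up, so the following is only a minimal sketch of one plausible reading: a pairwise Pearson chi-square test (1 degree of freedom, no continuity correction) on correct/incorrect counts for two models. The function name `chi_square_2x2` is hypothetical. The counts are approximations reconstructed from the rounded percentages reported in the Results and the 151-question total, not the study's actual data. The 0.05 threshold is likewise an assumption.

```python
import math
from itertools import combinations


def chi_square_2x2(correct_a, total_a, correct_b, total_b):
    """Pearson chi-square test on a 2x2 table of correct/incorrect answers.

    Returns (statistic, p_value). No continuity correction is applied;
    whether the study used one is not stated in the abstract.
    """
    wrong_a = total_a - correct_a
    wrong_b = total_b - correct_b
    n = total_a + total_b
    col_correct = correct_a + correct_b
    col_wrong = wrong_a + wrong_b
    # If any margin is zero the test is undefined; report "no difference".
    if 0 in (total_a, total_b, col_correct, col_wrong):
        return 0.0, 1.0
    stat = (n * (correct_a * wrong_b - wrong_a * correct_b) ** 2
            / (total_a * total_b * col_correct * col_wrong))
    # For 1 degree of freedom the chi-square survival function
    # equals erfc(sqrt(x / 2)), so no external library is needed.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# ASSUMPTION: every model answered all 151 questions, and the counts
# below are back-calculated from rounded percentages (illustrative only).
N_QUESTIONS = 151
reported_accuracy = {
    "ChatGPT-4.0o": 0.72,
    "ChatGPT-4.0": 0.62,
    "Gemini 1.0": 0.44,
    "ChatGPT-3.5": 0.25,
}
approx_correct = {
    model: round(acc * N_QUESTIONS) for model, acc in reported_accuracy.items()
}

for model_a, model_b in combinations(approx_correct, 2):
    stat, p = chi_square_2x2(
        approx_correct[model_a], N_QUESTIONS,
        approx_correct[model_b], N_QUESTIONS,
    )
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"{model_a} vs {model_b}: chi2={stat:.2f}, p={p:.4f} ({verdict})")
```

Because the inputs are reconstructed and unadjusted for multiple comparisons, the printed values should be treated as an illustration of the technique rather than a reproduction of the published statistics.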

RESULTS:

The total answer accuracy of ChatGPT-4.0o was the highest, followed by ChatGPT-4.0, Gemini 1.0 and ChatGPT-3.5 (72%, 62%, 44% and 25%, respectively), with significant differences between all LLMAs except between the GPT-4.0 models. Performance was highest on the subcategories direct restorations and caries, followed by indirect restorations and endodontics.

CONCLUSIONS:

Overall, there are large performance differences among LLMAs. Only the ChatGPT-4 models achieved a success ratio that could be used with caution to support the dental academic curriculum.

CLINICAL RELEVANCE:

While LLMAs could support clinicians in answering dental field-related questions, this capacity depends strongly on the model employed. The best-performing model, ChatGPT-4.0o, achieved acceptable accuracy rates in some of the subject subcategories analyzed.

Subjects

Artificial Intelligence; Endodontics; Humans; Endodontics/education; Education, Dental/methods; Educational Measurement/methods; Students, Dental; Dentistry, Operative/education; Clinical Competence; Surveys and Questionnaires

Keywords

Artificial intelligence; ChatGPT; Gemini; GenAI; Natural language processing

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Artificial Intelligence / Endodontics Limit: Humans Language: En Journal: Clin Oral Investig / Clin. oral investig / Clinical oral investigations Journal subject: DENTISTRY Year of publication: 2024 Document type: Article Affiliation country: Germany Publication country: Germany