Development and Evaluation of Aeyeconsult: A Novel Ophthalmology Chatbot Leveraging Verified Textbook Knowledge and GPT-4.

Singer, Maxwell B; Fu, Julia J; Chow, Jessica; Teng, Christopher C.

J Surg Educ ; 81(3): 438-443, 2024 Mar.

Article in English | MEDLINE | ID: mdl-38135548

ABSTRACT

OBJECTIVE: There has been much excitement about the use of large language models (LLMs) such as ChatGPT in ophthalmology. However, LLMs are limited in that they are trained on unverified information and do not cite their sources. This paper describes a new methodology for creating a generative AI chatbot that answers eye care-related questions, uses only verified ophthalmology textbooks as data, and cites its sources.

SETTING: Yale School of Medicine Department of Ophthalmology and Visual Science.

DESIGN/METHODS: Aeyeconsult, an ophthalmology chatbot, was developed using GPT-4 (the LLM used to power the publicly available chatbot ChatGPT-4), LangChain, and Pinecone. Ophthalmology textbooks were processed into embeddings and stored in Pinecone. User queries were similarly converted to embeddings and compared with the stored embeddings, and GPT-4 generated the responses. The interface was adapted from public code. Both Aeyeconsult and ChatGPT-4 were tested on the same 260 questions from OphthoQuestions.com, with the first response from each system recorded as its answer.

RESULTS: Aeyeconsult outperformed ChatGPT-4 on the OKAP dataset, with 83.4% correct answers compared to 69.2% (p = 0.0118). Aeyeconsult also had fewer instances of no answer and of multiple answers. Both systems performed best in General Medicine, with Aeyeconsult achieving 96.2% accuracy. Aeyeconsult's weakest performance was in Clinical Optics at 68.1%, but it still outperformed ChatGPT-4 in this category (45.5%).

CONCLUSION: LLMs may be useful in answering ophthalmology questions, but their trustworthiness and accuracy are limited by training on unverified internet data and by the lack of source citation. We used a new methodology, with verified ophthalmology textbooks as source material and citations provided, to mitigate these issues, resulting in a chatbot more accurate than ChatGPT-4 in answering OKAP-style questions.
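
The retrieval step outlined under DESIGN/METHODS (embed the source text, embed the query, compare, then pass the closest passages to the LLM with their sources) might look roughly like the sketch below. This is an illustration under stated assumptions, not the authors' implementation: it does not call LangChain, Pinecone, or GPT-4. The `embed` function is a toy bag-of-words stand-in for a real embedding model, the in-memory `INDEX` stands in for a vector store, and the passages, source labels, and function names are all hypothetical placeholders.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical placeholder passages; NOT taken from the textbooks used in the study.
PASSAGES = [
    {"source": "Textbook A, ch. 3",
     "text": "Glaucoma damages the optic nerve and is often associated "
             "with elevated intraocular pressure."},
    {"source": "Textbook B, ch. 7",
     "text": "A convex lens converges parallel light rays to a focal point."},
    {"source": "Textbook C, ch. 1",
     "text": "Diabetic retinopathy is a retinal complication of diabetes mellitus."},
]

# In-memory stand-in for a vector store: each passage paired with its embedding.
INDEX = [(passage, embed(passage["text"])) for passage in PASSAGES]


def retrieve(query, k=2):
    """Return the k passages whose embeddings are most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(INDEX, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [passage for passage, _ in ranked[:k]]


def build_prompt(query, k=2):
    """Assemble an LLM prompt that carries each retrieved passage and its source."""
    hits = retrieve(query, k)
    context = "\n".join(
        f"[{i}] ({p['source']}) {p['text']}" for i, p in enumerate(hits, 1)
    )
    return (
        "Answer using only the sources below and cite them by number.\n"
        f"{context}\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    print(build_prompt("What pressure is associated with glaucoma?", k=1))
```

Because each retrieved passage travels into the prompt together with its source label, the generated answer can cite where its supporting text came from, which is the property the abstract emphasizes.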

Subject(s)

Internet, Ophthalmology, Schools, Software
