ABSTRACT
Background: Large language models (LLMs) allow people to engage in direct conversations about their health. The accuracy and readability of the answers that ChatGPT, the best-known LLM, gives about Essential Tremor (ET), one of the most common movement disorders, have not yet been evaluated. Methods: Answers given by ChatGPT to 10 questions about ET were rated by 5 professionals and 15 laypeople on a scale from 1 (poor) to 5 (excellent) for clarity, relevance, accuracy (professionals only), comprehensiveness, and overall value of the response. We further calculated the readability of the answers. Results: ChatGPT's answers received relatively positive evaluations from both groups, with median scores between 4 and 5 regardless of question type. However, inter-rater agreement was only moderate, especially among professionals. Moreover, readability was poor for all examined answers. Discussion: ChatGPT provided relatively accurate and relevant answers, with some variability in the professionals' judgments, suggesting that raters' degree of literacy about ET influenced the ratings and, indirectly, that the quality of information provided in clinical practice is also variable. Moreover, the readability of the answers provided by ChatGPT was poor. LLMs will likely play a significant role in the future; health-related content generated by these tools should therefore be monitored.
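The abstract reports calculating the readability of ChatGPT's answers but does not name the metric used. One widely used choice for health-information studies is the Flesch Reading Ease score; the sketch below is an illustration of that formula with a crude syllable counter, not the study's actual procedure.

```python
import re

def count_syllables(word: str) -> int:
    """Rough syllable count: runs of vowels, minus a silent trailing 'e'."""
    word = word.lower()
    groups = re.findall(r"[aeiouy]+", word)
    n = len(groups)
    if word.endswith("e") and n > 1:
        n -= 1
    return max(n, 1)  # every word has at least one syllable

def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease:
    206.835 - 1.015 * (words/sentences) - 84.6 * (syllables/words).
    Higher scores mean easier text; ~60-70 is 'plain English'."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))
```

With this kind of metric, long sentences and polysyllabic medical terminology drive the score down, which is consistent with the abstract's finding that LLM-generated health answers read poorly for a lay audience.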
Subject(s)
Comprehension, Essential Tremor, Essential Tremor/diagnosis, Humans, Female, Male, Middle Aged, Aged, Adult, Health Literacy

ABSTRACT
BACKGROUND: No existing tool measures digital inclusion in clinical populations suitable for telemedicine. We developed the "Digital Inclusion Questionnaire" (DIQUEST) to estimate access and skills in Parkinson's Disease (PD) patients and verified its properties in a pilot study. METHODS: Thirty PD patients completed the initial version of the DIQUEST along with the Mobile Device Proficiency Questionnaire (MDPQ) and a practical computer task. A Principal Components Analysis (PCA) was conducted to define the DIQUEST factor structure and remove less informative items. We used Cronbach's α to measure internal consistency and Spearman's correlation to determine convergent and predictive validity against the MDPQ and the practical task, respectively. RESULTS: The final version of the DIQUEST consisted of 20 items clustering in five components: "advanced skills," "navigation skills," "basic skills/knowledge," "physical access," and "economical access." All components showed high reliability (α > 0.75), as did the entire questionnaire (α = 0.94). Correlation analysis demonstrated high convergent (rho = 0.911; p < 0.001) and predictive (rho = 0.807; p < 0.001) validity. CONCLUSIONS: We have presented the development of the DIQUEST as a screening tool to assess the level of digital inclusion, particularly addressing the access and skills domains. Future studies are needed to validate it beyond PD.
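The reliability and validity statistics reported above (Cronbach's α and Spearman's rho) can be illustrated with a minimal sketch on synthetic data; the DIQUEST item scores themselves are not available, so the arrays below are hypothetical and the tie-handling in the rank step is simplified.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, n_items) score matrix:
    alpha = k/(k-1) * (1 - sum(item variances) / variance(total score))."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)

def spearman_rho(x: np.ndarray, y: np.ndarray) -> float:
    """Spearman correlation: Pearson correlation of the ranks.
    Note: ties are broken arbitrarily here; a full implementation
    would assign tied values their average rank."""
    rx = np.argsort(np.argsort(x)).astype(float)
    ry = np.argsort(np.argsort(y)).astype(float)
    return np.corrcoef(rx, ry)[0, 1]

# Hypothetical 5-respondent, 3-item matrix with perfectly consistent items.
scores = np.array([[1, 2, 1], [2, 3, 2], [3, 4, 3], [4, 5, 4], [5, 6, 5]], dtype=float)
```

In the study's terms, α > 0.75 per component indicates that items within a component measure the same construct, while a high Spearman rho between DIQUEST totals and MDPQ scores (convergent) or practical-task performance (predictive) supports validity.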