1.
NPJ Digit Med ; 7(1): 131, 2024 May 18.
Article in English | MEDLINE | ID: mdl-38762669

ABSTRACT

Subjectivity and ambiguity in visual field classification limit the accuracy and reliability of glaucoma diagnosis, prognostication, and management decisions. Standardised rules for classifying glaucomatous visual field defects exist, but these are labour-intensive and therefore impractical for day-to-day clinical work. Here, a web application, the Glaucoma Field Defect Classifier (GFDC), for automatic application of the Hodapp-Parrish-Anderson criteria, is presented and validated in a cross-sectional study. GFDC exhibits perfect accuracy in classifying mild, moderate, and severe glaucomatous field defects. GFDC may thereby improve the accuracy and fairness of clinical decision-making in glaucoma. The application and its source code are freely hosted online for clinicians and researchers to use with glaucoma patients.
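
For orientation, the staging that GFDC automates rests on the Hodapp-Parrish-Anderson mean deviation (MD) cutoffs. The Python sketch below is a simplified illustration using the MD criterion alone; the full criteria also count depressed pattern-deviation points and central involvement, and the function name is illustrative rather than taken from the GFDC source code.

def hpa_stage(mean_deviation_db: float) -> str:
    """Simplified Hodapp-Parrish-Anderson staging from mean deviation alone."""
    if mean_deviation_db > -6.0:
        return "mild"      # MD better than -6 dB
    if mean_deviation_db > -12.0:
        return "moderate"  # MD between -6 and -12 dB
    return "severe"        # MD worse than -12 dB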

2.
PLOS Digit Health ; 3(4): e0000341, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38630683

ABSTRACT

Large language models (LLMs) underlie remarkable recent advances in natural language processing, and they are beginning to be applied in clinical contexts. We aimed to evaluate the clinical potential of state-of-the-art LLMs in ophthalmology using a more robust benchmark than raw examination scores. We trialled GPT-3.5 and GPT-4 on 347 ophthalmology questions before GPT-3.5, GPT-4, PaLM 2, LLaMA, expert ophthalmologists, and doctors in training were trialled on a mock examination of 87 questions. Performance was analysed with respect to question subject and type (first-order recall and higher-order reasoning). Masked ophthalmologists graded the accuracy, relevance, and overall preference of GPT-3.5 and GPT-4 responses to the same questions. The performance of GPT-4 (69%) was superior to GPT-3.5 (48%), LLaMA (32%), and PaLM 2 (56%). GPT-4 compared favourably with expert ophthalmologists (median 76%, range 64-90%), ophthalmology trainees (median 59%, range 57-63%), and unspecialised junior doctors (median 43%, range 41-44%). Low agreement between LLMs and doctors reflected idiosyncratic differences in knowledge and reasoning, with overall consistency across subjects and types (p>0.05). All ophthalmologists preferred GPT-4 responses over GPT-3.5 and rated the accuracy and relevance of GPT-4 as higher (p<0.05). LLMs are approaching expert-level knowledge and reasoning skills in ophthalmology. In view of their comparable or superior performance to trainee-grade ophthalmologists and unspecialised junior doctors, state-of-the-art LLMs such as GPT-4 may provide useful medical advice and assistance where access to expert ophthalmologists is limited. Clinical benchmarks provide useful assays of LLM capabilities in healthcare before clinical trials can be designed and conducted.
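
The headline accuracies here come down to grading each model's multiple-choice answers against a key and stratifying by question type. A minimal Python sketch, with illustrative names and labels rather than the study's actual analysis code:

from collections import Counter

def accuracy_by_type(answers, key, qtype):
    """Fraction of correct answers per question type, e.g. first-order
    recall vs higher-order reasoning. Each argument maps a question id
    to the chosen option, correct option, and type label respectively."""
    correct, total = Counter(), Counter()
    for qid, chosen in answers.items():
        total[qtype[qid]] += 1
        correct[qtype[qid]] += int(chosen == key[qid])
    return {t: correct[t] / total[t] for t in total}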

3.
JMIR Form Res ; 8: e51770, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38271088

ABSTRACT

BACKGROUND: Approximately 80% of primary school children in the United States and Europe experience glue ear, which may impair hearing at a critical time for speech acquisition and social development. A web-based app, DigiBel, has been developed primarily to identify individuals with conductive hearing impairment who may benefit from the temporary use of bone-conduction assistive technology in the community. OBJECTIVE: This preliminary study aims to determine the screening accuracy and usability of DigiBel self-assessed air-conduction (AC) pure tone audiometry in adult volunteers with simulated hearing impairment, prior to formal clinical validation. METHODS: Healthy adults, each with 1 ear plugged, underwent automated AC pure tone audiometry (reference test) and DigiBel audiometry in quiet community settings. Threshold measurements were compared across 6 tone frequencies, and DigiBel test-retest reliability was calculated. The accuracy of DigiBel for detecting more than 20 dB of hearing impairment was assessed. A total of 30 adults (30 unplugged ears and 30 plugged ears) completed both audiometry tests. RESULTS: DigiBel had 100% sensitivity (95% CI 87.23-100) and 72.73% specificity (95% CI 54.48-86.70) in detecting hearing impairment. Mean threshold bias was nonsignificant except at 4000 and 8000 Hz, where a small but significant overestimation of thresholds was identified. All 24 participants who completed feedback rated the DigiBel test as good or excellent, and 21 (88%) agreed or strongly agreed that they would be able to do the test at home without help. CONCLUSIONS: This study supports the potential use of DigiBel as a screening tool for hearing impairment. The findings will be used to improve the software further prior to undertaking a formal clinical trial of AC and bone-conduction audiometry in individuals with suspected conductive hearing impairment.
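
Sensitivity and specificity figures of this kind can be reproduced from a two-by-two table of screening outcomes. A minimal sketch assuming the statsmodels package; the counts are hypothetical placeholders, not the study's raw data:

from statsmodels.stats.proportion import proportion_confint

def screening_accuracy(tp, fn, tn, fp):
    """Sensitivity and specificity with exact (Clopper-Pearson) 95% CIs."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    sens_ci = proportion_confint(tp, tp + fn, alpha=0.05, method="beta")
    spec_ci = proportion_confint(tn, tn + fp, alpha=0.05, method="beta")
    return (sens, sens_ci), (spec, spec_ci)

# Hypothetical counts: every plugged ear detected, with some false positives.
print(screening_accuracy(tp=30, fn=0, tn=22, fp=8))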

4.
Biomed J ; : 100679, 2023 Dec 02.
Article in English | MEDLINE | ID: mdl-38048990

ABSTRACT

The Metaverse has gained wide attention as the application interface for the next generation of the Internet. The potential of the Metaverse is growing as Web 3.0 development and adoption continue to advance medicine and healthcare. We define the next generation of the interoperable healthcare ecosystem in the Metaverse. We examine the existing literature regarding the Metaverse and explain the technology framework for delivering an immersive experience, along with a technical comparison of legacy and novel Metaverse platforms that are publicly released and in active use. The potential applications of different features of the Metaverse, including avatar-based meetings, immersive simulations, and social interactions, are examined for different roles, from patients to healthcare providers and healthcare organizations. Present challenges in the development of the Metaverse healthcare ecosystem are discussed, along with potential solutions, including capabilities requiring technological innovation, use cases requiring regulatory supervision, and sound governance. This proposed concept and framework of the Metaverse could potentially redefine the traditional healthcare system and enhance digital transformation in healthcare. As with AI technology at the beginning of this decade, real-world development and implementation of these capabilities are relatively nascent. Further pragmatic research is needed for the development of an interoperable healthcare ecosystem in the Metaverse.

5.
J Med Internet Res ; 25: e51603, 2023 12 05.
Article in English | MEDLINE | ID: mdl-38051572

ABSTRACT

Large language models (LLMs) are exhibiting remarkable performance in clinical contexts, with exemplar results ranging from expert-level attainment in medical examination questions to greater accuracy and relevance than real doctors when responding to patient queries on social media. The deployment of LLMs in conventional health care settings is yet to be reported, and there remains an open question as to what evidence should be required before such deployment is warranted. Early validation studies use unvalidated surrogate variables to represent clinical aptitude, and it may be necessary to conduct prospective randomized controlled trials to justify the use of an LLM for clinical advice or assistance, as potential pitfalls and pain points cannot be exhaustively predicted. This viewpoint argues that as LLMs continue to revolutionize the field, there is an opportunity to improve the rigor of artificial intelligence (AI) research to reward innovation, conferring real benefits to real patients.


Subject(s)
Aptitude , Artificial Intelligence , Clinical Competence , Humans , Language , Pain , Prospective Studies
6.
J Med Internet Res ; 25: e49949, 2023 10 12.
Article in English | MEDLINE | ID: mdl-37824185

ABSTRACT

Deep learning-based clinical imaging analysis underlies diagnostic artificial intelligence (AI) models, which can match or even exceed the performance of clinical experts and have the potential to revolutionize clinical practice. A wide variety of automated machine learning (autoML) platforms lower the technical barrier to entry to deep learning, extending AI capabilities to clinicians with limited technical expertise, as do autonomous foundation models such as multimodal large language models. Here, we provide a technical overview of autoML with descriptions of how autoML may be applied in education, research, and clinical practice. Each stage of the process of conducting an autoML project is outlined, with an emphasis on ethical and technical best practices. Specifically, data acquisition, data partitioning, model training, model validation, analysis, and model deployment are considered. The strengths and limitations of available code-free, code-minimal, and code-intensive autoML platforms are appraised. AutoML has great potential to democratize AI in medicine, improving AI literacy by enabling "hands-on" education. AutoML may serve as a useful adjunct in research by facilitating rapid testing and benchmarking before significant computational resources are committed. AutoML may also be applied in clinical contexts, provided regulatory requirements are met. By abstracting away the arduous aspects of AI engineering, autoML promotes prioritization of data set curation, supporting the transition from conventional model-driven approaches to data-centric development. To fulfill its potential, clinicians must be educated on how to apply these technologies ethically, rigorously, and effectively; this tutorial represents a comprehensive summary of relevant considerations.
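
Of the stages listed, data partitioning is where leakage most often creeps in; splitting at the patient level keeps images from one patient out of both partitions at once. A minimal sketch assuming scikit-learn, with illustrative names:

from sklearn.model_selection import GroupShuffleSplit

def patient_level_split(images, labels, patient_ids, test_size=0.2, seed=0):
    """Hold out a test set at the patient level so that no patient
    contributes data to both partitions (leakage inflates metrics)."""
    splitter = GroupShuffleSplit(n_splits=1, test_size=test_size,
                                 random_state=seed)
    train_idx, test_idx = next(splitter.split(images, labels,
                                              groups=patient_ids))
    return train_idx, test_idx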


Subject(s)
Artificial Intelligence , Machine Learning , Humans , Image Processing, Computer-Assisted , Educational Status , Benchmarking
7.
Ophthalmol Sci ; 3(4): 100394, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37885755

ABSTRACT

The rapid progress of large language models (LLMs) driving generative artificial intelligence applications heralds promising opportunities in health care. We conducted a review, up to April 2023, of Google Scholar, Embase, MEDLINE, and Scopus using the following terms: "large language models," "generative artificial intelligence," "ophthalmology," "ChatGPT," and "eye," selecting records based on their relevance to this review. From a clinical viewpoint specific to ophthalmologists, we explore, from the perspectives of different stakeholders, including patients, physicians, and policymakers, the potential LLM applications in education, research, and clinical domains specific to ophthalmology. We also highlight the foreseeable challenges of LLM implementation in clinical practice, including concerns about accuracy, interpretability, perpetuation of bias, and data security. As LLMs continue to mature, it is essential for stakeholders to jointly establish standards for best practice to safeguard patient safety. Financial Disclosures: Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.

8.
Cell Rep Med ; 4(10): 101230, 2023 10 17.
Article in English | MEDLINE | ID: mdl-37852174

ABSTRACT

Current and future healthcare professionals are generally not trained to cope with the proliferation of artificial intelligence (AI) technology in healthcare. To design a curriculum that caters to variable baseline knowledge and skills, clinicians may be conceptualized as "consumers", "translators", or "developers". The changes required of medical education because of AI innovation are linked to those brought about by evidence-based medicine (EBM). We outline a core curriculum for AI education of future consumers, translators, and developers, emphasizing the links between AI and EBM, with suggestions for how teaching may be integrated into existing curricula. We consider the key barriers to implementation of AI in the medical curriculum: time, resources, variable interest, and knowledge retention. By improving AI literacy rates and fostering a translator- and developer-enriched workforce, innovation may be accelerated for the benefit of patients and practitioners.


Subject(s)
Artificial Intelligence , Education, Medical , Humans , Curriculum , Evidence-Based Medicine/education
9.
Nat Med ; 29(8): 1930-1940, 2023 08.
Article in English | MEDLINE | ID: mdl-37460753

ABSTRACT

Large language models (LLMs) can respond to free-text queries without being specifically trained in the task in question, causing excitement and concern about their use in healthcare settings. ChatGPT is a generative artificial intelligence (AI) chatbot produced through sophisticated fine-tuning of an LLM, and other tools are emerging through similar developmental processes. Here we outline how LLM applications such as ChatGPT are developed, and we discuss how they are being leveraged in clinical settings. We consider the strengths and limitations of LLMs and their potential to improve the efficiency and effectiveness of clinical, educational and research work in medicine. LLM chatbots have already been deployed in a range of biomedical contexts, with impressive but mixed results. This review acts as a primer for interested clinicians, who will determine if and how LLM technology is used in healthcare for the benefit of patients and practitioners.


Subject(s)
Artificial Intelligence , Medicine , Humans , Language , Software , Technology
10.
PLoS One ; 18(6): e0281847, 2023.
Article in English | MEDLINE | ID: mdl-37347757

ABSTRACT

BACKGROUND: Remote self-administered visual acuity (VA) tests have the potential to allow patients and non-specialists to assess vision without eye health professional input. Validation in pragmatic trials is necessary to demonstrate the accuracy and reliability of tests in relevant settings to justify deployment. Here, published pragmatic trials of these tests were synthesised to summarise the effectiveness of available options and appraise the quality of their supporting evidence. METHODS: A systematic review was undertaken in accordance with a preregistered protocol (CRD42022385045). The Cochrane Library, Embase, MEDLINE, and Scopus were searched. Screening was conducted according to the following criteria: (1) English language; (2) primary research article; (3) visual acuity test conducted out of eye clinic; (4) no clinical administration of remote test; (5) accuracy or reliability of remote test analysed. There were no restrictions on trial participants. Quality assessment was conducted with QUADAS-2. RESULTS: Of 1227 identified reports, 10 studies were ultimately included. One study was at high risk of bias and two studies exhibited concerning features of bias; all studies were applicable. Three trials (of DigiVis, iSight Professional, and Peek Acuity) from two studies suggested that the accuracy of the remote tests is comparable to clinical assessment. All other trials exhibited inferior accuracy, including conflicting results from a pooled study of iSight Professional and Peek Acuity. Two studies evaluated test-retest agreement; one trial provided evidence that DigiVis is as reliable as clinical assessment. The three most accurate tests required access to digital devices. Reporting was inconsistent and often incomplete, particularly with regard to describing methods and conducting statistical analysis. CONCLUSIONS: Remote self-administered VA tests appear promising, but further pragmatic trials are indicated to justify deployment in carefully defined contexts to facilitate patient- or non-specialist-led assessment. Deployment could augment teleophthalmology, non-specialist eye assessment, pre-consultation triage, and autonomous long-term monitoring of vision.


Subject(s)
Ophthalmology , Telemedicine , Humans , Reproducibility of Results , Visual Acuity
12.
JMIR Med Educ ; 9: e46599, 2023 Apr 21.
Article in English | MEDLINE | ID: mdl-37083633

ABSTRACT

BACKGROUND: Large language models exhibiting human-level performance in specialized tasks are emerging; examples include Generative Pretrained Transformer 3.5, which underlies the processing of ChatGPT. Rigorous trials are required to understand the capabilities of emerging technology, so that innovation can be directed to benefit patients and practitioners. OBJECTIVE: Here, we evaluated the strengths and weaknesses of ChatGPT in primary care using the Membership of the Royal College of General Practitioners Applied Knowledge Test (AKT) as a medium. METHODS: AKT questions were sourced from a web-based question bank and 2 AKT practice papers. In total, 674 unique AKT questions were inputted to ChatGPT, with the model's answers recorded and compared to correct answers provided by the Royal College of General Practitioners. Each question was inputted twice in separate ChatGPT sessions, with answers on repeated trials compared to gauge consistency. Subject difficulty was gauged by referring to examiners' reports from 2018 to 2022. Novel explanations from ChatGPT-defined as information provided that was not inputted within the question or multiple answer choices-were recorded. Performance was analyzed with respect to subject, difficulty, question source, and novel model outputs to explore ChatGPT's strengths and weaknesses. RESULTS: Average overall performance of ChatGPT was 60.17%, which is below the mean passing mark in the last 2 years (70.42%). Accuracy differed between sources (P=.04 and .06). ChatGPT's performance varied with subject category (P=.02 and .02), but variation did not correlate with difficulty (Spearman ρ=-0.241 and -0.238; P=.19 and .20). The proclivity of ChatGPT to provide novel explanations did not affect accuracy (P>.99 and .23). CONCLUSIONS: Large language models are approaching human expert-level performance, although further development is required to match the performance of qualified primary care physicians in the AKT. Validated high-performance models may serve as assistants or autonomous clinical tools to ameliorate the general practice workforce crisis.
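
The difficulty analysis reported above is a rank correlation between per-subject accuracy and examiner-reported difficulty. A minimal scipy sketch with invented numbers, not the study's data:

from scipy.stats import spearmanr

# Hypothetical per-subject values: model accuracy vs difficulty rank.
accuracy   = [0.72, 0.55, 0.61, 0.48, 0.66, 0.59]
difficulty = [2, 5, 3, 6, 1, 4]

rho, p = spearmanr(accuracy, difficulty)
print(f"Spearman rho = {rho:.3f}, p = {p:.3f}")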

14.
Eye (Lond) ; 36(10): 2057-2061, 2022 10.
Article in English | MEDLINE | ID: mdl-34462579

ABSTRACT

BACKGROUND/OBJECTIVES: Ophthalmic disorders cause 8% of hospital clinic attendances, the highest of any specialty. The fundamental need for a distance visual acuity (VA) measurement constrains remote consultation. A web application, DigiVis, facilitates self-assessment of VA using two internet-connected devices. This prospective validation study aimed to establish its accuracy, reliability, usability and acceptability. SUBJECTS/METHODS: In total, 120 patients aged 5-87 years (median = 27) self-tested their vision twice using DigiVis in addition to their standard clinical assessment. Eyes with VA worse than +0.80 logMAR were excluded. Accuracy and test-retest (TRT) variability were compared using Bland-Altman analysis and intraclass correlation coefficients (ICC). Patient feedback was analysed. RESULTS: Bias between VA tests was insignificant at -0.001 (95% CI -0.017 to 0.015) logMAR. The upper limit of agreement (LOA) was 0.173 (95% CI 0.146 to 0.201) and the lower LOA -0.175 (95% CI -0.202 to -0.147) logMAR. The ICC was 0.818 (95% CI 0.748 to 0.869). DigiVis TRT mean bias was similarly insignificant, at 0.001 (95% CI -0.011 to 0.013) logMAR; the upper LOA was 0.124 (95% CI 0.103 to 0.144) and the lower LOA -0.121 (95% CI -0.142 to -0.101) logMAR. The ICC was 0.922 (95% CI 0.887 to 0.946). Overall, 95% of subjects were willing to use DigiVis to monitor vision at home. CONCLUSIONS: Self-tested distance VA using DigiVis is accurate, reliable and well accepted by patients. The app has potential to facilitate home monitoring, triage and remote consultation, but widespread implementation will require integration with NHS databases and secure patient data storage.
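
The bias and limits of agreement quoted above follow the standard Bland-Altman construction: the mean of the paired differences plus or minus 1.96 standard deviations, with a confidence interval around the bias. A minimal NumPy sketch (function name illustrative, not from the DigiVis codebase):

import numpy as np

def bland_altman(test, reference, z=1.96):
    """Bias and limits of agreement between paired logMAR measurements."""
    d = np.asarray(test, float) - np.asarray(reference, float)
    n, bias, sd = d.size, d.mean(), d.std(ddof=1)
    loa = (bias - z * sd, bias + z * sd)            # limits of agreement
    bias_ci = (bias - z * sd / np.sqrt(n),          # 95% CI of the bias
               bias + z * sd / np.sqrt(n))
    return bias, bias_ci, loa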


Subject(s)
Software , Vision Tests , Humans , Reproducibility of Results , Vision, Ocular , Visual Acuity
16.
BMJ Open Ophthalmol ; 6(1): e000801, 2021.
Article in English | MEDLINE | ID: mdl-34651083

ABSTRACT

OBJECTIVE: The difficulty in accurately assessing distance visual acuity (VA) at home limits the usefulness of remote consultation in ophthalmology. A novel web application, DigiVis, enables automated VA self-assessment using standard digital devices. This study aims to compare its accuracy and reliability in children with clinical assessment by a healthcare professional. METHODS AND ANALYSIS: Children aged 4-10 years were recruited from a paediatric ophthalmology service. Those with VA worse than +0.8 logMAR (Logarithm of the Minimum Angle of Resolution) or with cognitive impairment were excluded. Bland-Altman statistics were used to analyse both the accuracy and repeatability of VA self-testing. User feedback was collected by questionnaire. RESULTS: The left eyes of 89 children (median 7 years) were tested. VA self-testing showed a mean bias of 0.023 logMAR, with a limit of agreement (LOA) of ±0.195 logMAR and an intraclass correlation coefficient (ICC) of 0.816. A second test was possible in 80 (90%) children. Test-retest comparison showed a mean bias of 0.010 logMAR, with an LOA of ±0.179 logMAR, an ICC of 0.815 and a repeatability coefficient of 0.012. Overall, 96% of children rated the test as good or excellent, as did 99% of their parents. CONCLUSION: Digital self-testing gave distance VA assessments comparable with clinical testing in children and was well accepted. Since DigiVis self-testing can be performed under direct supervision using medical video consultation software, it may be a useful tool to enable a proportion of paediatric eye clinic attendances to be moved online, reducing time off school and releasing face-to-face clinical capacity for those who need it.
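
The intraclass correlation coefficients reported above can be reproduced from long-format test-retest data. A minimal sketch assuming the pingouin package is available; the values are invented placeholders, not the study's measurements:

import pandas as pd
import pingouin as pg

# Hypothetical test-retest logMAR scores, one row per child per session.
df = pd.DataFrame({
    "child":   [1, 1, 2, 2, 3, 3, 4, 4],
    "session": ["t1", "t2"] * 4,
    "logmar":  [0.10, 0.12, 0.30, 0.28, 0.00, 0.02, 0.20, 0.18],
})

icc = pg.intraclass_corr(data=df, targets="child", raters="session",
                         ratings="logmar")
print(icc[["Type", "ICC", "CI95%"]])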
