1.
Cureus ; 15(11): e48788, 2023 Nov.
Article in English | MEDLINE | ID: mdl-38098921

ABSTRACT

Large language models (LLMs) have broad potential applications in medicine, such as aiding with education, providing reassurance to patients, and supporting clinical decision-making. However, there is a notable gap in understanding their applicability and performance in the surgical domain and how their performance varies across specialties. This paper aims to evaluate the performance of LLMs in answering surgical questions relevant to clinical practice and to assess how this performance varies across different surgical specialties. We used the MedMCQA dataset, a large-scale multiple-choice question-answer (MCQA) dataset consisting of clinical questions across all areas of medicine. We extracted the 23,035 relevant surgical questions and submitted them to two popular LLMs, Generative Pre-trained Transformer (GPT)-3.5 and GPT-4 (OpenAI OpCo, LLC, San Francisco, CA). GPT is a large language model that can generate human-like text by predicting subsequent words in a sentence based on the context of the words that come before it. It is pre-trained on a diverse range of texts and can perform a variety of tasks, such as answering questions, without needing task-specific training. Question-answering accuracy was calculated and compared between the two models and across surgical specialties. GPT-3.5 and GPT-4 achieved accuracies of 53.3% and 64.4%, respectively, on surgical questions, a statistically significant difference in performance. When compared to their performance on the full MedMCQA dataset, the two models performed differently: GPT-4 performed worse on surgical questions than on the dataset as a whole, while GPT-3.5 showed the opposite pattern. Significant variations in accuracy were also observed across different surgical specialties, with strong performances in anatomy, vascular, and paediatric surgery and worse performances in orthopaedics, ENT, and neurosurgery.
Large language models exhibit promising capabilities in addressing surgical questions, although the variability in their performance between specialties cannot be ignored. The lower performance of the latest GPT-4 model on surgical questions relative to questions across all medicine highlights the need for targeted improvements and continuous updates to ensure relevance and accuracy in surgical applications. Further research and continuous monitoring of LLM performance in surgical domains are crucial to fully harnessing their potential and mitigating the risks of misinformation.
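
The evaluation described in this abstract (accuracy per specialty, then a significance check between two models) can be illustrated with a minimal Python sketch. The abstract does not say which statistical test the authors used, so the two-proportion z-test here is an assumption, and the function names and toy records are hypothetical and not taken from the paper.

```python
from collections import defaultdict
from math import erfc, sqrt


def accuracy_by_group(records):
    """Return {group: fraction correct} for (group, predicted, gold) records."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, predicted, gold in records:
        total[group] += 1
        correct[group] += int(predicted == gold)
    return {group: correct[group] / total[group] for group in total}


def two_proportion_z_test(correct_a, correct_b, n):
    """Two-sided z-test comparing two accuracies, each measured on n questions.

    Assumption: the two samples are treated as independent. When both models
    answer the same questions, a paired test such as McNemar's may fit better.
    """
    p_a, p_b = correct_a / n, correct_b / n
    pooled = (correct_a + correct_b) / (2 * n)
    se = sqrt(pooled * (1 - pooled) * 2 / n)
    if se == 0:
        # Both models got everything right or everything wrong: no difference.
        return 0.0, 1.0
    z = (p_b - p_a) / se
    p_value = erfc(abs(z) / sqrt(2))  # two-sided tail of the standard normal
    return z, p_value


# Toy, made-up records: (specialty, model answer, correct answer).
toy_records = [
    ("vascular", "A", "A"),
    ("vascular", "B", "B"),
    ("ENT", "C", "A"),
    ("ENT", "D", "D"),
]
per_specialty = accuracy_by_group(toy_records)
```

With real data, `records` would hold one tuple per question and model, and the counts passed to `two_proportion_z_test` would be the numbers of correct answers for each model on the same question set.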

2.
PeerJ ; 9: e10843, 2021.
Article in English | MEDLINE | ID: mdl-33614289

ABSTRACT

Pangolins, often considered the world's most trafficked wild mammals, have continued to experience rapid declines across Asia and Africa. All eight species are classed as either Vulnerable, Endangered or Critically Endangered by the International Union for Conservation of Nature (IUCN) Red List. Alongside habitat loss, they are threatened mainly by poaching and/or legal hunting to meet the growing consumer demand for their meat and keratinous scales. Species threat assessments rely heavily on changes in species distributions, which are usually expensive and difficult to monitor, especially for rare and cryptic species like pangolins. Furthermore, recent assessments of the threats to pangolins focus on characterising their trade using seizure data, which provide limited insights into the true extent of global pangolin declines. As the consequences of habitat modifications and poaching/hunting on species continue to become apparent, it is crucial that we frequently update our understanding of how species distributions change through time to allow effective identification of geographic regions that are in need of urgent conservation actions. Here we show how georeferencing pangolin specimens from natural history collections can reveal how their distributions are changing over time, by comparing overlap between specimen localities and current area-of-habitat maps derived from IUCN range maps. We found significant correlations in percentage area overlap between species, continent, IUCN Red List status and collection year, but not ecology (terrestrial or arboreal/semi-arboreal). Human population density (widely considered to be an indication of trafficking pressure) and changes in primary forest cover were weakly correlated with percentage overlap.
Our results do not suggest a single mechanism for differences between historical distributions and present-day ranges, but rather show that multiple explanatory factors must be considered when researching pangolin population declines, as variations among species influence range fluctuations. We also demonstrate how natural history collections can provide temporal information on distributions and discuss the limitations of collecting and using historical data.
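
The idea of comparing specimen localities against a present-day habitat map can be illustrated with a simplified Python sketch. The abstract does not describe how percentage overlap was actually computed, so this is only an assumed stand-in: it treats localities as points in planar coordinates, treats the habitat map as a single polygon, and reports the share of points that fall inside it. The function names and the toy coordinates are hypothetical.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon's vertex list."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def percent_overlap(localities, habitat_polygon):
    """Percentage of specimen localities that fall inside the habitat polygon."""
    if not localities:
        raise ValueError("at least one locality is required")
    hits = sum(point_in_polygon(p, habitat_polygon) for p in localities)
    return 100.0 * hits / len(localities)


# Toy, made-up data: a square "habitat" and four specimen points.
habitat = [(0, 0), (4, 0), (4, 4), (0, 4)]
specimens = [(1, 1), (5, 5), (2, 3), (-1, 2)]
overlap = percent_overlap(specimens, habitat)
```

A real analysis would use geographic coordinates, proper map projections and a GIS library rather than this planar point-in-polygon check.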
