Distilling large language models for matching patients to clinical trials.
J Am Med Inform Assoc
; 31(9): 1953-1963, 2024 Sep 01.
Article
em En
| MEDLINE
| ID: mdl-38641416
ABSTRACT
OBJECTIVE:
The objective of this study is to systematically examine the efficacy of both proprietary (GPT-3.5, GPT-4) and open-source large language models (LLMs) (LLAMA 7B, 13B, 70B) in the context of matching patients to clinical trials in healthcare. MATERIALS ANDMETHODS:
The study employs a multifaceted evaluation framework, incorporating extensive automated and human-centric assessments along with a detailed error analysis for each model, and assesses LLMs' capabilities in analyzing patient eligibility against clinical trial's inclusion and exclusion criteria. To improve the adaptability of open-source LLMs, a specialized synthetic dataset was created using GPT-4, facilitating effective fine-tuning under constrained data conditions.RESULTS:
The findings indicate that open-source LLMs, when fine-tuned on this limited and synthetic dataset, achieve performance parity with their proprietary counterparts, such as GPT-3.5.DISCUSSION:
This study highlights the recent success of LLMs in the high-stakes domain of healthcare, specifically in patient-trial matching. The research demonstrates the potential of open-source models to match the performance of proprietary models when fine-tuned appropriately, addressing challenges like cost, privacy, and reproducibility concerns associated with closed-source proprietary LLMs.CONCLUSION:
The study underscores the opportunity for open-source LLMs in patient-trial matching. To encourage further research and applications in this field, the annotated evaluation dataset and the fine-tuned LLM, Trial-LLAMA, are released for public use.Palavras-chave
Texto completo:
1
Coleções:
01-internacional
Base de dados:
MEDLINE
Assunto principal:
Ensaios Clínicos como Assunto
/
Seleção de Pacientes
Limite:
Humans
Idioma:
En
Revista:
J Am Med Inform Assoc
/
J. am. med. inform. assoc
/
Journal of the american medical informatics association
Assunto da revista:
INFORMATICA MEDICA
Ano de publicação:
2024
Tipo de documento:
Article
País de afiliação:
Estados Unidos
País de publicação:
Reino Unido