Search | Portal Regional da BVS

Sample Size Considerations for Fine-Tuning Large Language Models for Named Entity Recognition Tasks: Methodological Study.

Majdik, Zoltan P; Graham, S Scott; Shiva Edward, Jade C; Rodriguez, Sabrina N; Karnes, Martha S; Jensen, Jared T; Barbour, Joshua B; Rousseau, Justin F.

JMIR AI ; 3: e52095, 2024 May 16.

Article in English | MEDLINE | ID: mdl-38875593

ABSTRACT

BACKGROUND: Large language models (LLMs) have the potential to support promising new applications in health informatics. However, practical data on sample size considerations for fine-tuning LLMs to perform specific tasks in biomedical and health policy contexts are lacking. OBJECTIVE: This study aims to evaluate sample size and sample selection techniques for fine-tuning LLMs to support improved named entity recognition (NER) for a custom data set of conflicts of interest disclosure statements. METHODS: A random sample of 200 disclosure statements was prepared for annotation. All "PERSON" and "ORG" entities were identified by each of the 2 raters, and once appropriate agreement was established, the annotators independently annotated an additional 290 disclosure statements. From the 490 annotated documents, 2500 stratified random samples in different size ranges were drawn. The 2500 training set subsamples were used to fine-tune a selection of language models across 2 model architectures (Bidirectional Encoder Representations from Transformers [BERT] and Generative Pre-trained Transformer [GPT]) for improved NER, and multiple regression was used to assess the relationship between sample size (sentences), entity density (entities per sentence [EPS]), and trained model performance (F1-score). Additionally, single-predictor threshold regression models were used to evaluate the possibility of diminishing marginal returns from increased sample size or entity density. RESULTS: Fine-tuned models ranged in topline NER performance from F1-score=0.79 to F1-score=0.96 across architectures. Two-predictor multiple linear regression models were statistically significant with multiple R² ranging from 0.6057 to 0.7896 (all P<.001). EPS and the number of sentences were significant predictors of F1-scores in all cases (P<.001), except for the GPT-2_large model, where EPS was not a significant predictor (P=.184). Model thresholds indicate points of diminishing marginal return from increased training data set sample size measured by the number of sentences, with point estimates ranging from 439 sentences for RoBERTa_large to 527 sentences for GPT-2_large. Likewise, the threshold regression models indicate a diminishing marginal return for EPS with point estimates between 1.36 and 1.38. CONCLUSIONS: Relatively modest sample sizes can be used to fine-tune LLMs for NER tasks applied to biomedical text, and training data entity density should representatively approximate entity density in production data. Training data quality and a model architecture's intended use (text generation vs text processing or classification) may be as important as, or more important than, training data volume and model parameter size.
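The two analyses named in the abstract, a two-predictor multiple linear regression and a single-predictor threshold regression, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the data are synthetic (not the study's), the function names `fit_ols` and `fit_threshold` are hypothetical, and the threshold model is taken to be a continuous two-segment ("hinge") fit with a grid-searched breakpoint, which may differ from the estimator the authors used.

```python
import numpy as np


def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns (coefficients, R^2)."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return beta, 1.0 - ss_res / ss_tot


def fit_threshold(x, y, candidates):
    """Grid-search one breakpoint t for y = b0 + b1*x + b2*max(x - t, 0).

    Returns (best_t, coefficients, sum of squared errors).
    """
    best = None
    for t in candidates:
        A = np.column_stack([np.ones(len(x)), x, np.maximum(x - t, 0.0)])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        sse = float(((y - A @ beta) ** 2).sum())
        if best is None or sse < best[2]:
            best = (float(t), beta, sse)
    return best


# Synthetic, illustrative data only: F1 rises with sentence count up to an
# assumed plateau at 450 sentences, plus a small entity-density (EPS) effect.
rng = np.random.default_rng(0)
n = 500
sentences = rng.uniform(50, 1000, n)
eps = rng.uniform(0.8, 2.0, n)
f1 = (0.60 + 0.0006 * np.minimum(sentences, 450.0) + 0.02 * eps
      + rng.normal(0, 0.01, n))

# Two-predictor multiple linear regression: F1 ~ sentences + EPS.
beta, r2 = fit_ols(np.column_stack([sentences, eps]), f1)

# Single-predictor threshold regression on sentence count.
t_hat, t_beta, _ = fit_threshold(sentences, f1, np.arange(100, 901, 10))

print(f"R^2 = {r2:.3f}, estimated threshold = {t_hat:.0f} sentences")
```

In this sketch, a negative coefficient on the hinge term (`t_beta[2]`) is what signals a diminishing marginal return beyond the estimated threshold.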

A Content Analysis of Self-Reported Financial Relationships in Biomedical Research.

Graham, S Scott; Sharma, Nandini; Karnes, Martha S; Majdik, Zoltan P; Barbour, Joshua B; Rousseau, Justin F.

AJOB Empir Bioeth ; 14(2): 91-98, 2023.

Article in English | MEDLINE | ID: mdl-36576202

ABSTRACT

INTRODUCTION: Financial conflicts of interest (fCOI) present well-documented risks to the integrity of biomedical research. However, few studies differentiate among fCOI types in their analyses, and those that do tend to use preexisting taxonomies for fCOI identification. Research on fCOI would benefit from an empirically derived taxonomy of self-reported fCOI and data on fCOI type and payor prevalence. METHODS: We conducted a content analysis of 6,165 individual self-reported relationships from COI statements distributed across 378 articles indexed in PubMed. Two coders used an iterative coding process to identify and classify individual fCOI types and payors. Inter-rater reliability was κ = 0.935 for fCOI type and κ = 0.884 for payor identification. RESULTS: Our analysis identified 21 fCOI types, 9 of which occurred at prevalences greater than 1%. These included research funding (24.8%), speaking fees (20.8%), consulting fees (18.8%), advisory relationships (11%), industry employment (7.6%), unspecified fees (4.8%), travel fees (3.2%), stock holdings (3.1%), and patent ownership (1%). Reported fCOI were held with 1,077 unique payors, 22 of which were present in more than 1% of financial relationships. The ten most common payors included Pfizer (4%), Novartis (3.9%), MSD (3.8%), Bristol Myers Squibb (3.2%), AstraZeneca (3.1%), GSK (3%), Boehringer Ingelheim (2.9%), Roche (2.8%), Eli Lilly (2.5%), and AbbVie (2.4%). CONCLUSIONS: These results provide novel multi-domain prevalence data on self-reported fCOI and payors in biomedical research. As such, they have the potential to catalyze future research that can assess the differential effects of various types of fCOI. Specifically, the data suggest that comparative analyses of the effects of different fCOI types are needed and that special attention should be paid to the diversity of payor types for research relationships.
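The inter-rater reliability figures above are reported as κ. The abstract does not say which kappa statistic was used; assuming Cohen's kappa for two coders, a minimal Python sketch of the calculation looks like this. The function name `cohens_kappa` and the toy labels are hypothetical and are not drawn from the study's data.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters who each labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty item set")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Toy example with made-up fCOI type codes from two hypothetical coders.
coder_a = ["research", "speaking", "consulting", "research", "speaking", "research"]
coder_b = ["research", "speaking", "consulting", "research", "consulting", "research"]
kappa = cohens_kappa(coder_a, coder_b)
print(f"kappa = {kappa:.3f}")
```

Kappa discounts the agreement expected by chance, which is why it is usually lower than raw percent agreement for the same pair of coders.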

Subjects

Biomedical Research, Humans, Self Report, Reproducibility of Results, Conflict of Interest, Industry

Evidence for stratified conflicts of interest policies in research contexts: a methodological review.

Graham, S Scott; Karnes, Martha S; Jensen, Jared T; Sharma, Nandini; Barbour, Joshua B; Majdik, Zoltan P; Rousseau, Justin F.

BMJ Open ; 12(9): e063501, 2022 Sep 19.

Article in English | MEDLINE | ID: mdl-36123074

ABSTRACT

OBJECTIVES: The purpose of this study was to conduct a methodological review of research on the effects of conflicts of interest (COIs) in research contexts. DESIGN: Methodological review. DATA SOURCES: Ovid. ELIGIBILITY CRITERIA: Studies published between 1986 and 2021 conducting quantitative assessments of relationships between industry funding or COI and four target outcomes: positive study results, methodological biases, reporting quality and results-conclusions concordance. DATA EXTRACTION AND SYNTHESIS: We assessed key facets of study design: our primary analysis identified whether studies stratified industry funding or COI variables by magnitude (ie, number of COI or disbursement amount), type (employment, travel fees, speaking fees) or if they assessed dichotomous variables (ie, conflict present or absent). Secondary analyses focused on target outcomes and available effect measures. RESULTS: Of the 167 articles included in this study, a substantial majority (98.2%) evaluated the effects of industry sponsorship. None evaluated associations between funding magnitude and outcomes of interest. Seven studies (4.3%) stratified industry funding based on the mechanism of disbursement or funder relationship to product (manufacturer or competitor). A fifth of the articles (19.8%) assessed the effects of author COI on target outcomes. None evaluated COI magnitude, and three studies (9.1%) stratified COI by disbursement type and/or reporting practices. Participation of an industry-employed author showed the most consistent effect on favourability of results across studies. CONCLUSIONS: Substantial evidence demonstrates that industry funding and COI can bias biomedical research. Evidence-based policies are essential for mitigating the risks associated with COI. Although most policies stratify guidelines for managing COI, differentiating COIs based on the type of relationship or monetary value, this review shows that the available research has generally not been designed to assess the differential risks of COI types or magnitudes. Targeted research is necessary to establish an evidence base that can effectively inform policy to manage COI.

Subjects

Biomedical Research, Conflict of Interest, Disclosure, Humans, Industry, Policy
