Results 1 - 2 of 2
1.
J Am Med Inform Assoc; 31(6): 1404-1410, 2024 May 20.
Article in English | MEDLINE | ID: mdl-38622901

ABSTRACT

OBJECTIVES: To compare the performance of a classifier that leverages language models when trained on synthetic versus authentic clinical notes. MATERIALS AND METHODS: A classifier using language models was developed to identify acute renal failure. Four types of training data were compared: (1) notes from MIMIC-III; and (2, 3, and 4) synthetic notes generated by ChatGPT with lengths of 15 (GPT-15 sentences), 30 (GPT-30 sentences), and 45 (GPT-45 sentences) sentences, respectively. The area under the receiver operating characteristic curve (AUC) was calculated on a test set from MIMIC-III. RESULTS: With RoBERTa, the AUCs were 0.84, 0.80, 0.84, and 0.76 for the MIMIC-III, GPT-15, GPT-30, and GPT-45 sentences training sets, respectively. DISCUSSION: Training language models to detect acute renal failure from clinical notes resulted in similar performance when using synthetic versus authentic training data. CONCLUSION: The use of training data derived from protected health information may not be needed.
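
The AUC reported above can be read as the probability that a randomly chosen positive note receives a higher classifier score than a randomly chosen negative one. The sketch below illustrates that definition only; the function name `auc_from_scores` and the toy labels and scores are assumptions for illustration and are not taken from the study.

```python
def auc_from_scores(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 ground-truth labels (1 = positive class).
    scores: iterable of classifier scores, higher means "more positive".
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): two negative and two positive notes.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc_from_scores(toy_labels, toy_scores)
```

The pairwise form is quadratic in the number of examples, which is fine for a small illustration; a rank-based computation would be preferable for a full test set.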


Subjects
Acute Kidney Injury , Artificial Intelligence , Electronic Health Records , Humans , Acute Kidney Injury/classification , Acute Kidney Injury/diagnosis , ROC Curve , Natural Language Processing , Area Under Curve , Datasets as Topic
2.
Sci Rep; 14(1): 85, 2024 Jan 02.
Article in English | MEDLINE | ID: mdl-38168099

ABSTRACT

The emergence of long COVID during the ongoing COVID-19 pandemic has presented considerable challenges for healthcare professionals and researchers. The task of identifying relevant literature is particularly daunting due to the rapidly evolving scientific landscape, inconsistent definitions, and a lack of standardized nomenclature. This paper proposes a novel solution to this challenge by employing machine learning techniques to classify long COVID literature. However, the scarcity of annotated data for machine learning poses a significant obstacle. To overcome this, we introduce a strategy called medical paraphrasing, which diversifies the training data while maintaining the original content. Additionally, we propose a Data-Reweighting-Based Multi-Level Optimization Framework for Domain Adaptive Paraphrasing, supported by a Meta-Weight-Network (MWN). This innovative approach incorporates feedback from the downstream text classification model to influence the training of the paraphrasing model. During the training process, the framework assigns higher weights to the training examples that contribute more effectively to the downstream task of long COVID text classification. Our findings demonstrate that this method substantially improves the accuracy and efficiency of long COVID literature classification, offering a valuable tool for physicians and researchers navigating this complex and ever-evolving field.
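
The reweighting idea described above, where training examples that help the downstream classifier receive higher weights, can be sketched as a weighted cross-entropy loss. This is only an illustration of per-example weighting under stated assumptions: in the paper the weights come from a Meta-Weight-Network inside a multi-level optimization loop, whereas here they are passed in as plain numbers, and the function name `weighted_cross_entropy` is hypothetical.

```python
import math


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of per-example cross-entropy losses.

    probs:   list of per-example class-probability lists (each sums to 1).
    labels:  list of integer class indices, one per example.
    weights: list of non-negative per-example weights (assumed given here;
             a learned weighting network would supply them in practice).
    """
    if not (len(probs) == len(labels) == len(weights)):
        raise ValueError("probs, labels and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("at least one weight must be positive")

    weighted_sum = 0.0
    for p, y, w in zip(probs, labels, weights):
        # Negative log-likelihood of the true class, scaled by the example weight.
        weighted_sum += w * -math.log(p[y])
    return weighted_sum / total_weight


# Toy batch (hypothetical): the second example is weighted three times as heavily.
toy_probs = [[0.9, 0.1], [0.2, 0.8]]
toy_labels = [0, 1]
toy_loss = weighted_cross_entropy(toy_probs, toy_labels, [1.0, 3.0])
```

With equal weights the function reduces to the ordinary mean cross-entropy, so the weights only change how much each example contributes to the gradient signal.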


Subjects
COVID-19 , Post-Acute COVID-19 Syndrome , Humans , Pandemics , Machine Learning , Health Personnel
...