1.
Article in English | MEDLINE | ID: mdl-39001795

ABSTRACT

OBJECTIVES: Alzheimer's disease (AD) is the most common form of dementia in the United States. Sleep is one of the lifestyle-related factors that have been shown to be critical for optimal cognitive function in old age. However, there is a lack of research studying the association between sleep and AD incidence. A major bottleneck for conducting such research is that the traditional way to acquire sleep information is time-consuming, inefficient, non-scalable, and limited to patients' subjective experience. We aim to automate the extraction of specific sleep-related patterns, such as snoring, napping, poor sleep quality, daytime sleepiness, night wakings, other sleep problems, and sleep duration, from clinical notes of AD patients. These sleep patterns are hypothesized to play a role in the incidence of AD, providing insight into the relationship between sleep and AD onset and progression. MATERIALS AND METHODS: A gold standard dataset was created by manually annotating 570 randomly sampled clinical note documents from adSLEEP, a corpus of 192 000 de-identified clinical notes of 7266 AD patients retrieved from the University of Pittsburgh Medical Center (UPMC). We developed a rule-based natural language processing (NLP) algorithm, machine learning models, and large language model (LLM)-based NLP algorithms to automate the extraction of sleep-related concepts, including snoring, napping, sleep problem, bad sleep quality, daytime sleepiness, night wakings, and sleep duration, from the gold standard dataset. RESULTS: The annotated dataset of 482 patients comprised a predominantly White (89.2%), older adult population with an average age of 84.7 years, where females represented 64.1%, and a vast majority were non-Hispanic or Latino (94.6%). The rule-based NLP algorithm achieved the best F1 score across all sleep-related concepts. In terms of positive predictive value (PPV), the rule-based NLP algorithm achieved the highest PPV scores for daytime sleepiness (1.00) and sleep duration (1.00), while the machine learning models had the highest PPV for napping (0.95) and bad sleep quality (0.86), and LLAMA2 with fine-tuning had the highest PPV for night wakings (0.93) and sleep problem (0.89). DISCUSSION: Although sleep information is infrequently documented in the clinical notes, the proposed rule-based NLP algorithm and LLM-based NLP algorithms still achieved promising results. In comparison, the machine learning-based approaches did not achieve good results, which is due to the small amount of sleep information in the training data. CONCLUSION: The results show that the rule-based NLP algorithm consistently achieved the best performance for all sleep concepts. This study focused on the clinical notes of patients with AD but could be extended to general sleep information extraction for other diseases.
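
As a rough illustration of what a rule-based extractor for sleep concepts can look like, the sketch below matches a handful of keyword patterns against a note. The patterns, the `SLEEP_RULES` name, and the `extract_sleep_concepts` function are assumptions made for this example; the abstract does not describe the study's actual rules.

```python
import re

# Hypothetical keyword patterns, one per sleep concept named in the abstract.
# The study's real rule set is not given, so these are illustrative only.
SLEEP_RULES = {
    "snoring": re.compile(r"\bsnor(?:e|es|ed|ing)\b", re.IGNORECASE),
    "napping": re.compile(r"\bnap(?:s|ped|ping)?\b", re.IGNORECASE),
    "daytime_sleepiness": re.compile(
        r"\bdaytime (?:sleepiness|somnolence)\b", re.IGNORECASE
    ),
    "sleep_duration": re.compile(
        r"\bsleeps? (?:about |around )?\d+(?:\.\d+)? ?(?:hours|hrs|h)\b",
        re.IGNORECASE,
    ),
}


def extract_sleep_concepts(note: str) -> dict:
    """Return each matched concept with the text spans that triggered it."""
    found = {}
    for concept, pattern in SLEEP_RULES.items():
        matches = [m.group(0) for m in pattern.finditer(note)]
        if matches:
            found[concept] = matches
    return found


if __name__ == "__main__":
    note = "Patient snores loudly, naps after lunch, and sleeps about 5 hours per night."
    print(extract_sleep_concepts(note))
```

Rules of this kind tend to be precise when a concept is phrased in a small number of ways, which is one plausible reading of the high PPV reported for sleep duration and daytime sleepiness.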

2.
JMIR Med Inform ; 12: e55318, 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-38587879

ABSTRACT

BACKGROUND: Large language models (LLMs) have shown remarkable capabilities in natural language processing (NLP), especially in domains where labeled data are scarce or expensive, such as the clinical domain. However, to unlock the clinical knowledge hidden in these LLMs, we need to design effective prompts that can guide them to perform specific clinical NLP tasks without any task-specific training data. This is known as in-context learning, which is an art and science that requires understanding the strengths and weaknesses of different LLMs and prompt engineering approaches. OBJECTIVE: The objective of this study is to assess the effectiveness of various prompt engineering techniques, including 2 newly introduced types (heuristic and ensemble prompts), for zero-shot and few-shot clinical information extraction using pretrained language models. METHODS: This comprehensive experimental study evaluated different prompt types (simple prefix, simple cloze, chain of thought, anticipatory, heuristic, and ensemble) across 5 clinical NLP tasks: clinical sense disambiguation, biomedical evidence extraction, coreference resolution, medication status extraction, and medication attribute extraction. The performance of these prompts was assessed using 3 state-of-the-art language models: GPT-3.5 (OpenAI), Gemini (Google), and LLaMA-2 (Meta). The study contrasted zero-shot with few-shot prompting and explored the effectiveness of ensemble approaches. RESULTS: The study revealed that task-specific prompt tailoring is vital for the high performance of LLMs for zero-shot clinical NLP. In clinical sense disambiguation, GPT-3.5 achieved an accuracy of 0.96 with heuristic prompts; in biomedical evidence extraction, it achieved 0.94. Heuristic prompts, alongside chain of thought prompts, were highly effective across tasks. Few-shot prompting improved performance in complex scenarios, and ensemble approaches capitalized on multiple prompt strengths. GPT-3.5 consistently outperformed Gemini and LLaMA-2 across tasks and prompt types. CONCLUSIONS: This study provides a rigorous evaluation of prompt engineering methodologies and introduces innovative techniques for clinical information extraction, demonstrating the potential of in-context learning in the clinical domain. These findings offer clear guidelines for future prompt-based clinical NLP research, facilitating engagement by non-NLP experts in clinical NLP advancements. To the best of our knowledge, this is one of the first works on the empirical evaluation of different prompt engineering approaches for clinical NLP in this era of generative artificial intelligence, and we hope that it will inspire and inform future research in this area.
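
To make the prompt types more concrete, here is a minimal sketch of three of them applied to clinical sense disambiguation, plus one possible way to ensemble their answers. The template wording, the majority-vote rule, and the `fake_model` stand-in are all assumptions for illustration; the abstract does not give the study's prompt text or say how the ensemble combines outputs.

```python
from collections import Counter
from typing import Callable, Dict

# Illustrative templates only; not the wording used in the study.
PROMPT_TEMPLATES = {
    "simple_prefix": "Task: decide what '{term}' means in this note.\nNote: {note}\nAnswer:",
    "simple_cloze": "Note: {note}\nIn this note, '{term}' stands for ____.",
    "chain_of_thought": "Note: {note}\nWhat does '{term}' mean here? Let's think step by step.",
}


def build_prompts(note: str, term: str) -> Dict[str, str]:
    """Fill every template with the same note and ambiguous term."""
    return {name: t.format(note=note, term=term) for name, t in PROMPT_TEMPLATES.items()}


def ensemble_answer(prompts: Dict[str, str], ask_model: Callable[[str], str]) -> str:
    """Majority vote over the normalized answers returned for each prompt type."""
    votes = Counter(ask_model(p).strip().lower() for p in prompts.values())
    return votes.most_common(1)[0][0]


def fake_model(prompt: str) -> str:
    # Stand-in for a real LLM call, so the sketch runs without any API access.
    if "step by step" in prompt:
        return "prothrombin time"
    return "Physical Therapy"


if __name__ == "__main__":
    prompts = build_prompts("Pt referred to PT for gait training.", "PT")
    print(ensemble_answer(prompts, fake_model))
```

Passing the model as a callable keeps the prompt construction and the voting logic testable on their own, independent of which LLM is eventually plugged in.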

3.
JMIR Med Inform ; 12: e52289, 2024 Apr 03.
Article in English | MEDLINE | ID: mdl-38568736

ABSTRACT

BACKGROUND: The rehabilitation of a patient who had a stroke requires precise, personalized treatment plans. Natural language processing (NLP) offers the potential to extract valuable exercise information from clinical notes, aiding in the development of more effective rehabilitation strategies. OBJECTIVE: This study aims to develop and evaluate a variety of NLP algorithms to extract and categorize physical rehabilitation exercise information from the clinical notes of patients who had a stroke treated at the University of Pittsburgh Medical Center. METHODS: A cohort of 13,605 patients diagnosed with stroke was identified, and their clinical notes containing rehabilitation therapy notes were retrieved. A comprehensive clinical ontology was created to represent various aspects of physical rehabilitation exercises. State-of-the-art NLP algorithms were then developed and compared, including rule-based algorithms, machine learning-based algorithms (support vector machine, logistic regression, gradient boosting, and AdaBoost), and large language model (LLM)-based algorithms (ChatGPT [OpenAI]). The study focused on key performance metrics, particularly F1-scores, to evaluate algorithm effectiveness. RESULTS: The analysis was conducted on a data set comprising 23,724 notes with detailed demographic and clinical characteristics. The rule-based NLP algorithm demonstrated superior performance in most areas, particularly in detecting the "Right Side" location with an F1-score of 0.975, outperforming gradient boosting by 0.063. Gradient boosting excelled in "Lower Extremity" location detection (F1-score: 0.978), surpassing rule-based NLP by 0.023. It also showed notable performance in the "Passive Range of Motion" detection with an F1-score of 0.970, a 0.032 improvement over rule-based NLP. The rule-based algorithm efficiently handled "Duration," "Sets," and "Reps" with F1-scores up to 0.65. LLM-based NLP, particularly ChatGPT with few-shot prompts, achieved high recall but generally lower precision and F1-scores. However, it notably excelled in "Backward Plane" motion detection, achieving an F1-score of 0.846, surpassing the rule-based algorithm's 0.720. CONCLUSIONS: The study successfully developed and evaluated multiple NLP algorithms, revealing the strengths and weaknesses of each in extracting physical rehabilitation exercise information from clinical notes. The detailed ontology and the robust performance of the rule-based and gradient boosting algorithms demonstrate significant potential for enhancing precision rehabilitation. These findings contribute to the ongoing efforts to integrate advanced NLP techniques into health care, moving toward predictive models that can recommend personalized rehabilitation treatments for optimal patient outcomes.
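
The observation that high recall can coexist with a lower F1-score follows directly from how the metrics are defined. The sketch below computes precision, recall, and F1 from raw counts; the counts in the usage example are invented for illustration and do not come from the study.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean, so it is pulled toward the lower of the two values.
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical extractor that finds most true mentions but over-predicts.
    p, r, f = precision_recall_f1(tp=90, fp=60, fn=10)
    print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

With these made-up counts, recall is 0.90 while precision is 0.60, and F1 lands near 0.72, which is the pattern the abstract describes for the LLM-based approach.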

4.
J Healthc Inform Res ; 8(2): 313-352, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38681755

ABSTRACT

Clinical information retrieval (IR) plays a vital role in modern healthcare by facilitating efficient access and analysis of medical literature for clinicians and researchers. This scoping review aims to offer a comprehensive overview of the current state of clinical IR research and identify gaps and potential opportunities for future studies in this field. The main objective was to assess and analyze the existing literature on clinical IR, focusing on the methods, techniques, and tools employed for effective retrieval and analysis of medical information. Adhering to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, we conducted an extensive search across databases such as Ovid Embase, Ovid Medline, Scopus, ACM Digital Library, IEEE Xplore, and Web of Science, covering publications from January 1, 2010, to January 4, 2023. The rigorous screening process led to the inclusion of 184 papers in our review. Our findings provide a detailed analysis of the clinical IR research landscape, covering aspects like publication trends, data sources, methodologies, evaluation metrics, and applications. The review identifies key research gaps in clinical IR methods such as indexing, ranking, and query expansion, offering insights and opportunities for future studies in clinical IR, thus serving as a guiding framework for upcoming research efforts in this rapidly evolving field. The study also underscores an imperative for innovative research on advanced clinical IR systems capable of fast semantic vector search and adoption of neural IR techniques for effective retrieval of information from unstructured electronic health records (EHRs). Supplementary Information: The online version contains supplementary material available at 10.1007/s41666-024-00159-4.

5.
J Biomed Inform ; 148: 104544, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37995843

ABSTRACT

OBJECTIVE: To pre-train fair and unbiased patient representations from Electronic Health Records (EHRs) using a novel weighted loss function that reduces bias and improves fairness in deep representation learning models. METHODS: We defined a new loss function, called weighted loss function, in the deep representation learning model to balance the importance of different groups of patients and features. We applied the proposed model, called Fair Patient Model (FPM), to a sample of 34,739 patients from the MIMIC-III dataset and learned patient representations for four clinical outcome prediction tasks. RESULTS: FPM outperformed the baseline models in terms of three fairness metrics: demographic parity, equality of opportunity difference, and equalized odds ratio. FPM also achieved comparable predictive performance with the baselines, with an average accuracy of 0.7912. Feature analysis revealed that FPM captured more information from clinical features than the baselines. CONCLUSION: FPM is a novel method to pre-train fair and unbiased patient representations from the EHR data using a weighted loss function. The learned representations can be used for various downstream tasks in healthcare and can be extended to other domains where fairness is important.
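
Two of the fairness metrics named above can be sketched in a few lines. The definitions below follow the commonly used "difference" forms (gap in positive-prediction rate, and gap in true-positive rate, across groups); the paper's exact formulations may differ, and the function names and toy data are assumptions for this example.

```python
from typing import List, Sequence


def positive_rate(preds: Sequence[int]) -> float:
    """Fraction of predictions equal to 1 (0.0 for an empty group)."""
    return sum(preds) / len(preds) if preds else 0.0


def demographic_parity_difference(y_pred: List[int], groups: List[str]) -> float:
    """Largest gap in positive-prediction rate between any two groups."""
    rates = [
        positive_rate([p for p, g in zip(y_pred, groups) if g == group])
        for group in set(groups)
    ]
    return max(rates) - min(rates)


def equal_opportunity_difference(
    y_true: List[int], y_pred: List[int], groups: List[str]
) -> float:
    """Largest gap in true-positive rate between any two groups."""
    rates = [
        positive_rate(
            [p for t, p, g in zip(y_true, y_pred, groups) if g == group and t == 1]
        )
        for group in set(groups)
    ]
    return max(rates) - min(rates)


if __name__ == "__main__":
    # Toy labels and predictions for two hypothetical patient groups.
    y_true = [1, 1, 0, 0, 1, 1, 0, 0]
    y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
    groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
    print(demographic_parity_difference(y_pred, groups))
    print(equal_opportunity_difference(y_true, y_pred, groups))
```

A value of 0 on either metric means the groups are treated identically by that criterion; larger values indicate a wider gap.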


Subjects
Benchmarking, Electronic Health Records, Humans, Prognosis
6.
JMIR AI ; 2: e44293, 2023 May 04.
Article in English | MEDLINE | ID: mdl-38875537

ABSTRACT

BACKGROUND: Natural language processing (NLP) has become an emerging technology in health care that leverages a large amount of free-text data in electronic health records to improve patient care, support clinical decisions, and facilitate clinical and translational science research. Recently, deep learning has achieved state-of-the-art performance in many clinical NLP tasks. However, training deep learning models often requires large, annotated data sets, which are normally not publicly available and can be time-consuming to build in clinical domains. Working with smaller annotated data sets is typical in clinical NLP; therefore, ensuring that deep learning models perform well is crucial for real-world clinical NLP applications. A widely adopted approach is fine-tuning existing pretrained language models, but these attempts fall short when the training data set contains only a few annotated samples. Few-shot learning (FSL) has recently been investigated to tackle this problem. The Siamese neural network (SNN) has been widely used as an FSL approach in computer vision but has not been studied well in NLP. Furthermore, the literature on its applications in clinical domains is scarce. OBJECTIVE: The aim of our study is to propose and evaluate SNN-based approaches for few-shot clinical NLP tasks. METHODS: We propose 2 SNN-based FSL approaches: pretrained SNN and SNN with second-order embeddings. We evaluate the proposed approaches on the clinical sentence classification task. We experiment with 3 few-shot settings: 4-shot, 8-shot, and 16-shot learning. The clinical NLP task is benchmarked using the following 4 pretrained language models: bidirectional encoder representations from transformers (BERT), BERT for biomedical text mining (BioBERT), BioBERT trained on clinical notes (BioClinicalBERT), and generative pretrained transformer 2 (GPT-2). We also present a performance comparison between SNN-based approaches and the prompt-based GPT-2 approach. RESULTS: In 4-shot sentence classification tasks, GPT-2 had the highest precision (0.63), but its recall (0.38) and F score (0.42) were lower than those of BioBERT-based pretrained SNN (0.45 and 0.46, respectively). In both 8-shot and 16-shot settings, SNN-based approaches outperformed GPT-2 in all 3 metrics of precision, recall, and F score. CONCLUSIONS: The experimental results verified the effectiveness of the proposed SNN approaches for few-shot clinical NLP tasks.
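
The core idea of a Siamese approach to few-shot classification is that one shared encoder embeds both the query and the labeled support examples, and the query takes the label of its nearest neighbor. The sketch below shows only that inference step, with a toy bag-of-words encoder standing in for a pretrained model such as BioBERT; the encoder, the example sentences, and the labels are assumptions, and no training of the network is shown.

```python
import math
from collections import Counter
from typing import List, Tuple


def encode(sentence: str) -> Counter:
    # Toy bag-of-words encoder; a real SNN would use a learned sentence embedding.
    return Counter(sentence.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def classify(query: str, support: List[Tuple[str, str]]) -> str:
    """Label the query with the class of its most similar support sentence."""
    q = encode(query)  # the same encoder is applied to both branches
    best = max(support, key=lambda example: cosine(q, encode(example[0])))
    return best[1]


if __name__ == "__main__":
    # Hypothetical support set with two labeled sentences per class.
    support = [
        ("patient takes aspirin daily", "medication"),
        ("continue metformin twice daily", "medication"),
        ("mother had breast cancer", "family_history"),
        ("father died of heart disease", "family_history"),
    ]
    print(classify("sister had breast cancer", support))
```

Because classification reduces to comparing embeddings, adding a new class only requires a few labeled sentences for the support set, which is what makes the approach attractive when annotated clinical data are scarce.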

7.
Cells ; 11(11), 2022 Jun 02.
Article in English | MEDLINE | ID: mdl-35681523

ABSTRACT

Organ-on-a-chip (OOAC) is an emerging technology based on microfluidic platforms and in vitro cell culture that has a promising future in the healthcare industry. The numerous advantages of OOAC over conventional systems make it highly popular. The chip is an innovative combination of novel technologies, including lab-on-a-chip, microfluidics, biomaterials, and tissue engineering. This paper begins by analyzing the need for the development of OOAC, followed by a brief introduction to the technology. Later sections discuss and review the various types of OOACs and the fabrication materials used. The implementation of artificial intelligence in the system makes it more advanced, thereby helping to provide a more accurate diagnosis as well as convenient data management. We introduce selected OOAC projects, including applications to organ/disease modelling, pharmacology, personalized medicine, and dentistry. Finally, we point out certain challenges that need to be surmounted in order to further develop and upgrade the current systems.


Subjects
Artificial Intelligence, Lab-On-A-Chip Devices, Biocompatible Materials, Microfluidics, Tissue Engineering
8.
AMIA Annu Symp Proc ; 2022: 972-981, 2022.
Article in English | MEDLINE | ID: mdl-37128372

ABSTRACT

Developing clinical natural language processing (NLP) systems based on machine learning and deep learning is dependent on the availability of large-scale annotated clinical text datasets, most of which are time-consuming to create and not publicly available. The lack of such annotated datasets is the biggest bottleneck for the development of clinical NLP systems. Zero-Shot Learning (ZSL) refers to the use of deep learning models to classify instances from new classes for which no training data have been seen before. Prompt-based learning is an emerging ZSL technique in NLP where we define task-based templates for different tasks. In this study, we developed a novel prompt-based clinical NLP framework called HealthPrompt and applied the paradigm of prompt-based learning on clinical texts. In this technique, rather than fine-tuning a Pre-trained Language Model (PLM), the task definitions are tuned by defining a prompt template. We performed an in-depth analysis of HealthPrompt on six different PLMs in a no-training-data setting. Our experiments show that HealthPrompt could effectively capture the context of clinical texts and perform well for clinical NLP tasks without any training data.
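
One common way to turn a prompt template into a zero-shot classifier is to ask a masked language model to fill a slot and then map the predicted words to class labels. The sketch below shows that general pattern; it is not HealthPrompt's implementation. The template text, the word-to-label mapping, and the `fake_mask_scores` stand-in for a PLM are all assumptions made for this example.

```python
from typing import Callable, Dict

# Hypothetical cloze-style template with a slot for the model to fill.
TEMPLATE = "{text} This note is about [MASK]."

# Hypothetical mapping from class labels to words the model might predict.
LABEL_WORDS = {
    "cardiology": ["heart", "cardiac"],
    "neurology": ["brain", "stroke"],
}


def zero_shot_classify(
    text: str, mask_scores: Callable[[str], Dict[str, float]]
) -> str:
    """Pick the label whose associated words receive the most probability mass."""
    prompt = TEMPLATE.format(text=text)
    scores = mask_scores(prompt)
    label_scores = {
        label: sum(scores.get(word, 0.0) for word in words)
        for label, words in LABEL_WORDS.items()
    }
    return max(label_scores, key=label_scores.get)


def fake_mask_scores(prompt: str) -> Dict[str, float]:
    # Stand-in for a PLM's probabilities at the [MASK] position.
    return {"heart": 0.40, "cardiac": 0.15, "brain": 0.05, "stroke": 0.10}


if __name__ == "__main__":
    print(zero_shot_classify("Patient reports chest pain on exertion.", fake_mask_scores))
```

No model weights change in this setup; only the template and the label words are adjusted, which matches the abstract's description of tuning the task definition instead of the PLM.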


Subjects
Machine Learning, Natural Language Processing, Humans