1.
J Am Med Inform Assoc ; 30(12): 2036-2040, 2023 11 17.
Article in English | MEDLINE | ID: mdl-37555837

ABSTRACT

Despite recent methodological advances in clinical natural language processing (NLP), the adoption of clinical NLP models within the translational research community remains hindered by process heterogeneity and human factor variations. These factors also dramatically increase the difficulty of developing NLP models in multi-site settings, which is necessary for algorithm robustness and generalizability. Here, we report on our experience developing an NLP solution for Coronavirus Disease 2019 (COVID-19) sign and symptom extraction in an open NLP framework from a subset of sites participating in the National COVID Cohort Collaborative (N3C). We then empirically highlight the benefits of multi-site data for both symbolic and statistical methods, as well as the need for federated annotation and evaluation to resolve several pitfalls encountered in the course of these efforts.


Subjects
COVID-19 , Natural Language Processing , Humans , Electronic Health Records , Algorithms
2.
BMC Med Inform Decis Mak ; 22(Suppl 1): 88, 2022 07 07.
Article in English | MEDLINE | ID: mdl-35799294

ABSTRACT

BACKGROUND: Since no effective therapies exist for Alzheimer's disease (AD), prevention through lifestyle changes and interventions has become more critical. Analyzing electronic health records (EHRs) of patients with AD can help us better understand the effect of lifestyle on AD. However, lifestyle information is typically stored in clinical narratives. Thus, the objective of the study was to compare different natural language processing (NLP) models for classifying lifestyle statuses (e.g., physical activity and excessive diet) from clinical texts in English. METHODS: Based on the collected concept unique identifiers (CUIs) associated with lifestyle status, we extracted all related EHRs for patients with AD from the Clinical Data Repository (CDR) of the University of Minnesota (UMN). We automatically generated labels for the training data using a rule-based NLP algorithm. We conducted weak supervision for pre-trained Bidirectional Encoder Representations from Transformers (BERT) models, with three traditional machine learning models as baselines, on the weakly labeled training corpus. These models include the BERT base model, PubMedBERT (abstracts + full text), PubMedBERT (only abstracts), Unified Medical Language System (UMLS) BERT, Bio BERT, Bio-clinical BERT, logistic regression, support vector machine, and random forest. We performed two case studies, physical activity and excessive diet, to validate the effectiveness of BERT models in classifying lifestyle status. All models were evaluated and compared on the developed Gold Standard Corpus (GSC) for the two case studies. The rule-based model used for weak supervision was also tested on the GSC for comparison. RESULTS: The UMLS BERT model achieved the best performance for classifying status of physical activity, with precision, recall, and F-1 scores of 0.93, 0.93, and 0.92, respectively. For classifying excessive diet, the Bio-clinical BERT model showed the best performance, with precision, recall, and F-1 scores of 0.93, 0.93, and 0.93, respectively. CONCLUSION: The proposed approach leveraging weak supervision could significantly increase the sample size, which is required for training the deep learning models. In comparison with the traditional machine learning models, the study also demonstrates the high performance of BERT models for classifying lifestyle status for Alzheimer's disease in clinical notes.
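
The weak-labeling and evaluation steps described in this abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the keyword rules, the `weak_label` function, and the example sentences are hypothetical and are not the study's rule-based algorithm or data. The sketch only shows the general idea of generating training labels with rules and scoring predictions against a gold standard with precision, recall, and F1.

```python
import re

# Hypothetical keyword rules for illustration only; the study's actual
# rule-based algorithm is not described in the abstract.
ACTIVITY = r"(exercis\w*|walks?|jogs?|swims?)"
MENTION = re.compile(rf"\b{ACTIVITY}\b", re.IGNORECASE)
NEGATED = re.compile(rf"\b(no|not|denies|unable to)\b[^.]*\b{ACTIVITY}\b",
                     re.IGNORECASE)


def weak_label(sentence):
    """Return 1 (active), 0 (inactive), or None when no rule fires."""
    if NEGATED.search(sentence):
        return 0
    if MENTION.search(sentence):
        return 1
    return None


def precision_recall_f1(gold, pred, positive=1):
    """Precision, recall, and F1 for one class over parallel label lists."""
    pairs = list(zip(gold, pred))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up sentences standing in for clinical text.
notes = [
    "Patient walks daily and exercises twice a week.",
    "Patient denies exercising.",
    "Patient reports headache.",
]
# Keep only sentences where a rule fired; these become weak training labels.
weak_training_set = [(s, weak_label(s)) for s in notes
                     if weak_label(s) is not None]
print(weak_training_set)
```

In a weak-supervision setup, a set like `weak_training_set` would feed a downstream classifier, while a separately annotated gold corpus is reserved for evaluation with a function such as `precision_recall_f1`.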


Subjects
Alzheimer Disease , Deep Learning , Humans , Life Style , Natural Language Processing , Unified Medical Language System
3.
J Biomed Inform ; 131: 104120, 2022 07.
Article in English | MEDLINE | ID: mdl-35709900

ABSTRACT

OBJECTIVE: Develop a novel methodology to create a comprehensive knowledge graph (SuppKG) to represent a domain with limited coverage in the Unified Medical Language System (UMLS), specifically dietary supplement (DS) information for discovering drug-supplement interactions (DSI), by leveraging biomedical natural language processing (NLP) technologies and a DS domain terminology. MATERIALS AND METHODS: We created SemRepDS (an extension of an NLP tool, SemRep), capable of extracting semantic relations from abstracts by leveraging a DS-specific terminology (iDISK) containing 28,884 DS terms not found in the UMLS. PubMed abstracts were processed using SemRepDS to generate semantic relations, which were then filtered using a PubMedBERT model to remove incorrect relations before generating SuppKG. Two discovery pathways were applied to SuppKG to identify potential DSIs, which were then compared with an existing DSI database and also evaluated by medical professionals for mechanistic plausibility. RESULTS: SemRepDS returned 158.5% more DS entities and 206.9% more DS relations than SemRep. The fine-tuned PubMedBERT model, which significantly outperformed other machine learning and BERT models, obtained an F1 score of 0.8605 and removed 43.86% of semantic relations, improving the precision of the relations by 26.4% over pre-filtering. SuppKG consists of 56,635 nodes and 595,222 directed edges with 2,928 DS-specific nodes and 164,738 edges. Manual review of findings identified 182 of 250 (72.8%) proposed DS-Gene-Drug and 77 of 100 (77%) proposed DS-Gene1-Function-Gene2-Drug pathways to be mechanistically plausible. DISCUSSION: With DS terminology added to the UMLS, SemRepDS can find more DS-specific semantic relationships from PubMed than SemRep. The utility of the resulting SuppKG was demonstrated using discovery patterns to find novel DSIs. CONCLUSION: For a domain with limited coverage in a traditional terminology (e.g., UMLS), we demonstrated an approach that leverages domain terminology and improves existing NLP tools to generate a more comprehensive knowledge graph for the downstream task. Even though this study focuses on DSI, the method may be adapted to other domains.
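
As a rough illustration of the DS-Gene-Drug discovery pathway mentioned in this abstract, the sketch below enumerates two-hop paths over a small set of typed triples. It is an assumption-laden toy: the node names, the type labels, the edge directions, and the `ds_gene_drug_paths` helper are hypothetical placeholders, not SuppKG content or the authors' implementation.

```python
from collections import defaultdict


def ds_gene_drug_paths(triples, node_type):
    """Enumerate DS -> Gene -> Drug paths in a list of (subj, pred, obj) triples.

    node_type maps each node name to a coarse type string
    ("DS", "Gene", "Drug", ...). Edge direction is an assumption.
    """
    outgoing = defaultdict(list)
    for subj, pred, obj in triples:
        outgoing[subj].append((pred, obj))

    paths = []
    for ds, edges in outgoing.items():
        if node_type.get(ds) != "DS":
            continue
        for pred1, gene in edges:
            if node_type.get(gene) != "Gene":
                continue
            # .get avoids adding keys to the dict while iterating over it.
            for pred2, drug in outgoing.get(gene, []):
                if node_type.get(drug) == "Drug":
                    paths.append((ds, pred1, gene, pred2, drug))
    return paths


# Placeholder graph; none of these names or relations are real findings.
triples = [
    ("supplement_x", "INHIBITS", "gene_y"),
    ("gene_y", "INTERACTS_WITH", "drug_z"),
    ("supplement_x", "AFFECTS", "disease_q"),
]
node_type = {
    "supplement_x": "DS",
    "gene_y": "Gene",
    "drug_z": "Drug",
    "disease_q": "Disease",
}
for path in ds_gene_drug_paths(triples, node_type):
    print(" -> ".join(path))
```

Each returned path is only a candidate interaction; as the abstract notes, the study's candidates were further compared against an existing database and reviewed by medical professionals.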


Subjects
Natural Language Processing , Unified Medical Language System , Dietary Supplements , PubMed , Semantics
4.
Am J Otolaryngol ; 43(4): 103510, 2022.
Article in English | MEDLINE | ID: mdl-35636088

ABSTRACT

OBJECTIVE: Scribes in medical practice enable more efficient documentation, but insufficient analyses have been conducted to fully evaluate their efficacy in otolaryngology. We analyzed pre/post metrics of scribe implementation that may aid practitioners in determining feasibility for use in their practices. METHODS: 1808 patient charts were analyzed in the Epic electronic medical record (EMR) system (903 pre and 905 post scribe implementation). We measured clinic volumes, time saved in documentation, chart billing level, and lag days of chart closure. RESULTS: Patient volumes increased by 3.02% with an 11-17% decrease in time spent in clinic/day and lag days for billing. The distribution of visits for new patients was 17.75% level 2, 51.45% level 3, 29.71% level 4 before the scribe and 6.83% level 2, 89.21% level 3, 3.96% level 4 after the scribe. For established patients it was 3.97% level 2, 84.92% level 3, 8.93% level 4 before and 0.34% level 2, 91.76% level 3, 7.73% level 4 after. The change in level of documentation for established and new patients pre and post scribe implementation was not statistically significant (p = 0.821, 0.063, respectively). Charts were closed within 0 to 7 days with the implementation of a scribe instead of 7-21 days when awaiting dictations for transcription. CONCLUSIONS: The implementation of a scribe in an academic otolaryngology clinic facilitated more rapid completion of documentation while decreasing provider hours/day in clinic. We believe the analysis generalizes to otolaryngology practitioners and that the data structures we implemented are usable by others.


Subjects
Otolaryngology , Patient Satisfaction , Ambulatory Care Facilities , Documentation , Efficiency , Humans
5.
AMIA Annu Symp Proc ; 2022: 756-765, 2022.
Article in English | MEDLINE | ID: mdl-37128405

ABSTRACT

Remote patient monitoring (RPM) programs are increasingly used to manage acute and chronic disease, including acute COVID-19. The goal of this study is to explore the topics and patterns of patients' messages to the care team in an RPM program for patients with presumed COVID-19. We conducted a topic analysis of 6,262 comments from 3,248 patients enrolled in the COVID-19 RPM at M Health Fairview. Evaluation of comments was performed using LDA and CorEx topic modeling. Subject matter experts evaluated the topic models, including identifying and defining topics and categories. Topics were plotted over time to identify trends in topic weights over the enrollment period. The overall accuracy of comment assignment to topics by the LDA and CorEx models was 72.8% and 88.2%, respectively. Most identified topics focused on signs and symptoms of COVID-19. Topics related to COVID-19 diagnosis demonstrated a correlation with announcements of availability of viral and antibody testing in national and local media.


Subjects
COVID-19 , Humans , COVID-19 Testing , Monitoring, Physiologic
6.
JAMIA Open ; 4(4): ooab081, 2021 Oct.
Article in English | MEDLINE | ID: mdl-34632323

ABSTRACT

OBJECTIVE: The objective of this study is to develop a deep learning pipeline to detect signals on dietary supplement-related adverse events (DS AEs) from Twitter. MATERIALS AND METHODS: We obtained 247 807 tweets ranging from 2012 to 2018 that mentioned both DS and AE. We designed a tailor-made annotation guideline for DS AEs and annotated biomedical entities and relations on 2000 tweets. For the concept extraction task, we fine-tuned and compared the performance of BioClinical-BERT, PubMedBERT, ELECTRA, RoBERTa, and DeBERTa models with a CRF classifier. For the relation extraction task, we fine-tuned and compared BERT models to BioClinical-BERT, PubMedBERT, RoBERTa, and DeBERTa models. We chose the best-performing models in each task to assemble an end-to-end deep learning pipeline to detect DS AE signals and compared the results to the known DS AEs from a DS knowledge base (ie, iDISK). RESULTS: The DeBERTa-CRF model outperformed other models in the concept extraction task, with a lenient microaveraged F1 score of 0.866. The RoBERTa model outperformed other models in the relation extraction task, with a lenient microaveraged F1 score of 0.788. The end-to-end pipeline built on these 2 models was able to extract DS indications and DS AEs with a lenient microaveraged F1 score of 0.666. CONCLUSION: We have developed a deep learning pipeline that can detect DS AE signals from Twitter. We have found DS AEs that were not recorded in an existing knowledge base (iDISK), and our proposed pipeline can assist DS AE pharmacovigilance.
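
The lenient microaveraged F1 score reported in this abstract can be sketched as follows. The matching criterion is an assumption: here a predicted entity counts as correct when it has the same label as a gold entity and their character spans overlap at all, and counts are pooled across documents before computing F1. The study's exact matching rules are not stated in the abstract, and the function names and example spans are hypothetical.

```python
def spans_overlap(a, b):
    """True when two (start, end, label) spans share at least one character."""
    return a[0] < b[1] and b[0] < a[1]


def lenient_micro_f1(gold_docs, pred_docs):
    """Micro-averaged F1 with overlap-based (lenient) entity matching.

    gold_docs and pred_docs are parallel lists; each document is a list of
    (start, end, label) tuples. Each gold entity can be matched at most once.
    """
    tp = fp = fn = 0
    for gold, pred in zip(gold_docs, pred_docs):
        matched = set()
        for p in pred:
            hit = next(
                (i for i, g in enumerate(gold)
                 if i not in matched and g[2] == p[2] and spans_overlap(g, p)),
                None,
            )
            if hit is None:
                fp += 1
            else:
                matched.add(hit)
                tp += 1
        fn += len(gold) - len(matched)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0


# Made-up spans: the DS prediction overlaps its gold span, the AE one does not.
gold_docs = [[(0, 5, "DS"), (10, 18, "AE")]]
pred_docs = [[(0, 4, "DS"), (20, 25, "AE")]]
print(lenient_micro_f1(gold_docs, pred_docs))
```

A strict variant would require identical boundaries instead of overlap, which typically yields lower scores on the same predictions.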

7.
J Biomed Inform ; 115: 103696, 2021 03.
Article in English | MEDLINE | ID: mdl-33571675

ABSTRACT

OBJECTIVE: To discover candidate drugs to repurpose for COVID-19 using literature-derived knowledge and knowledge graph completion methods. METHODS: We propose a novel, integrative, and neural network-based literature-based discovery (LBD) approach to identify drug candidates from PubMed and other COVID-19-focused research literature. Our approach relies on semantic triples extracted using SemRep (via SemMedDB). We identified an informative and accurate subset of semantic triples using filtering rules and an accuracy classifier developed on a BERT variant. We used this subset to construct a knowledge graph, and applied five state-of-the-art, neural knowledge graph completion algorithms (i.e., TransE, RotatE, DistMult, ComplEx, and STELP) to predict drug repurposing candidates. The models were trained and assessed using a time slicing approach and the predicted drugs were compared with a list of drugs reported in the literature and evaluated in clinical trials. These models were complemented by a discovery pattern-based approach. RESULTS: The accuracy classifier based on PubMedBERT achieved the best performance (F1 = 0.854) in identifying accurate semantic predications. Among the five knowledge graph completion models, TransE outperformed the others (MR = 0.923, Hits@1 = 0.417). Some known drugs linked to COVID-19 in the literature were identified, as well as others that have not yet been studied. Discovery patterns enabled identification of additional candidate drugs and generation of plausible hypotheses regarding the links between the candidate drugs and COVID-19. Among them, five highly ranked and novel drugs (i.e., paclitaxel, SB 203580, alpha 2-antiplasmin, metoclopramide, and oxymatrine) and the mechanistic explanations for their potential use are further discussed. CONCLUSION: We showed that an LBD approach can be feasible not only for discovering drug candidates for COVID-19, but also for generating mechanistic explanations. Our approach can be generalized to other diseases as well as to other clinical questions. Source code and data are available at https://github.com/kilicogluh/lbd-covid.
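
To make the knowledge graph completion step more concrete, here is a minimal sketch of TransE-style scoring and the Hits@k metric. It is not the authors' code (their implementation is in the repository linked above). The two-dimensional embeddings, entity names, and helper functions are hypothetical, and real models learn embeddings from the graph rather than using hand-set values. TransE treats a triple (head, relation, tail) as plausible when head + relation lies close to tail in embedding space.

```python
import math


def transe_score(head, rel, tail):
    """Negative L2 distance ||head + rel - tail||; higher is more plausible."""
    return -math.sqrt(sum((h + r - t) ** 2
                          for h, r, t in zip(head, rel, tail)))


def rank_of_true_tail(head, rel, true_tail, entity_emb, rel_emb):
    """1-based rank of the true tail among all entities as candidate tails."""
    scores = {
        name: transe_score(entity_emb[head], rel_emb[rel], vec)
        for name, vec in entity_emb.items()
    }
    ordered = sorted(scores, key=scores.get, reverse=True)
    return ordered.index(true_tail) + 1


def hits_at_k(ranks, k):
    """Fraction of test triples whose true entity is ranked in the top k."""
    return sum(r <= k for r in ranks) / len(ranks)


# Hand-set toy embeddings, purely illustrative.
entity_emb = {
    "drug_a": [1.0, 0.0],
    "drug_b": [0.0, 1.0],
    "disease_c": [1.0, 1.0],
}
rel_emb = {"TREATS": [0.0, 1.0]}

ranks = [
    rank_of_true_tail("drug_a", "TREATS", "disease_c", entity_emb, rel_emb),
    rank_of_true_tail("drug_b", "TREATS", "disease_c", entity_emb, rel_emb),
]
print(ranks, hits_at_k(ranks, 1))
```

For simplicity the candidate set includes every entity, the head included; evaluation protocols in practice usually filter out known true triples before ranking.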


Subjects
COVID-19 Drug Treatment , Drug Repositioning , Knowledge Discovery , Algorithms , Antiviral Agents/therapeutic use , COVID-19/virology , Humans , Neural Networks, Computer , SARS-CoV-2/isolation & purification
8.
ArXiv ; 2021 Feb 05.
Article in English | MEDLINE | ID: mdl-33564698

ABSTRACT

OBJECTIVE: To discover candidate drugs to repurpose for COVID-19 using literature-derived knowledge and knowledge graph completion methods. METHODS: We propose a novel, integrative, and neural network-based literature-based discovery (LBD) approach to identify drug candidates from both PubMed and COVID-19-focused research literature. Our approach relies on semantic triples extracted using SemRep (via SemMedDB). We identified an informative subset of semantic triples using filtering rules and an accuracy classifier developed on a BERT variant, and used this subset to construct a knowledge graph. Five state-of-the-art neural knowledge graph completion algorithms were used to predict drug repurposing candidates. The models were trained and assessed using a time slicing approach and the predicted drugs were compared with a list of drugs reported in the literature and evaluated in clinical trials. These models were complemented by a discovery pattern-based approach. RESULTS: The accuracy classifier based on PubMedBERT achieved the best performance (F1 = 0.854) in classifying semantic predications. Among the five knowledge graph completion models, TransE outperformed the others (MR = 0.923, Hits@1 = 0.417). Some known drugs linked to COVID-19 in the literature were identified, as well as some candidate drugs that have not yet been studied. Discovery patterns enabled generation of plausible hypotheses regarding the relationships between the candidate drugs and COVID-19. Among them, five highly ranked and novel drugs (paclitaxel, SB 203580, alpha 2-antiplasmin, pyrrolidine dithiocarbamate, and butylated hydroxytoluene) and their mechanistic explanations are further discussed. CONCLUSION: We show that an LBD approach can be feasible for discovering drug candidates for COVID-19, and for generating mechanistic explanations. Our approach can be generalized to other diseases as well as to other clinical questions.
