Results 1 - 12 of 12
1.
J Biomed Inform ; 141: 104347, 2023 05.
Article in English | MEDLINE | ID: mdl-37030658

ABSTRACT

Automatic extraction of patient medication histories from free-text clinical notes can increase the amount of relevant information available to clinicians for developing treatment plans. In addition to detecting medication events, clinical text mining systems must also be able to predict event context, such as negation, uncertainty, and time of occurrence, in order to construct accurate patient timelines. Towards this goal, we introduce Levitated Context Markers (LCMs), a novel transformer-based model for contextualized event extraction. LCMs are an adaptation of levitated markers (originally developed for relation extraction) that allow pretrained transformer models to utilize global input representations while also focusing on event-related subspans using a sparse attention mechanism. In addition to outperforming a strong baseline model on the Contextualized Medication Event Dataset, we show that LCMs' sparse attention can provide interpretable predictions by detecting relevant context cues in an unsupervised manner.


Subject(s)
Data Mining , Records , Humans , Natural Language Processing
2.
J Biomed Inform ; 131: 104120, 2022 07.
Article in English | MEDLINE | ID: mdl-35709900

ABSTRACT

OBJECTIVE: Develop a novel methodology to create a comprehensive knowledge graph (SuppKG) to represent a domain with limited coverage in the Unified Medical Language System (UMLS), specifically dietary supplement (DS) information for discovering drug-supplement interactions (DSI), by leveraging biomedical natural language processing (NLP) technologies and a DS domain terminology. MATERIALS AND METHODS: We created SemRepDS (an extension of an NLP tool, SemRep), capable of extracting semantic relations from abstracts by leveraging a DS-specific terminology (iDISK) containing 28,884 DS terms not found in the UMLS. PubMed abstracts were processed using SemRepDS to generate semantic relations, which were then filtered using a PubMedBERT model to remove incorrect relations before generating SuppKG. Two discovery pathways were applied to SuppKG to identify potential DSIs, which were then compared with an existing DSI database and also evaluated by medical professionals for mechanistic plausibility. RESULTS: SemRepDS returned 158.5% more DS entities and 206.9% more DS relations than SemRep. The fine-tuned PubMedBERT model, which significantly outperformed other machine learning and BERT models, obtained an F1 score of 0.8605 and removed 43.86% of semantic relations, improving the precision of the relations by 26.4% over pre-filtering. SuppKG consists of 56,635 nodes and 595,222 directed edges with 2,928 DS-specific nodes and 164,738 edges. Manual review of findings identified 182 of 250 (72.8%) proposed DS-Gene-Drug and 77 of 100 (77%) proposed DS-Gene1-Function-Gene2-Drug pathways to be mechanistically plausible. DISCUSSION: With added DS terminology to the UMLS, SemRepDS has the capability to find more DS-specific semantic relationships from PubMed than SemRep. The utility of the resulting SuppKG was demonstrated using discovery patterns to find novel DSIs.
CONCLUSION: For a domain with limited coverage in traditional terminologies (e.g., UMLS), we demonstrated an approach to leverage domain terminology and improve existing NLP tools to generate a more comprehensive knowledge graph for the downstream task. Even though this study focuses on DSI, the method may be adapted to other domains.


Subject(s)
Natural Language Processing , Unified Medical Language System , Dietary Supplements , PubMed , Semantics
3.
AMIA Annu Symp Proc ; 2022: 756-765, 2022.
Article in English | MEDLINE | ID: mdl-37128405

ABSTRACT

Remote patient monitoring (RPM) programs are being increasingly utilized in the care of patients to manage acute and chronic disease, including acute COVID-19. The goal of this study is to explore the topics and patterns of patients' messages to the care team in an RPM program in patients with presumed COVID-19. We conducted a topic analysis of 6,262 comments from 3,248 patients enrolled in the COVID-19 RPM at M Health Fairview. Evaluation of comments was performed using LDA and CorEx topic modeling. Subject matter experts evaluated topic models, including identifying and defining topics and categories. Topics were plotted over time to identify trends in topic weights over the enrollment period. The overall accuracy of comment assignment to topics by the LDA and CorEx models was 72.8% and 88.2%, respectively. Most identified topics focused on signs and symptoms of COVID-19. Topics related to COVID-19 diagnosis demonstrated a correlation with announcements of availability of viral and antibody testing in national and local media.


Subject(s)
COVID-19 , Humans , COVID-19 Testing , Monitoring, Physiologic
4.
J Am Med Inform Assoc ; 27(10): 1547-1555, 2020 10 01.
Article in English | MEDLINE | ID: mdl-32940692

ABSTRACT

OBJECTIVE: We sought to assess the need for additional coverage of dietary supplements (DS) in the Unified Medical Language System (UMLS) by investigating (1) the overlap between the integrated DIetary Supplements Knowledge base (iDISK) DS ingredient terminology and the UMLS and (2) the coverage of iDISK and the UMLS over DS mentions in the biomedical literature. MATERIALS AND METHODS: We estimated the overlap between iDISK and the UMLS by mapping iDISK to the UMLS using exact and normalized strings. The coverage of iDISK and the UMLS over DS mentions in the biomedical literature was evaluated via a DS named-entity recognition (NER) task within PubMed abstracts. RESULTS: The coverage analysis revealed that only 30% of iDISK terms can be matched to the UMLS, although these cover over 99% of iDISK concepts. A manual review revealed that a majority of the unmatched terms represented new synonyms, rather than lexical variants. For NER, iDISK nearly doubles the precision and achieves a higher F1 score than the UMLS, while maintaining a competitive recall. DISCUSSION: While iDISK has significant concept overlap with the UMLS, it contains many novel synonyms. Furthermore, almost 3000 of these overlapping UMLS concepts are missing a DS designation, which could be provided by iDISK. The NER experiments show that the specialization of iDISK is useful for identifying DS mentions. CONCLUSIONS: Our results show that the DS representation in the UMLS could be enriched by adding DS designations to many concepts and by adding new synonyms.


Subject(s)
Dietary Supplements , Knowledge Bases , Terminology as Topic , Unified Medical Language System , Natural Language Processing
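
The exact and normalized string mapping described in the abstract above can be illustrated with a minimal sketch. The `normalize` function here (lowercasing, punctuation removal, whitespace collapsing) is an assumption for illustration only; the abstract does not specify the normalization procedure, and the term lists are made-up examples rather than actual iDISK or UMLS content.

```python
import re


def normalize(term: str) -> str:
    """Illustrative normalization: lowercase, replace punctuation with spaces,
    and collapse whitespace. The study's actual procedure is not given here."""
    lowered = term.lower()
    no_punct = re.sub(r"[^\w\s]", " ", lowered)
    return " ".join(no_punct.split())


def match_terms(source_terms, target_terms):
    """Label each source term as an 'exact' match, a 'normalized' match, or None."""
    target_exact = set(target_terms)
    target_norm = {normalize(t) for t in target_terms}
    result = {}
    for term in source_terms:
        if term in target_exact:
            result[term] = "exact"
        elif normalize(term) in target_norm:
            result[term] = "normalized"
        else:
            result[term] = None
    return result


# Hypothetical term lists, not taken from either terminology.
matches = match_terms(
    ["Fish oil", "St. John's-wort", "Ginseng"],
    ["Fish oil", "St John's Wort"],
)
```

Terms left as `None` would be the candidates for manual review as new synonyms or lexical variants.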
5.
J Am Med Inform Assoc ; 27(4): 539-548, 2020 04 01.
Article in English | MEDLINE | ID: mdl-32068839

ABSTRACT

OBJECTIVE: To build a knowledge base of dietary supplement (DS) information, called the integrated DIetary Supplement Knowledge base (iDISK), which integrates and standardizes DS-related information from 4 existing resources. MATERIALS AND METHODS: iDISK was built through an iterative process comprising 3 phases: 1) establishment of the content scope, 2) development of the data model, and 3) integration of existing resources. Four well-regarded DS resources were integrated into iDISK: The Natural Medicines Comprehensive Database, the "About Herbs" page on the Memorial Sloan Kettering Cancer Center website, the Dietary Supplement Label Database, and the Natural Health Products Database. We evaluated the iDISK build process by manually checking that the data elements associated with 50 randomly selected ingredients were correctly extracted and integrated from their respective sources. RESULTS: iDISK encompasses a terminology of 4208 DS ingredient concepts, which are linked via 6 relationship types to 495 drugs, 776 diseases, 985 symptoms, 605 therapeutic classes, 17 system organ classes, and 137 568 DS products. iDISK also contains 7 concept attribute types and 3 relationship attribute types. Evaluation of the data extraction and integration process showed average errors of 0.3%, 2.6%, and 0.4% for concepts, relationships and attributes, respectively. CONCLUSION: We developed iDISK, a publicly available standardized DS knowledge base that can facilitate more efficient and meaningful dissemination of DS knowledge.


Subject(s)
Dietary Supplements , Knowledge Bases , Vocabulary, Controlled , Databases, Factual , Humans , Product Labeling , RxNorm , Unified Medical Language System
6.
Stud Health Technol Inform ; 264: 323-327, 2019 Aug 21.
Article in English | MEDLINE | ID: mdl-31437938

ABSTRACT

Despite the high consumption of dietary supplements (DS), few reliable, relevant, and comprehensive online resources could satisfy information seekers. This research study aims to understand consumer information needs on DS using topic modeling, and to evaluate accuracy in correctly identifying topics from social media. We retrieved 16,095 unique questions posted on Yahoo! Answers relating to 438 unique DS ingredients mentioned in the sub-section "Alternative medicine" under the section "Health". We implemented an unsupervised topic modeling method, Correlation Explanation (CorEx), to unveil the various topics in which consumers are most interested. We manually reviewed the keywords of all the 200 topics generated by CorEx and assigned them to 38 health-related categories, corresponding to 12 higher-level groups. We found high accuracy (90-100%) in identifying questions that correctly align with the selected topics. The results could guide us to generate a more comprehensive and structured DS resource based on consumers' information needs.


Subject(s)
Social Media , Dietary Supplements
7.
Stud Health Technol Inform ; 264: 408-412, 2019 Aug 21.
Article in English | MEDLINE | ID: mdl-31437955

ABSTRACT

The use of dietary supplements (DSs) is increasing in the U.S. As such, it is crucial for consumers, clinicians, and researchers to be able to find information about DS products. However, labeling regulations allow great variability in DS product names, which makes searching for this information difficult. Following the RxNorm drug name normalization model, we developed a rule-based natural language processing system to normalize DS product names using pattern templates. We evaluated the system on product names extracted from the Dietary Supplement Label Database. Our system generated 136 unique templates and obtained a coverage of 72%, a 32% increase over the existing RxNorm model. Manual review showed that our system achieved a normalization accuracy of 0.86. We found that the normalization of DS product names is feasible, but more work is required to improve the generalizability of the system.


Subject(s)
Dietary Supplements , RxNorm , Databases, Factual , Natural Language Processing
8.
BMC Med Inform Decis Mak ; 19(Suppl 4): 150, 2019 08 08.
Article in English | MEDLINE | ID: mdl-31391091

ABSTRACT

BACKGROUND: Dietary supplements (DSs) are widely used. However, consumers know little about the safety and efficacy of DSs. There is a growing interest in accessing health information online; however, health information, especially online information on DSs, is scattered with varying levels of quality. In our previous work, we prototyped a web application, ALOHA, with interactive graph-based visualization to facilitate consumers' browsing of the integrated DIetary Supplement Knowledge base (iDISK) curated from scientific resources, following an iterative user-centered design (UCD) process. METHODS: Following UCD principles, we carried out two design iterations to enrich the functionalities of ALOHA and enhance its usability. For each iteration, we conducted a usability assessment and design session with a focus group of 8-10 participants and evaluated the usability with a modified System Usability Scale (SUS). Through thematic analysis, we summarized the identified usability issues and conducted a heuristic evaluation to map them to the Gerhardt-Powals' cognitive engineering principles. We derived suggested improvements from each of the usability assessment sessions and enhanced ALOHA accordingly in the next design iteration. RESULTS: The SUS score in the second design iteration decreased to 52.2 ± 11.0 from 63.75 ± 7.2 in our original work, possibly due to the high number of new functionalities we introduced. By refining existing functionalities to make the user interface simpler, the SUS score increased to 64.4 ± 7.2 in the third design iteration. All participants agreed that such an application is urgently needed to address the gaps in how DS information is currently organized and consumed online. Moreover, most participants thought that the graph-based visualization in ALOHA is a creative and visually appealing format to obtain health information.
CONCLUSIONS: In this study, we improved a novel interactive visualization platform, ALOHA, for the general public to obtain DS-related information through two UCD design iterations. The lessons learned from the two design iterations could serve as a guide to further enhance ALOHA and the development of other knowledge graph-based applications. Our study also showed that graph-based interactive visualization is a novel and acceptable approach to end-users who are interested in seeking online health information of various domains.


Subject(s)
Dietary Supplements , Health Knowledge, Attitudes, Practice , Data Display , Focus Groups , Heuristics , Humans , Patient Education as Topic , Pattern Recognition, Automated , Software , User-Computer Interface
9.
AMIA Jt Summits Transl Sci Proc ; 2019: 258-266, 2019.
Article in English | MEDLINE | ID: mdl-31258978

ABSTRACT

Dietary supplement adverse events are potentially severe, yet knowledge regarding the safety of dietary supplements is limited. The CFSAN Adverse Event Reporting System (CAERS) contains records of adverse events attributed to supplements and is potentially useful for dietary supplement pharmacovigilance. This study investigates the feasibility of mining CAERS for dietary supplement adverse events as well as for monitoring the safety of dietary supplement products. Using three online resources, we mapped products in CAERS to their listed ingredients. We then ran four standard signal detection algorithms over the ingredient-adverse event and product-adverse event pairs extracted from CAERS and ranked the detected associations. Comparing 130 signals detected by all four algorithms with a dietary supplement resource, we found evidence for 73 (56%) associations. In addition, some detected product-adverse event signals were consistent with product safety information. We have made a database of the detected adverse events publicly available at https://github.com/zhang-informatics/DDSAE.
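
As an illustration of the kind of disproportionality analysis used in signal detection over spontaneous reports, the sketch below computes the proportional reporting ratio (PRR) from a 2x2 contingency table. This is an assumption for illustration: the abstract above does not name its four algorithms, so PRR may or may not be among them, and the counts are invented.

```python
def proportional_reporting_ratio(a: int, b: int, c: int, d: int) -> float:
    """PRR for one ingredient-event pair.

    a: reports mentioning the ingredient and the event
    b: reports mentioning the ingredient with other events
    c: reports mentioning other ingredients and the event
    d: reports mentioning other ingredients with other events
    """
    if a + b == 0 or c + d == 0 or c == 0:
        raise ValueError("PRR is undefined for this table")
    rate_with_ingredient = a / (a + b)
    rate_without_ingredient = c / (c + d)
    return rate_with_ingredient / rate_without_ingredient


# Invented counts for a single hypothetical ingredient-event pair.
prr = proportional_reporting_ratio(a=10, b=90, c=20, d=880)
```

A PRR well above 1 suggests the event is reported disproportionately often with the ingredient, which is why such pairs are ranked and then checked against an external resource.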

10.
AMIA Jt Summits Transl Sci Proc ; 2017: 207-216, 2018.
Article in English | MEDLINE | ID: mdl-29888074

ABSTRACT

Dietary supplements, often considered as food, are widely consumed despite limited knowledge around their safety/efficacy and the lack of well-established regulatory policies, unlike their drug counterparts. Informatics methods may be useful in filling this knowledge gap; however, the lack of standardized representation of DS hinders this progress. In this pilot study, five electronic DS resources, i.e., NM, DSID & NHPID (ingredient level) and DSLD & LNHPD (product level), were evaluated and compared both quantitatively and qualitatively in four phases. Essential data elements needed for comprehensive DS representation were compiled based on LanguaL code (food) & AHFSA (drugs) guidelines and employed as a check-list. We further investigated the completeness of DS representation by incorporating Ginseng and Fish oil as examples. We found fragmented and inconsistent distribution of DS representation in terms of essential data elements across five resources. This study provides a preliminary platform for development of standardized DS terminology/ontology model.

11.
Article in English | MEDLINE | ID: mdl-31667004

ABSTRACT

Dietary supplements (DS) are widely consumed. However, most people have limited knowledge about the safety and efficacy of DS. Even though there exists the well-curated integrated DIetary Supplement Knowledge base (iDISK) with a formal knowledge representation, it lacks a user-friendly interface for general consumers to query and retrieve DS information relevant to their needs. Following user-centered design principles, we prototyped a web application, ALOHA (i.e., dietAry suppLement knOwledge grapH visuAlization), with interactive graph-based visualization to facilitate consumers' browsing of iDISK. We conducted a usability inspection and design session with a focus group and evaluated the usability of the prototype with a modified System Usability Scale (SUS). The SUS result was marginal (63.75 ± 7.2 with 1 outlier removed). Nevertheless, all participants agreed that such an application is urgently needed to address the gaps in how DS information (and health information in general) is currently organized and consumed online. This feedback is valuable for informing the next iteration of ALOHA.
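
For readers unfamiliar with how a SUS value such as 63.75 arises, the sketch below shows the scoring of the standard ten-item System Usability Scale. Note the assumption: the abstract above reports a modified SUS whose changes are not described, so this covers only the standard questionnaire, and the responses are invented.

```python
def sus_score(responses):
    """Score one participant's standard 10-item SUS questionnaire.

    responses: ten integers from 1 (strongly disagree) to 5 (strongly agree),
    in questionnaire order. Odd-numbered items are positively worded and
    even-numbered items negatively worded.
    """
    if len(responses) != 10:
        raise ValueError("standard SUS has exactly 10 items")
    total = 0
    for item_number, rating in enumerate(responses, start=1):
        if not 1 <= rating <= 5:
            raise ValueError("ratings must be between 1 and 5")
        if item_number % 2 == 1:
            total += rating - 1   # positively worded item
        else:
            total += 5 - rating   # negatively worded item
    return total * 2.5            # rescale the 0-40 sum to 0-100


# Invented responses: a fully neutral participant and a fully positive one.
neutral = sus_score([3] * 10)
best = sus_score([5, 1] * 5)
```

A reported group result is then the mean and standard deviation of these per-participant scores.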

12.
JAMIA Open ; 1(2): 275-282, 2018 Oct.
Article in English | MEDLINE | ID: mdl-30740594

ABSTRACT

OBJECTIVES: This study evaluated and compared a variety of active learning strategies, including a novel strategy we proposed, as applied to the task of filtering incorrect semantic predications in SemMedDB. MATERIALS AND METHODS: We evaluated 8 active learning strategies covering 3 types (uncertainty, representative, and combined) on 2 datasets of 6,000 total semantic predications from SemMedDB covering the domains of substance interactions and clinical medicine, respectively. We also designed a novel combined strategy called dynamic β that does not use hand-tuned hyperparameters. Each strategy was assessed by the Area under the Learning Curve (ALC) and the number of training examples required to achieve a target Area Under the ROC curve. We also visualized and compared the query patterns of the query strategies. RESULTS: All types of active learning (AL) methods beat the baseline on both datasets. Combined strategies outperformed all other methods in terms of ALC, outperforming the baseline by over 0.05 ALC for both datasets and reducing annotation effort by 58% in the best case. While representative strategies performed well, their performance was matched or outperformed by the combined methods. Our proposed AL method dynamic β shows promising ability to achieve near-optimal performance across 2 datasets. DISCUSSION: Our visual analysis of query patterns indicates that strategies which efficiently obtain a representative subsample perform better on this task. CONCLUSION: Active learning is shown to be effective at reducing annotation costs for filtering incorrect semantic predications from SemMedDB. Our proposed AL method demonstrated promising performance.
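
To make the uncertainty type of strategy concrete, here is a minimal sketch of generic uncertainty sampling for a binary classifier: the next items to annotate are those whose predicted probability is closest to 0.5. This is a textbook formulation chosen for illustration, not one of the 8 strategies from the study and not the dynamic β method; the function name and probabilities are hypothetical.

```python
def least_confident(probabilities, k=1):
    """Return the indices of the k unlabeled items whose predicted
    positive-class probability is closest to 0.5 (most uncertain first)."""
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: abs(probabilities[i] - 0.5),
    )
    return ranked[:k]


# Hypothetical model outputs for four unlabeled predications.
to_annotate = least_confident([0.95, 0.52, 0.10, 0.45], k=2)
```

Representative strategies instead favor items typical of the unlabeled pool, and combined strategies weigh both criteria.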
