ABSTRACT
Concept extraction is an important step in clinical natural language processing. Once extracted, concepts can improve the accuracy and generalization of downstream systems. We present a new unsupervised system for the extraction of concepts from clinical text. The system creates representations of concepts from the Unified Medical Language System (UMLS®) by combining natural language descriptions of concepts with word representations, and composing these into higher-order concept vectors. These concept vectors are then used to assign labels to candidate phrases, which are extracted using a syntactic chunker. Our approach achieves an exact F-score of .32 and an inexact F-score of .45 on the well-known i2b2-2010 challenge corpus, outperforming the only other unsupervised concept extraction method. As our approach relies only on word representations and a chunker, it is completely unsupervised. As such, it can be applied to languages and corpora for which we do not have prior annotations. All our code is open-source and can be found at www.github.com/clips/conch.
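The pipeline described in this abstract (compose word vectors into concept vectors, then label candidate phrases by similarity) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the toy word vectors, the stand-in concept descriptions, averaging as the composition function, cosine similarity as the scoring function, and the `threshold` parameter. None of it is taken from the released system at www.github.com/clips/conch.

```python
import math

# Toy word vectors (hypothetical values); a real system would load trained embeddings.
WORD_VECTORS = {
    "chest": [1.0, 0.0, 0.0],
    "pain": [0.9, 0.1, 0.0],
    "aspirin": [0.0, 1.0, 0.0],
    "tablet": [0.0, 0.9, 0.1],
    "x-ray": [0.0, 0.0, 1.0],
    "scan": [0.1, 0.0, 0.9],
}

# Hypothetical stand-ins for UMLS concept descriptions, grouped by label.
CONCEPT_DESCRIPTIONS = {
    "problem": ["chest pain"],
    "treatment": ["aspirin tablet"],
    "test": ["x-ray scan"],
}


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def phrase_vector(phrase):
    """Compose a phrase vector by averaging the vectors of its known words."""
    known = [WORD_VECTORS[w] for w in phrase.lower().split() if w in WORD_VECTORS]
    return mean_vector(known) if known else None


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_concept_vectors(descriptions):
    """Compose one vector per label from the vectors of its descriptions."""
    concept_vectors = {}
    for label, phrases in descriptions.items():
        vectors = [v for v in (phrase_vector(p) for p in phrases) if v is not None]
        if vectors:
            concept_vectors[label] = mean_vector(vectors)
    return concept_vectors


def label_phrase(phrase, concept_vectors, threshold=0.5):
    """Return the most similar label, or None if nothing clears the threshold."""
    vector = phrase_vector(phrase)
    if vector is None or not concept_vectors:
        return None
    scored = [(label, cosine(vector, cv)) for label, cv in concept_vectors.items()]
    best_label, best_score = max(scored, key=lambda pair: pair[1])
    return best_label if best_score >= threshold else None


CONCEPT_VECTORS = build_concept_vectors(CONCEPT_DESCRIPTIONS)
```

In this sketch the candidate phrases are passed in as plain strings; in the system described above they would come from a syntactic chunker. Phrases with no known words are left unlabeled rather than forced into a class.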
Subject(s)
Semantics, Unified Medical Language System, Unsupervised Machine Learning
ABSTRACT
This article examines the Twitter and Facebook uptake of health messages from an infotainment TV show on food, broadcast on Belgium's Dutch-language public broadcaster. Both the interest in and the amount of health-related media coverage are rising; this coverage is an important source of information for laypeople and affects their health behaviours and therapy compliance. However, the role of the audience has also changed: consumers of media content are increasingly produsers and, in the case of health, expert consumers. To explore how current audiences react to health claims, we conducted a quantitative and qualitative content analysis of Twitter and Facebook reactions to an infotainment show about food and nutrition. We examine (1) which elements of the show the audience reacts to, to gain insight into the traction the nutrition-related content generates, and (2) whether audience members accept or resist the health information in the show. Our findings show that information on health and production elicits the most reactions, and that health information incites frequent refutation, low acceptance, and many suggestions of new information or new angles to complement the show's information.
Subject(s)
Consumer Health Information, Health Behavior, Information Dissemination, Social Media, Belgium, Food, Humans
ABSTRACT
The CEGS N-GRID 2016 Shared Task (Filannino et al., 2017) in Clinical Natural Language Processing introduces the assignment of a severity score to a psychiatric symptom, based on a psychiatric intake report. We present a method that exploits the inherent interview-like structure of the report to extract relevant information and generate a representation. The representation consists of a restricted set of psychiatric concepts (and the context they occur in), identified using medical concepts defined in UMLS that are directly related to the psychiatric diagnoses present in the Diagnostic and Statistical Manual of Mental Disorders, 4th Edition (DSM-IV) ontology. Random Forests generalize over the extracted, case-specific features in our representation. The best variant presented here scored an inverse mean absolute error (MAE) of 80.64%. A concise concept-based representation, paired with identification of concept certainty and scope (family, patient), shows robust performance on the task.
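The abstract reports an inverse MAE as a percentage but does not define it. The sketch below shows one plausible reading, which is an assumption rather than the official scorer: each absolute error is divided by the largest error possible for its gold score, the normalized errors are macro-averaged over gold classes, and the result is reported as 100 × (1 − error). The 0 to 3 severity scale and the function name `inverse_normalized_mae` are likewise assumptions made for illustration.

```python
def inverse_normalized_mae(gold, predicted, min_score=0, max_score=3):
    """Return 100 * (1 - macro-averaged normalized MAE).

    Assumes integer severity scores in [min_score, max_score]; the default
    0-3 range is an assumption, not stated in the abstract.
    """
    if not gold or len(gold) != len(predicted):
        raise ValueError("gold and predicted must be non-empty and equally long")

    # Collect normalized errors per gold class so that rare classes weigh
    # as much as frequent ones (macro-averaging).
    errors_by_class = {}
    for g, p in zip(gold, predicted):
        # Largest possible error for this gold score on the scale.
        worst = max(g - min_score, max_score - g)
        errors_by_class.setdefault(g, []).append(abs(g - p) / worst)

    class_means = [sum(errs) / len(errs) for errs in errors_by_class.values()]
    return 100.0 * (1.0 - sum(class_means) / len(class_means))
```

Under this reading, a perfect prediction scores 100 and a maximally wrong prediction on every report scores 0, so higher is better, in line with how the abstract reports its result.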