Search | VHL Regional Portal

Epitweetr: Early warning of public health threats using Twitter data.

Espinosa, Laura; Wijermans, Ariana; Orchard, Francisco; Höhle, Michael; Czernichow, Thomas; Coletti, Pietro; Hermans, Lisa; Faes, Christel; Kissling, Esther; Mollet, Thomas.

Euro Surveill ; 27(39)2022 09.

Article in English | MEDLINE | ID: mdl-36177867

ABSTRACT

BackgroundThe European Centre for Disease Prevention and Control (ECDC) systematically collates information from sources to rapidly detect early public health threats. The lack of a freely available, customisable and automated early warning tool using data from Twitter prompted the ECDC to develop epitweetr, which collects, geolocates and aggregates tweets generating signals and email alerts.AimThis study aims to compare the performance of epitweetr to manually monitoring tweets for the purpose of early detecting public health threats.MethodsWe calculated the general and specific positive predictive value (PPV) of signals generated by epitweetr between 19 October and 30 November 2020. Sensitivity, specificity, timeliness and accuracy and performance of tweet geolocation and signal detection algorithms obtained from epitweetr and the manual monitoring of 1,200 tweets were compared.ResultsThe epitweetr geolocation algorithm had an accuracy of 30.1% at national, and 25.9% at subnational levels. The signal detection algorithm had 3.0% general PPV and 74.6% specific PPV. Compared to manual monitoring, epitweetr had greater sensitivity (47.9% and 78.6%, respectively), and reduced PPV (97.9% and 74.6%, respectively). Median validation time difference between 16 common events detected by epitweetr and manual monitoring was -48.6 hours (IQR: -102.8 to -23.7).ConclusionEpitweetr has shown sufficient performance as an early warning tool for public health threats using Twitter data. Since epitweetr is a free, open-source tool with configurable settings and a strong automated component, it is expected to increase in usability and usefulness to public health experts.

Subject(s)

Public Health , Social Media , Algorithms , Data Collection , Humans

Extraction of Explicit and Implicit Cause-Effect Relationships in Patient-Reported Diabetes-Related Tweets From 2017 to 2021: Deep Learning Approach.

Ahne, Adrian; Khetan, Vivek; Tannier, Xavier; Rizvi, Md Imbesat Hassan; Czernichow, Thomas; Orchard, Francisco; Bour, Charline; Fano, Andrew; Fagherazzi, Guy.

JMIR Med Inform ; 10(7): e37201, 2022 Jul 19.

Article in English | MEDLINE | ID: mdl-35852829

ABSTRACT

BACKGROUND: Intervening in and preventing diabetes distress requires an understanding of its causes and, in particular, from a patient's perspective. Social media data provide direct access to how patients see and understand their disease and consequently show the causes of diabetes distress. OBJECTIVE: Leveraging machine learning methods, we aim to extract both explicit and implicit cause-effect relationships in patient-reported diabetes-related tweets and provide a methodology to better understand the opinions, feelings, and observations shared within the diabetes online community from a causality perspective. METHODS: More than 30 million diabetes-related tweets in English were collected between April 2017 and January 2021. Deep learning and natural language processing methods were applied to focus on tweets with personal and emotional content. A cause-effect tweet data set was manually labeled and used to train (1) a fine-tuned BERTweet model to detect causal sentences containing a causal relation and (2) a conditional random field model with Bidirectional Encoder Representations from Transformers (BERT)-based features to extract possible cause-effect associations. Causes and effects were clustered in a semisupervised approach and visualized in an interactive cause-effect network. RESULTS: Causal sentences were detected with a recall of 68% in an imbalanced data set. A conditional random field model with BERT-based features outperformed a fine-tuned BERT model for cause-effect detection with a macro recall of 68%. This led to 96,676 sentences with cause-effect relationships. "Diabetes" was identified as the central cluster followed by "death" and "insulin." Insulin pricing-related causes were frequently associated with death. CONCLUSIONS: A novel methodology was developed to detect causal sentences and identify both explicit and implicit, single and multiword cause, and the corresponding effect, as expressed in diabetes-related tweets leveraging BERT-based architectures and visualized as cause-effect network. Extracting causal associations in real life, patient-reported outcomes in social media data provide a useful complementary source of information in diabetes research.

Improving Diabetes-Related Biomedical Literature Exploration in the Clinical Decision-making Process via Interactive Classification and Topic Discovery: Methodology Development Study.

Ahne, Adrian; Fagherazzi, Guy; Tannier, Xavier; Czernichow, Thomas; Orchard, Francisco.

J Med Internet Res ; 24(1): e27434, 2022 01 18.

Article in English | MEDLINE | ID: mdl-35040795

ABSTRACT

BACKGROUND: The amount of available textual health data such as scientific and biomedical literature is constantly growing and becoming more and more challenging for health professionals to properly summarize those data and practice evidence-based clinical decision making. Moreover, the exploration of unstructured health text data is challenging for professionals without computer science knowledge due to limited time, resources, and skills. Current tools to explore text data lack ease of use, require high computational efforts, and incorporate domain knowledge and focus on topics of interest with difficulty. OBJECTIVE: We developed a methodology able to explore and target topics of interest via an interactive user interface for health professionals with limited computer science knowledge. We aim to reach near state-of-the-art performance while reducing memory consumption, increasing scalability, and minimizing user interaction effort to improve the clinical decision-making process. The performance was evaluated on diabetes-related abstracts from PubMed. METHODS: The methodology consists of 4 parts: (1) a novel interpretable hierarchical clustering of documents where each node is defined by headwords (words that best represent the documents in the node), (2) an efficient classification system to target topics, (3) minimized user interaction effort through active learning, and (4) a visual user interface. We evaluated our approach on 50,911 diabetes-related abstracts providing a hierarchical Medical Subject Headings (MeSH) structure, a unique identifier for a topic. Hierarchical clustering performance was compared against the implementation in the machine learning library scikit-learn. On a subset of 2000 randomly chosen diabetes abstracts, our active learning strategy was compared against 3 other strategies: random selection of training instances, uncertainty sampling that chooses instances about which the model is most uncertain, and an expected gradient length strategy based on convolutional neural networks (CNNs). RESULTS: For the hierarchical clustering performance, we achieved an F1 score of 0.73 compared to 0.76 achieved by scikit-learn. Concerning active learning performance, after 200 chosen training samples based on these strategies, the weighted F1 score of all MeSH codes resulted in a satisfying 0.62 F1 score using our approach, 0.61 using the uncertainty strategy, 0.63 using the CNN, and 0.45 using the random strategy. Moreover, our methodology showed a constant low memory use with increased number of documents. CONCLUSIONS: We proposed an easy-to-use tool for health professionals with limited computer science knowledge who combine their domain knowledge with topic exploration and target specific topics of interest while improving transparency. Furthermore, our approach is memory efficient and highly parallelizable, making it interesting for large Big Data sets. This approach can be used by health professionals to gain deep insights into biomedical literature to ultimately improve the evidence-based clinical decision making process.

Subject(s)

Diabetes Mellitus , Medical Subject Headings , Clinical Decision-Making , Diabetes Mellitus/therapy , Humans , Neural Networks, Computer , PubMed

Insulin pricing and other major diabetes-related concerns in the USA: a study of 46 407 tweets between 2017 and 2019.

Ahne, Adrian; Orchard, Francisco; Tannier, Xavier; Perchoux, Camille; Balkau, Beverley; Pagoto, Sherry; Harding, Jessica Lee; Czernichow, Thomas; Fagherazzi, Guy.

BMJ Open Diabetes Res Care ; 8(1)2020 06.

Article in English | MEDLINE | ID: mdl-32503810

ABSTRACT

INTRODUCTION: Little research has been done to systematically evaluate concerns of people living with diabetes through social media, which has been a powerful tool for social change and to better understand perceptions around health-related issues. This study aims to identify key diabetes-related concerns in the USA and primary emotions associated with those concerns using information shared on Twitter. RESEARCH DESIGN AND METHODS: A total of 11.7 million diabetes-related tweets in English were collected between April 2017 and July 2019. Machine learning methods were used to filter tweets with personal content, to geolocate (to the USA) and to identify clusters of tweets with emotional elements. A sentiment analysis was then applied to each cluster. RESULTS: We identified 46 407 tweets with emotional elements in the USA from which 30 clusters were identified; 5 clusters (18% of tweets) were related to insulin pricing with both positive emotions (joy, love) referring to advocacy for affordable insulin and sadness emotions related to the frustration of insulin prices, 5 clusters (12% of tweets) to solidarity and support with a majority of joy and love emotions expressed. The most negative topics (10% of tweets) were related to diabetes distress (24% sadness, 27% anger, 21% fear elements), to diabetic and insulin shock (45% anger, 46% fear) and comorbidities (40% sadness). CONCLUSIONS: Using social media data, we have been able to describe key diabetes-related concerns and their associated emotions. More specifically, we were able to highlight the real-world concerns of insulin pricing and its negative impact on mood. Using such data can be a useful addition to current measures that inform public decision making around topics of concern and burden among people with diabetes.

Subject(s)

Diabetes Mellitus , Social Media , Costs and Cost Analysis , Diabetes Mellitus/drug therapy , Diabetes Mellitus/epidemiology , Emotions , Humans , Insulin , United States/epidemiology

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

SEND TO:

SELECTION OF CITATIONS

SEARCH DETAIL