Topic2Labels: A framework to annotate and classify the social media data through LDA topics and deep learning models for crisis response
Expert Systems with Applications
116562, 2022.
Article
in English
| ScienceDirect | ID: covidwho-1668841
ABSTRACT
The widespread use of social media affects every aspect of life, including crisis management. Disaster management needs real-time data that can be fed into machine learning and deep learning models to aid decision making. Most newly generated social media data is unstructured and unlabeled. Current text classification models based on supervised deep learning rely heavily on human-labeled data, which in the context of disasters is very small in size and imbalanced, ultimately affecting how well the models generalize. In this study, we propose the Topic2Labels (T2L) framework, which provides an automated way of labeling data through the LDA (latent Dirichlet allocation) topic modelling approach and uses BERT (Bidirectional Encoder Representations from Transformers) embeddings to construct the feature vectors employed to classify the data contextually. Our framework consists of three layers. In the first layer, we adopt LDA to generate topics from the data, develop a new algorithm to rank the topics, and map the highest-ranked dominant topic to a label to annotate the data. In the second layer, we transform the labeled text into a feature representation through BERT embeddings. In the third layer, we leverage deep learning models as classifiers to assign the textual data to multiple categories. Experimental results on crisis-related datasets show that our framework achieves better classification performance than the baseline approaches.
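The labeling step of the first layer can be illustrated with a minimal sketch. The abstract does not specify the topic-ranking algorithm, so this example substitutes a plain argmax over each document's topic probabilities. The function names, the topic distributions, and the topic-to-label names below are all hypothetical and serve only as illustration.

```python
from typing import Dict, List, Sequence


def dominant_topic(topic_probs: Sequence[float]) -> int:
    """Return the index of the highest-probability topic for one document.

    Assumption: a simple argmax stands in for the paper's ranking
    algorithm, which the abstract does not describe.
    """
    if not topic_probs:
        raise ValueError("topic_probs must not be empty")
    return max(range(len(topic_probs)), key=lambda k: topic_probs[k])


def annotate(
    doc_topic: Sequence[Sequence[float]],
    topic_names: Dict[int, str],
) -> List[str]:
    """Map each document's dominant topic to a human-readable label."""
    return [topic_names[dominant_topic(row)] for row in doc_topic]


# Hypothetical per-document topic distributions, as an LDA model might emit.
doc_topic = [
    [0.70, 0.20, 0.10],
    [0.05, 0.15, 0.80],
    [0.30, 0.60, 0.10],
]

# Hypothetical label names assigned to the three topics.
topic_names = {
    0: "infrastructure_damage",
    1: "donation_request",
    2: "casualty_report",
}

labels = annotate(doc_topic, topic_names)
```

In a full pipeline, the resulting labels would be paired with the original texts and passed to the embedding and classification layers.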
Full text:
Available
Collection:
Databases of international organizations
Database:
ScienceDirect
Language:
English
Journal:
Expert Systems with Applications
Year:
2022
Document Type:
Article