Learning Domain-Specific Word Embeddings from COVID-19 Tweets

Aigbe, S. A.; Eick, C.

2021 IEEE International Conference on Big Data, Big Data 2021: 4307-4312, 2021.

Article in English | Scopus | ID: covidwho-1730888

ABSTRACT

The COVID-19 global pandemic has been a major catastrophic event that impacted the world's economy. During the pandemic, people increasingly used social media such as Twitter to express their reactions and responses to it. This drove researchers to analyze these micro-blogging texts with natural language processing (NLP) methods to understand the information inherent in them. Most of these NLP tasks use word embeddings in training neural network models. These word embeddings are mainly trained on general text corpora, which leads to sub-optimal performance when they are used in domain-specific NLP tasks such as those involving COVID-19 related tweets. In this paper, we present learned domain-specific word embeddings for COVID-19 tweets, intended for NLP tasks on COVID-19 related tweets. Our evaluation results show that our domain-specific COVID-19 tweet word embeddings perform better than pretrained general word embeddings in a downstream domain-specific NLP task. Our COVID-19 tweet word embeddings are available to researchers who wish to perform downstream NLP tasks with pretrained domain-specific embeddings. © 2021 IEEE.
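
As a rough illustration of what learning embeddings from an in-domain corpus involves, the sketch below builds word vectors directly from a handful of tweets. The abstract does not name the training algorithm, so this uses simple count-based co-occurrence vectors as a stand-in for neural embeddings. The example tweets are invented, and the names `tokenize`, `train_cooccurrence_embeddings`, and `cosine` are hypothetical helpers, not part of the authors' release.

```python
import math
import re
from collections import Counter, defaultdict

# Toy stand-in corpus: these tweets are invented for illustration only.
TWEETS = [
    "Stay home and wear a mask to slow the covid19 spread",
    "Got my vaccine today, please wear a mask and stay safe",
    "covid19 lockdown extended, stay home and stay safe",
    "The vaccine rollout should slow the covid19 spread",
]


def tokenize(tweet):
    """Lowercase a tweet and keep alphanumeric tokens (a hashtag loses its '#')."""
    return re.findall(r"[a-z0-9]+", tweet.lower())


def train_cooccurrence_embeddings(tweets, window=2):
    """Map each word to a sparse vector counting neighbors within `window` tokens.

    Assumption: a count-based stand-in, not the method used in the paper.
    """
    vectors = defaultdict(Counter)
    for tweet in tweets:
        tokens = tokenize(tweet)
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            stop = min(len(tokens), i + window + 1)
            for j in range(start, stop):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return dict(vectors)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[key] for key, count in u.items() if key in v)
    norm_u = math.sqrt(sum(count * count for count in u.values()))
    norm_v = math.sqrt(sum(count * count for count in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


embeddings = train_cooccurrence_embeddings(TWEETS)
# Words that share contexts in the domain corpus receive similar vectors.
print(cosine(embeddings["mask"], embeddings["vaccine"]))
```

In this toy setup, a domain token such as `covid19` gets a vector shaped only by how it is used in the tweets, which is the intuition behind preferring in-domain embeddings over ones trained on general text.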

Keywords

COVID-19; Domain-Specific; Tweets; Word Embeddings; Embeddings; Social networking (online); Catastrophic event; Domain specific; Down-stream; Micro blogging; Social media; Tweet; Word embedding; World economy; Natural language processing systems

Full text: Available | Collection: Databases of international organizations | Database: Scopus | Language: English | Journal: 2021 IEEE International Conference on Big Data, Big Data 2021 | Year: 2021 | Document Type: Article
