ABSTRACT
Messaging platforms like WhatsApp are among the largest contributors to the spread of COVID-19 health misinformation, but they also play a critical role in disseminating credible information and reaching populations at scale. This study explores the relationships between verification behaviours and the intention to share information among users who report high trust in their personal network and users who report high trust in authoritative sources. The study was conducted as a survey delivered through WhatsApp to users of the WHO HealthAlert chatbot service. A theoretical model adapted from news verification behaviours was used to determine the correlations between the constructs. Thanks to an excellent response rate, 5,477 usable responses were obtained, allowing the adapted research model to be tested by means of a Structural Equation Model (SEM) using the partial least squares algorithm in SmartPLS4. The findings show significant correlations between the constructs and suggest that participants who reported high levels of trust in authoritative sources are less likely to share information because they engage in more verification behaviour. © 2023 IEEE.
ABSTRACT
On social media, misinformation can spread quickly, posing serious problems. Understanding the content and sensitive nature of fake news and misinformation is critical to preventing the damage they cause. To this end, the characteristics of information must first be discerned. In this paper, we propose a transformer-based hybrid ensemble model to detect misinformation on the Internet. First, false and true news items about COVID-19 were analyzed, and various text classification tasks were performed to understand their content. The results were utilized in the proposed hybrid ensemble learning model. Our analysis revealed promising results, establishing the capability of the proposed system to detect misinformation on social media. The final model exhibited an excellent F1 score (0.98) and accuracy (0.97). The AUC (Area Under the Curve) score was also high at 0.98, and the ROC (Receiver Operating Characteristic) curve revealed that the true-positive rate of the data was close to one in this model. Thus, the proposed hybrid model was demonstrated to be successful in recognizing false information online. © 2022 IEEE.
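A hybrid ensemble of transformer classifiers is commonly aggregated by soft voting, i.e., averaging each base model's class probabilities and picking the class with the highest mean. A minimal sketch with placeholder probabilities; the base models and the aggregation rule here are illustrative assumptions, not details confirmed by the abstract:

```python
# Soft-voting ensemble: average per-class probabilities from several
# base classifiers, then pick the class with the highest mean score.
def soft_vote(prob_lists):
    """prob_lists: one [p_true, p_fake] pair per base model."""
    n_classes = len(prob_lists[0])
    means = [sum(p[c] for p in prob_lists) / len(prob_lists)
             for c in range(n_classes)]
    return max(range(n_classes), key=means.__getitem__), means

# Hypothetical outputs of three fine-tuned transformer models
# for a single news item (class 0 = true, class 1 = fake).
model_probs = [[0.2, 0.8], [0.4, 0.6], [0.1, 0.9]]
label, scores = soft_vote(model_probs)   # label 1: flagged as fake
```

Hard (majority) voting is the other common choice; soft voting keeps the models' confidence information, which usually helps when the base classifiers are well calibrated.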
ABSTRACT
Online misinformation has become a major concern in recent years, and it has been further emphasized during the COVID-19 pandemic. Social media platforms, such as Twitter, can be serious vectors of misinformation online. In order to better understand the spread of this fake news, these lies, deceptions, and rumours, we analyze the correlations between the following textual features in tweets: emotion, sentiment, political bias, stance, veracity, and conspiracy theories. We train several transformer-based classifiers on multiple datasets to detect these textual features and identify potential correlations using conditional distributions of the labels. Our results show that the online discourse regarding some topics, such as COVID-19 regulations or conspiracy theories, is highly controversial and reflects the actual U.S. political landscape. © 2023 ACM.
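The conditional distribution of one predicted label given another can be estimated directly from paired classifier outputs by normalized counting. A minimal sketch with hypothetical (sentiment, stance) pairs; the label sets are illustrative, not the paper's exact taxonomy:

```python
from collections import Counter, defaultdict

def conditional_distribution(pairs):
    """Estimate P(feature_b | feature_a) from paired label predictions."""
    counts = defaultdict(Counter)
    for a, b in pairs:
        counts[a][b] += 1
    return {a: {b: n / sum(c.values()) for b, n in c.items()}
            for a, c in counts.items()}

# Hypothetical (sentiment, stance) predictions for a handful of tweets.
pairs = [("negative", "against"), ("negative", "against"),
         ("negative", "favor"), ("positive", "favor")]
dist = conditional_distribution(pairs)
# dist["negative"] gives the stance distribution among negative tweets.
```

Comparing these conditional distributions against the marginal distribution of each label is one simple way to surface the correlations the study describes.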
ABSTRACT
Online news and information sources are convenient and accessible ways to learn about current issues. For instance, more than 300 million people engage with posts on Twitter globally, which also creates opportunities to disseminate misleading information. There are numerous cases where violent crimes have been committed due to fake news. This research presents the CovidMis20 dataset (COVID-19 Misinformation 2020 dataset), which consists of 1,375,592 tweets collected from February to July 2020. CovidMis20 can be automatically updated to fetch the latest news and is publicly available at: https://github.com/everythingguy/CovidMis20. This research was conducted using Bi-LSTM deep learning and an ensemble CNN+Bi-GRU for fake news detection. The results showed that the ensemble CNN+Bi-GRU model consistently provided higher accuracy than the Bi-LSTM model, with testing accuracies of 92.23% and 90.56%, respectively. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
ABSTRACT
Social media has become a source of information for many people because of its freedom of use. As a result, fake news spreads quickly and easily, regardless of its credibility, especially over the past decade. The vast amount of information being shared includes fraudulent content that negatively affects readers' cognitive abilities and mental health. In this study, we introduce a new Arabic dataset of fake news related to COVID-19, collected from Twitter and Facebook. We then applied two pre-trained classification models, AraBERT and BERT base Arabic. AraBERT obtained better accuracy than BERT base Arabic on both datasets. © 2022 IEEE.
ABSTRACT
While COVID-19 text misinformation has already been investigated by various scholars, fewer research efforts have been devoted to characterizing and understanding COVID-19 misinformation that is carried out through visuals like photographs and memes. In this paper, we present a mixed-method analysis of image-based COVID-19 misinformation on Twitter in 2020. We deploy a computational pipeline to identify COVID-19 related tweets, download the images contained in them, and group together visually similar images. We then develop a codebook to characterize COVID-19 misinformation and manually label images as misinformation or not. Finally, we perform a quantitative analysis of tweets containing COVID-19 misinformation images. We identify five types of COVID-19 misinformation, from a wrong understanding of the threat severity of COVID-19 to the promotion of fake cures and conspiracy theories. We also find that tweets containing COVID-19 misinformation images do not receive more interactions than baseline tweets with random images posted by the same set of users. As for temporal properties, COVID-19 misinformation images are shared for longer periods of time than non-misinformation ones and have longer burst times; note that this analysis compares against non-misinformation images rather than random images, so it is not a direct comparison with the engagement analysis. Looking at the users sharing COVID-19 misinformation images on Twitter from the perspective of their political leanings, we find that pro-Democrat and pro-Republican users share a similar number of tweets containing misleading or false COVID-19 images. However, the types of images that they share are different: while pro-Democrat users focus on misleading claims about the Trump administration's response to the pandemic, as well as often sharing manipulated images intended as satire, pro-Republican users often promote hydroxychloroquine, an ineffective medicine against COVID-19, as well as conspiracy theories about the origin of the virus.
Our analysis sets a basis for better understanding COVID-19 misinformation images on social media and the nuances of effectively moderating them. © 2023 ACM.
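Grouping visually similar images, as the pipeline above requires, is often done with perceptual hashing. A minimal sketch of a difference hash (dHash) over a tiny grayscale pixel grid; this is a generic illustration, not necessarily the paper's method, and real pipelines first resize images to a fixed size (e.g., 9x8) with an image library, which is omitted here:

```python
def dhash(pixels):
    """Difference hash over a grayscale pixel grid (rows of ints).
    Each bit records whether a pixel is brighter than its right
    neighbour; visually similar images yield hashes with a small
    Hamming distance."""
    bits = []
    for row in pixels:
        for left, right in zip(row, row[1:]):
            bits.append(1 if left > right else 0)
    return bits

def hamming(h1, h2):
    """Number of differing bits between two hashes."""
    return sum(a != b for a, b in zip(h1, h2))

# Two tiny, nearly identical "images" (hypothetical pixel values).
img_a = [[10, 20, 30], [90, 50, 40]]
img_b = [[12, 22, 28], [88, 52, 41]]
d = hamming(dhash(img_a), dhash(img_b))   # small distance: same group
```

Clustering then amounts to putting images whose hash distance falls below a threshold into the same visual group.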
ABSTRACT
Nowadays, the usage of social media platforms is rapidly increasing, and rumours and false information are also rising, especially among Arab nations. This false information is harmful to society and individuals, so blocking and detecting the spread of fake news in Arabic becomes critical. Several artificial intelligence (AI) methods, including contemporary transformer techniques such as BERT, have been used to detect fake news; thus, fake news in Arabic is identified by utilizing AI approaches. This article develops a new hunter-prey optimization with hybrid deep learning-based fake news detection (HPOHDL-FND) model on an Arabic corpus. The HPOHDL-FND technique undergoes extensive data pre-processing steps to transform the input data into a useful format. In addition, the HPOHDL-FND technique utilizes a long short-term memory recurrent neural network (LSTM-RNN) model for fake news detection and classification. Finally, the hunter-prey optimization (HPO) algorithm is exploited for optimal tuning of the hyperparameters of the LSTM-RNN model. The performance of the HPOHDL-FND technique is validated on two Arabic datasets. The outcomes demonstrated better performance than other existing techniques, with maximum accuracies of 96.57% and 93.53% on the Covid19Fakes and satirical datasets, respectively. © 2023 Tech Science Press. All rights reserved.
ABSTRACT
On social media, false information can proliferate quickly and cause big issues. To minimize the harm caused by false information, it is essential to comprehend its sensitive nature and content. To achieve this, it is necessary to first identify the characteristics of information. To identify false information on the internet, we propose an ensemble model based on transformers in this paper. First, various text classification tasks were carried out to understand the content of false and true news on COVID-19, and the proposed hybrid ensemble learning model used the results. The results of our analysis were encouraging, demonstrating that the suggested system can identify false information on social media. All the classification tasks were validated and showed outstanding results. The final model showed excellent accuracy (0.99) and F1 score (0.99). The Receiver Operating Characteristic (ROC) curve showed that the true-positive rate of the data in this model was close to one, and the AUC (Area Under the Curve) score was also very high at 0.99. Thus, the suggested model was shown to be effective at identifying false information online. © 2023, EasyChair. All rights reserved.
ABSTRACT
Social networks have had a significant impact on people's personal and professional lives all around the world. Since the COVID-19 pandemic has boosted the use of digital media, fake news and fake reviews have had a stronger impact on society in recent years. This study demonstrates how the stiffness index may be used to model the spread of fake news in Indian states. We demonstrate that the speed at which fake news circulates through online social networks increases with the stiffness index. We conducted a stiffness analysis for all Indian states to assess the spread of fake information in each state. The stiffness analysis of the conventional SIR model, one of the most widely used approaches for describing the propagation of rumors in social networks, serves to explain and illustrate our proposition. The rise of fake news in our society is also corroborated by a comparison of the stiffness index for India before and after the COVID-19 outbreak. The study provides governments and policymakers with a more comprehensive understanding of the value of early intervention to combat the spread of false information via digital media. © 2023 IEEE.
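A standard way to quantify the stiffness of an ODE system such as SIR is the ratio between the largest and smallest nonzero eigenvalue magnitudes of its Jacobian. A minimal sketch under assumed parameter values; the paper's exact formulation of the stiffness index may differ:

```python
import math

def sir_jacobian_eigs(beta, gamma, S, I):
    """Eigenvalues of the SIR Jacobian restricted to the (S, I) plane:
    dS/dt = -beta*S*I,  dI/dt = beta*S*I - gamma*I."""
    a, b = -beta * I, -beta * S
    c, d = beta * I, beta * S - gamma
    tr, det = a + d, a * d - b * c
    disc = tr * tr - 4 * det
    root = math.sqrt(abs(disc))
    if disc >= 0:
        return ((tr + root) / 2, (tr - root) / 2)
    return (complex(tr / 2, root / 2), complex(tr / 2, -root / 2))

def stiffness_ratio(eigs, eps=1e-12):
    """Ratio of largest to smallest nonzero eigenvalue magnitude
    (of the real parts); larger values mean a stiffer system."""
    mags = [abs(e.real) if isinstance(e, complex) else abs(e) for e in eigs]
    mags = [m for m in mags if m > eps]
    return max(mags) / min(mags)

# Hypothetical contact (beta) and recovery (gamma) rates for one region,
# evaluated at an assumed state (S, I); not values from the study.
eigs = sir_jacobian_eigs(beta=0.5, gamma=0.1, S=0.9, I=0.05)
ratio = stiffness_ratio(eigs)   # ratio > 1; larger means stiffer
```

Comparing this ratio across regions, or before versus after an outbreak, mirrors the kind of comparison the study reports.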
ABSTRACT
With cyberspace's continuous evolution, online reviews play a crucial role in determining business success in various sectors, ranging from restaurants and hotels to e-commerce applications. Typically, a favorable review for a product draws in more consumers and results in a significant boost in sales. Unfortunately, some businesses use deceptive methods to improve their online reputation, including deploying fake reviews against competitors. As a result, detecting fake reviews has become a difficult and ever-changing research field. Verbal characteristics extracted from the review text, as well as nonverbal features such as the reviewer's engagement metrics and the IP address of the device, play an important role in detecting fake reviews. This article examines and compares various machine learning techniques for detecting deceptive reviews on online platforms such as e-commerce websites (e.g., Amazon) and review websites (e.g., Yelp). © 2023 IEEE.
ABSTRACT
The COVID-19 pandemic has been affecting people's everyday lives for more than two years. With the rapid growth of online communication and social media platforms, the number of fake news items related to COVID-19 is rising quickly, propagating misleading information to the public. To tackle this challenge and stop the spread of fake news, this project proposes an online software detector specifically for COVID-19 news that classifies whether a news item is trustworthy. Specifically, since it is difficult to train a generic model for all domains, a base model is developed and fine-tuned to the specific domain context. In addition, a data collection mechanism is developed to fetch the latest COVID-19 news data and keep the model up to date. We then conducted performance comparisons among models built with traditional machine learning techniques, ensemble machine learning, and state-of-the-art deep learning. The most effective model is deployed on our website for COVID-19 related fake news detection. © 2023 IEEE.
ABSTRACT
Since its emergence in December 2019, numerous news items about the COVID-19 pandemic have been shared on social media, containing information from both reliable and unreliable medical sources. News and misleading information spread quickly on social media, which can lead to anxiety, unwanted exposure to medical remedies, etc. Rapid detection of fake news can reduce its spread. In this paper, we aim to create an intelligent system to detect misleading information about COVID-19 using deep learning techniques based on LSTM and BLSTM architectures. The data used to construct the DL models are text and need to be transformed into numbers. In this paper, we test the efficiency of three vectorization techniques: bag of words, Word2Vec, and BERT. The experimental study showed that the best performance was given by the LSTM model with BERT, achieving an accuracy of 91% on the test set. © 2023 IEEE.
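Of the three vectorization techniques compared, bag of words is the simplest: each document becomes a vector of term counts over a shared vocabulary. A minimal stdlib sketch with made-up toy documents (Word2Vec and BERT instead produce dense learned embeddings and need trained models):

```python
def bag_of_words(docs):
    """Map each document to a count vector over the shared vocabulary."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = []
    for d in docs:
        v = [0] * len(vocab)
        for w in d.lower().split():
            v[index[w]] += 1
        vectors.append(v)
    return vocab, vectors

# Hypothetical toy corpus; real inputs would be tokenized tweets.
docs = ["masks prevent covid", "covid vaccine hoax covid"]
vocab, vecs = bag_of_words(docs)
```

These count vectors are what a downstream classifier (or, after embedding lookup, an LSTM) consumes; the study's finding that BERT beat bag of words reflects how much the dense contextual representations add over raw counts.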
ABSTRACT
False claims or fake news related to the healthcare or medicine field on social media have garnered increasing interest, especially in the aftermath of the COVID-19 pandemic. False claims about the pandemic that spread on social media have contributed to vaccine hesitancy and a lack of trust in the advice of medical professionals. If not detected and disproved early, such claims can complicate future pandemic responses. We focus on false claims in the field of Neurodevelopmental Disorders (NDDs), an umbrella term for a group of disorders that includes autism, ADHD, and cerebral palsy. In this paper we present our approach to automated systems for fact-checking medical articles related to NDDs. We also present an annotated dataset of 116 web pages, which we use to test our model, and report our results. © 2022 IEEE.
ABSTRACT
The COVID-19 pandemic has increased the global dependency on the internet. Millions of individuals use social networking sites not only to share information, but also their personal opinions. These facts and opinions are frequently unconfirmed, which results in the spread of incorrect information, generally referred to as "fake content". The most challenging aspect of social media is determining the source of information: it is difficult to figure out who generated fake news once it has gone viral. Most available computational models have a key flaw in that they rely on the presence of inaccurate information to generate meaningful features, making disinformation mitigation measures difficult to predict. This paper presents a parallel approach to false-information mitigation drawn from the field of epidemiology, using the SIR (Susceptible, Infected, Recovered) model to capture the impact of fake data dissemination during COVID-19. The SIR simulation is done in NetLogo, with a population made up of two kinds of agents: fake news believers and non-believers. To validate our work, the concept of trust, a fundamental component of any fake news interaction, is also discussed; the level of trust can be expressed by assigning each node a pair of trust scores. We ran our experiments using three common evaluation metrics: accuracy, precision, and recall. The hybrid model shows accuracies of 81.4%, 77.1%, and 91.8% on the respective networks. © 2023, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
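The SIR dynamics behind an agent-based simulation like the one above can be sketched with a simple Euler discretization, where "infected" stands for fake news believers and "recovered" for those who no longer believe or spread it. The rates and initial fractions below are illustrative assumptions, not values from the paper:

```python
def sir_step(s, i, r, beta, gamma, dt=1.0):
    """One Euler step of the SIR rumour model: susceptibles become
    believers ('infected') at rate beta, believers stop spreading
    ('recover') at rate gamma. s, i, r are population fractions."""
    new_believers = beta * s * i * dt
    new_recovered = gamma * i * dt
    return s - new_believers, i + new_believers - new_recovered, r + new_recovered

# Assumed initial state: 1% of the population already believes the rumour.
s, i, r = 0.99, 0.01, 0.0
for _ in range(50):
    s, i, r = sir_step(s, i, r, beta=0.4, gamma=0.1)
# After 50 steps some of the population has cycled through belief.
```

NetLogo runs the same logic per agent on a network rather than on aggregate fractions, which is what lets the paper attach per-node trust scores.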
ABSTRACT
With the advancement of technology, web technology in the form of social media has become one of the main sources of information worldwide. Web technology has helped people enhance their ability to know, learn, and gain knowledge about things around them, and the benefits it offers are boundless. However, social media also has major issues concerning filtering the right information from the wrong. The sources of information can be highly unreliable, and it is difficult to distinguish real news and information from fake. Cybercrime, through fraud mechanisms, is a pervasive menace permeating media technology every single day. Hence, this article reports an attempt at fake news detection in Khasi social media data. The data analyzed were extracted from different Internet platforms, mainly social media articles and posts. The dataset consists of fake and real news about COVID-19, as well as other forms of wrong information disseminated throughout the pandemic period. We manually annotated the assembled Khasi news; the dataset consists of 116 news items. We used three machine learning techniques in our experiments: Decision Tree, Logistic Regression, and Random Forest. The experimental results show that the Decision Tree approach yielded the best accuracy, 87%, whereas Logistic Regression yielded 82% and Random Forest 75%. © 2023 IEEE.
ABSTRACT
The COVID-19 pandemic has severely harmed every aspect of our daily lives, resulting in a slew of social problems. Therefore, it is critical to accurately assess the current state of community functionality and resilience under this pandemic for a successful recovery. To this end, various types of social sensing tools, such as tweets and publicly released news, have been employed to understand individuals' and communities' thoughts, behaviors, and attitudes during the COVID-19 pandemic. However, some of the released news is fake and can easily mislead the community into responding improperly to disasters like COVID-19. This paper assesses the extent to which news and tweets collected during the COVID-19 pandemic reflect community functionality and resilience. We use fact-checking organizations to classify news as real, mixed, or fake, and machine learning algorithms to classify tweets as real or fake, in order to measure and compare community resilience (CR). Based on the news articles and tweets collected, we quantify CR from two key factors, community wellbeing and resource distribution, where resource distribution is assessed by the level of economic resilience and community capital. Based on the estimates of these two factors, we quantify CR from both news articles and tweets and analyze the extent to which CR measured from the news articles reflects the actual state of CR measured from tweets. To improve the operationalization and sociological significance of this work, we use dimension reduction techniques to integrate the dimensions. © 2020 Tsinghua University Press.
ABSTRACT
The impact of technology on people's lives has grown continuously. The consumption of online news is one of the important trends, as the share of the population with internet access grows rapidly over time. Global statistics show that internet and social media usage has an increasing trend, and recent developments like the COVID-19 pandemic have amplified it even more. However, the credibility of online news is a critical issue to consider, since it directly impacts society and people's mindsets. The majority of users tend to instinctively believe what they encounter and draw conclusions from it. It is essential that consumers have an understanding or prior knowledge of the news and its source before drawing conclusions. This research proposes a hybrid model to predict the accuracy of a particular news article in Sinhala text. The model combines content-based analysis using machine learning and deep learning classifiers with social network related features of the news source to make predictions. A scoring mechanism provides an overall score for a given news item by combining two independent scores: an Accuracy Score (from analyzing the news content) and a Credibility Score (from a scoring mechanism over the social network features of the news source). The hybrid model containing the Passive Aggressive Classifier showed the highest accuracy, 88%, while the models containing deep neural networks showed accuracies of around 75-80%. These results highlight that the proposed method could efficiently serve as a fake news detection mechanism for news content in the Sinhala language. Also, since there is no publicly available dataset for fake news detection in Sinhala, the datasets produced in this work can be considered an additional contribution of this research. © 2022 IEEE.
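The overall score described above combines a content-based Accuracy Score with a source-based Credibility Score. A minimal sketch of one plausible combination rule, a weighted average; the weight and the input values are assumptions for illustration, not the paper's actual scoring formula:

```python
def overall_score(accuracy_score, credibility_score, w_content=0.6):
    """Combine the content-based Accuracy Score with the source-based
    Credibility Score into one overall score in [0, 1].
    w_content is a hypothetical weight, not a value from the paper."""
    assert 0.0 <= w_content <= 1.0
    assert 0.0 <= accuracy_score <= 1.0 and 0.0 <= credibility_score <= 1.0
    return w_content * accuracy_score + (1 - w_content) * credibility_score

# Hypothetical per-article inputs: content classifier says 0.8,
# source credibility scoring says 0.5.
score = overall_score(0.8, 0.5)
```

The weight lets the system lean on content analysis when the source has little social network history, or on source credibility when the text is too short to classify reliably.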
ABSTRACT
Classifying whether collected information related to emerging topics and domains is fake or incorrect is not an easy task, because we do not have enough labeled data in those domains. Given labeled data from source domains (e.g., gossip and health) and limited labeled data from a newly emerging target domain (e.g., COVID-19 and the Ukraine war), simply applying knowledge learned from the source domains to the target domain may not work well because of differing data distributions. To solve this problem, we propose an energy-based domain adaptation with active learning for early misinformation detection. On three real-world news datasets, we evaluate our proposed model against two baselines, both on domain adaptation alone and on the whole pipeline. Our model outperforms the baselines, improving by at least 5% in the domain adaptation task and 10% in the whole pipeline, showing the effectiveness of our proposed approach. © 2022 IEEE.
ABSTRACT
Along with the unprecedented impact of the COVID-19 pandemic on human lives, a new crisis of fake and false information related to the disease has also emerged. Social media platforms such as Twitter are primary channels for disseminating fake information due to their ease of access and large audiences. However, automatic detection and classification of fake tweets is a challenging task due to the complexity and lack of contextual features of short texts. This paper proposes a novel CoviFake framework to classify and analyze fake tweets related to COVID-19 using vocabulary and non-vocabulary features. For this purpose, we first combine and enhance the 'CTF' and 'COVID19 Rumor' datasets to build our COVID19-sham dataset containing 25,388 labelled tweets. Next, we extract vocabulary features and 12 non-vocabulary features to compare the performance of six state-of-the-art machine learning classifiers. Our results highlight that the Random Forest (RF) classifier achieves the highest accuracy, 94.53%, with the combination of the top 2,000 vocabulary features and the 12 non-vocabulary features. In addition, we developed a large-scale dataset, CoviTweets, containing 7.88 million English tweets posted by 3.8 million users over two months (March-April 2020). Analyzing CoviTweets with our framework reveals that the dataset contains 1.64 million (20.87%) fake tweets. Furthermore, we perform an in-depth examination by assigning a 'fakeness score' to hashtags and users in CoviTweets. © 2022 IEEE.
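One plausible reading of a hashtag "fakeness score" is the fraction of tweets carrying that hashtag that the classifier labels fake. A minimal sketch under that assumption; the paper's exact definition may differ, and the tweets below are made-up examples:

```python
from collections import defaultdict

def hashtag_fakeness(tweets):
    """Score each hashtag by the fraction of fake tweets among all
    tweets that carry it. tweets: iterable of (hashtag_set, is_fake).
    This scoring rule is an assumption, not the paper's definition."""
    fake = defaultdict(int)
    total = defaultdict(int)
    for tags, is_fake in tweets:
        for t in tags:
            total[t] += 1
            fake[t] += int(is_fake)
    return {t: fake[t] / total[t] for t in total}

# Hypothetical classified tweets: (hashtags, flagged_as_fake).
tweets = [({"#covid19", "#5g"}, True),
          ({"#covid19"}, False),
          ({"#5g"}, True)]
scores = hashtag_fakeness(tweets)
```

The same counting works for users instead of hashtags, which is how a per-user fakeness score could be derived from the same classifier output.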
ABSTRACT
The COVID-19 pandemic has caused a dramatic and parallel rise in dangerous misinformation, denoted an 'infodemic' by the CDC and WHO. Misinformation tied to the COVID-19 infodemic changes continuously; this can lead to performance degradation of fine-tuned models due to concept drift. Degradation can be mitigated if models generalize well enough to capture some cyclical aspects of drifted data. In this paper, we explore the generalizability of pre-trained and fine-tuned fake news detectors across 9 fake news datasets. We show that existing models often overfit on their training dataset and perform poorly on unseen data. However, on some subsets of unseen data that overlap with the training data, models have higher accuracy. Based on this observation, we also present KMeans-Proxy, a fast and effective method based on k-means clustering for quickly identifying these overlapping subsets of unseen data. KMeans-Proxy improves generalizability on unseen fake news datasets by 0.1-0.2 F1 points across datasets. We present both our generalizability experiments and KMeans-Proxy to support further research on the fake news problem. © 2022 IEEE.
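KMeans-Proxy rests on clustering to locate subsets of unseen data that overlap with training data. A minimal stdlib k-means sketch on toy 2-D points standing in for text embeddings; this is a generic illustration of the clustering step, not the paper's implementation, which operates on transformer embeddings of news articles:

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means on 2-D points via Lloyd's algorithm.
    Returns final cluster centers and the points assigned to each."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its nearest center (squared distance).
            j = min(range(k), key=lambda c: (p[0] - centers[c][0]) ** 2 +
                                            (p[1] - centers[c][1]) ** 2)
            groups[j].append(p)
        # Recompute each center as the mean of its assigned points.
        centers = [(sum(x for x, _ in g) / len(g),
                    sum(y for _, y in g) / len(g)) if g else centers[j]
                   for j, g in enumerate(groups)]
    return centers, groups

# Two well-separated blobs standing in for "overlaps training data"
# versus "novel" regions of an embedding space (hypothetical values).
pts = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
       (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
centers, groups = kmeans(pts, k=2)
```

Once unseen examples are assigned to clusters, those landing in clusters dominated by training data can be treated as the overlapping subset on which the fine-tuned detector is expected to remain accurate.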