ABSTRACT
The global COVID-19 pandemic increased social media usage for obtaining information and sharing concerns, feelings, and emotions, turning social media into a prolific field of research through which it is possible to understand how audiences are coping with the multitude of recent challenges. This paper presents results from a social media analysis of 61,532 education-related news headlines posted on Facebook by the major daily news provider in Portugal, SIC Notícias, from January to December 2020. We focus on how the news impacted audiences' emotional response and discourse, and we analyze the key issues of the most commented news content. The results show a prevailing sadness among audiences and a very negative discourse throughout 2020, with a high degree of uncertainty being expressed. The main concerns revolved around parents supporting children in their first remote learning endeavors, financial sustainability, the lack of devices, the disinfection of schools, and students' mobility, particularly in the non-higher-education context.
ABSTRACT
The impact of technology on people's lives has grown continuously. The consumption of online news is one of the most important trends, as the share of the population with internet access grows rapidly over time. Global statistics show that internet and social media usage is on an increasing trend, and recent developments such as the COVID-19 pandemic have amplified it even further. However, the credibility of online news is a critical issue to consider, since it directly impacts society and people's mindsets. The majority of users tend to instinctively believe what they encounter and draw conclusions based on it. It is essential that consumers have an understanding or prior knowledge of the news and its source before drawing conclusions. This research proposes a hybrid model to predict the accuracy of a particular news article in Sinhala text. The model combines content-based news analysis techniques using machine learning / deep learning classifiers with social network related features of the news source to make predictions. A scoring mechanism provides an overall score for a given news item by combining two independent scores: an Accuracy Score (from analyzing the news content) and a Credibility Score (from a scoring mechanism over social network features of the news source). The hybrid model containing the Passive Aggressive Classifier showed the highest accuracy, 88%, while the models containing deep neural networks achieved accuracies of around 75-80%. These results highlight that the proposed method could efficiently serve as a fake news detection mechanism for news content in the Sinhala language. Also, since there is no publicly available dataset for fake news detection in Sinhala, the datasets produced in this work can be considered an additional contribution of this research. © 2022 IEEE.
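The combination of the two independent scores described in this abstract can be sketched as a weighted average. The weighted-average form, the 0.5/0.5 weights, and the assumption that both scores lie in [0, 1] are illustrative choices, not details taken from the paper:

```python
# Hedged sketch of the hybrid scoring idea: combine a content-based
# Accuracy Score with a source-based Credibility Score. Weights and
# score ranges are assumptions, not the paper's actual mechanism.

def hybrid_score(accuracy_score: float, credibility_score: float,
                 w_accuracy: float = 0.5, w_credibility: float = 0.5) -> float:
    """Weighted combination of two independent scores, each in [0, 1]."""
    if not (0.0 <= accuracy_score <= 1.0 and 0.0 <= credibility_score <= 1.0):
        raise ValueError("scores must lie in [0, 1]")
    return w_accuracy * accuracy_score + w_credibility * credibility_score

print(round(hybrid_score(0.9, 0.7), 2))  # prints 0.8
```

In such a scheme, a low score on either component pulls down the overall rating, so a well-written article from an untrustworthy source would still be flagged.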
ABSTRACT
The increasingly rapid spread of information about COVID-19 on the web calls for automatic measures of credibility assessment [18]. If large parts of the population are expected to act responsibly during a pandemic, they need information that can be trusted [20]. In that context, we model the credibility of texts using 25 linguistic phenomena, such as spelling, sentiment and lexical diversity. We integrate these measures in a graphical interface and present two empirical studies to evaluate its usability for credibility assessment on COVID-19 news. Raw data for the studies, including all questions and responses, has been made available to the public using an open license: https://github.com/konstantinschulz/credible-covid-ux. The user interface prominently features three sub-scores and an aggregation for a quick overview. Besides, metadata about the concept, authorship and infrastructure of the underlying algorithm is provided explicitly. Our working definition of credibility is operationalized through the terms of trustworthiness, understandability, transparency, and relevance. Each of them builds on well-established scientific notions [41, 65, 68] and is explained orally or through Likert scales. In a moderated qualitative interview with six participants, we introduce information transparency for news about COVID-19 as the general goal of a prototypical platform, accessible through an interface in the form of a wireframe [43]. The participants' answers are transcribed in excerpts. Then, we triangulate inductive and deductive coding methods [19] to analyze their content. As a result, we identify rating scale, sub-criteria and algorithm authorship as important predictors of the usability. In a subsequent quantitative online survey, we present a questionnaire with wireframes to 50 crowdworkers. The question formats include Likert scales, multiple choice and open-ended types. This way, we aim to strike a balance between the known strengths and weaknesses of open vs. 
closed questions [11]. The answers reveal a conflict between transparency and conciseness in the interface design: Users tend to ask for more information, but do not necessarily make explicit use of it when given. This discrepancy is influenced by capacity constraints of the human working memory [38]. Moreover, a perceived hierarchy of metadata becomes apparent: the authorship of a news text is more important than the authorship of the algorithm used to assess its credibility. From the first to the second study, we notice an improved usability of the aggregated credibility score's scale. That change is due to the conceptual introduction before seeing the actual interface, as well as the simplified binary indicators with direct visual support. Sub-scores need to be handled similarly if they are supposed to contribute meaningfully to the overall credibility assessment. By integrating detailed information about the employed algorithm, we are able to dissipate the users' doubts about its anonymity and possible hidden agendas. However, the overall transparency can only be increased if other more important factors, like the source of the news article, are provided as well. Knowledge about this interaction enables software designers to build useful prototypes with a strong focus on the most important elements of credibility: source of text and algorithm, as well as distribution and composition of algorithm. All in all, the understandability of our interface was rated as acceptable (78% of responses being neutral or positive), while transparency (70%) and relevance (72%) still lag behind. This discrepancy is closely related to the missing article metadata and more meaningful visually supported explanations of credibility sub-scores. The insights from our studies lead to a better understanding of the amount, sequence and relation of information that needs to be provided in interfaces for credibility assessment. 
In particular, our integration of software metadata contributes to the more holistic notion of credibility [47, 72] that has become popular in recent years. Besides, it paves the way for a more thoroughly informed interaction between humans and machine-generated assessments, anticipating the users' doubts and concerns [39] in early stages of the software design process [37]. Finally, we make suggestions for future research, such as proactively documenting credibility-related metadata for Natural Language Processing and Language Technology services and establishing an explicit hierarchical taxonomy of usability predictors for automatic credibility assessment. © 2022, Springer Nature Switzerland AG.
ABSTRACT
The incredible growth in available news content has been met with steeply increasing demand for news amongst the general population. The 24/7 news cycle gives people an awareness of events, activities and decisions that may have an impact on them (e.g. the latest updates on the COVID-19 outbreak). Despite the flourishing of social networks, recent research suggests that radio and especially TV are still the main sources of news for many people. However, unlike on social media, the content aired on radio and TV requires people to listen to every advertisement and music segment (for radio) before consuming the next item. For this reason, media monitoring companies have to dedicate a considerable amount of resources to processing or manually filtering the advertising content (which is blended with the actual news), and often their clients still receive ads. To mitigate this problem, in this paper we propose No2Ads, an autoregressive deep convolutional neural network (CNN) model trained on over 500 h of human-annotated samples to remove ads and music from broadcast content. No2Ads achieved very high performance in our tests: 97% precision and 95% recall on detecting ads/music for radio channels, and 95% precision and 98% recall for TV channels. Between March and September 2021, across 261 radio and TV channels in Australia and New Zealand, No2Ads detected and filtered out 22,161 h of all captured broadcast content as either advertisements or music. © 2022, Springer Nature Switzerland AG.
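The downstream filtering step described in this abstract can be illustrated in a few lines: given per-segment labels from an ad/music classifier such as No2Ads, drop everything that is not news and tally the filtered airtime. The segment format and labels below are hypothetical; the abstract does not specify No2Ads' actual output format:

```python
# Hypothetical per-segment labels as an ad/music classifier might emit them;
# timestamps are in seconds.
segments = [
    {"start": 0.0, "end": 1800.0, "label": "news"},
    {"start": 1800.0, "end": 1980.0, "label": "ad"},
    {"start": 1980.0, "end": 2100.0, "label": "music"},
    {"start": 2100.0, "end": 3600.0, "label": "news"},
]

# Keep only the actual news content for delivery to clients.
news_only = [s for s in segments if s["label"] == "news"]

# Tally how much airtime was filtered out, in hours.
filtered_hours = sum(
    s["end"] - s["start"] for s in segments if s["label"] != "news"
) / 3600.0

print(len(news_only), round(filtered_hours, 3))  # prints: 2 0.083
```

Aggregating `filtered_hours` across channels and months is how a figure like the reported 22,161 h of removed content would be accumulated.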
ABSTRACT
This paper focuses on a critical problem of explainable multimodal COVID-19 misinformation detection, where the goal is to accurately detect misleading information in multimodal COVID-19 news articles and provide the reason or evidence that can explain the detection results. Our work is motivated by the lack of judicious study of the association between different modalities (e.g., text and image) of COVID-19 news content in current solutions. In this paper, we present a generative approach to detect multimodal COVID-19 misinformation by investigating the cross-modal association between the visual and textual content that is deeply embedded in the multimodal news content. Two critical challenges exist in developing our solution: 1) How to accurately assess the consistency between the visual and textual content of a multimodal COVID-19 news article? 2) How to effectively retrieve useful information from unreliable user comments to explain the misinformation detection results? To address these challenges, we develop a duo-generative explainable misinformation detection (DGExplain) framework that explicitly explores the cross-modal association between the news content in different modalities and effectively exploits user comments to detect and explain misinformation in multimodal COVID-19 news articles. We evaluate DGExplain on two real-world multimodal COVID-19 news datasets. Evaluation results demonstrate that DGExplain significantly outperforms state-of-the-art baselines in both the accuracy of multimodal COVID-19 misinformation detection and the explainability of its detection results. © 2022 ACM.
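One simple way to think about the cross-modal consistency question raised in this abstract is as a similarity score between text and image representations. The cosine-similarity formulation and the toy embeddings below are illustrative assumptions for intuition only, not DGExplain's actual generative architecture:

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy pre-computed embeddings for the textual and visual content of
# one hypothetical article (real systems would use learned encoders).
text_emb = [0.2, 0.8, 0.1]
image_emb = [0.25, 0.7, 0.0]

# A high score suggests the image and text tell a consistent story;
# a low score flags a potential cross-modal mismatch worth explaining.
consistency = cosine_similarity(text_emb, image_emb)
```

Under this framing, an article pairing an unrelated image with its text would yield a low consistency score, which is the kind of signal a misinformation detector can surface as evidence.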
ABSTRACT
The COVID-19 pandemic poses a great threat to global public health. Meanwhile, there is massive misinformation associated with the pandemic which advocates unfounded or unscientific claims. Even though major social media and news outlets have made an extra effort to debunk COVID-19 misinformation, most fact-checking information is in English, whereas some unmoderated COVID-19 misinformation is still circulating in other languages, threatening the health of less-informed people in immigrant communities and developing countries. In this paper, we make the first attempt to detect COVID-19 misinformation in a low-resource language (Chinese) using only fact-checked news in a high-resource language (English). We start by curating a Chinese real/fake news dataset according to existing fact-checking information. Then, we propose a deep learning framework named CrossFake to jointly encode the cross-lingual news body texts and capture as much of the news content as possible. Empirical results on our dataset demonstrate the effectiveness of CrossFake under the cross-lingual setting, and it also outperforms several monolingual and cross-lingual fake news detectors. The dataset is available at https://github.com/YingtongDou/CrossFake. © 2021 IEEE.