Digital Disease Surveillance for Emerging Infectious Diseases: An Early Warning System Using the Internet and Social Media Data for COVID-19 Forecasting in Canada.
Yang, Yang; Tsao, Shu-Feng; Basri, Mohammad A; Chen, Helen H; Butt, Zahid A.
  • Yang Y; School of Public Health Sciences, University of Waterloo, Canada.
  • Tsao SF; School of Public Health Sciences, University of Waterloo, Canada.
  • Basri MA; Systems Design Engineering, University of Waterloo, Canada.
  • Chen HH; School of Public Health Sciences, University of Waterloo, Canada.
  • Butt ZA; School of Public Health Sciences, University of Waterloo, Canada.
Stud Health Technol Inform ; 302: 861-865, 2023 May 18.
Article in English | MEDLINE | ID: covidwho-2327217
ABSTRACT

BACKGROUND:

Emerging Infectious Diseases (EID) are a significant threat to population health globally. We aimed to examine the relationship between internet search engine queries and social media data on COVID-19 and determine if they can predict COVID-19 cases in Canada.

METHODS:

We analyzed Google Trends (GT) and Twitter data for Canada from January 1 to March 31, 2020, and used various signal-processing techniques to remove noise from the data. Data on COVID-19 cases were obtained from the COVID-19 Canada Open Data Working Group. We conducted time-lagged cross-correlation analyses and developed a long short-term memory (LSTM) model for forecasting daily COVID-19 cases.
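A time-lagged cross-correlation of the kind described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names `lagged_cross_correlation` and `best_lag`, the use of the Pearson coefficient, the 14-day maximum lag, and the synthetic series are all assumptions made for the example.

```python
import numpy as np


def lagged_cross_correlation(signal, cases, max_lag):
    """Pearson correlation between signal[t - lag] and cases[t], lag = 0..max_lag.

    Hypothetical helper for illustration. A high value at lag k means the
    signal tends to lead the case series by k days.
    """
    signal = np.asarray(signal, dtype=float)
    cases = np.asarray(cases, dtype=float)
    if signal.shape != cases.shape:
        raise ValueError("signal and cases must have the same length")
    if max_lag >= signal.size - 1:
        raise ValueError("max_lag is too large for the series length")

    correlations = {}
    for lag in range(max_lag + 1):
        earlier_signal = signal[: signal.size - lag]  # signal at day t - lag
        later_cases = cases[lag:]                     # cases at day t
        correlations[lag] = float(np.corrcoef(earlier_signal, later_cases)[0, 1])
    return correlations


def best_lag(signal, cases, max_lag):
    """Return (lag, r) for the lag with the highest correlation."""
    correlations = lagged_cross_correlation(signal, cases, max_lag)
    lag = max(correlations, key=correlations.get)
    return lag, correlations[lag]


# Synthetic demo (made-up data): the search signal peaks 9 days before cases.
days = np.arange(90)
demo_cases = np.exp(-((days - 50) ** 2) / 50.0)
demo_signal = np.exp(-((days - 41) ** 2) / 50.0)
demo_lag, demo_r = best_lag(demo_signal, demo_cases, max_lag=14)
print(f"best lag: t - {demo_lag}, r = {demo_r:.3f}")
```

In practice the input series would be the denoised keyword or Tweet counts and the daily case counts, aligned on the same dates.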

RESULTS:

Among symptom keywords, "cough," "runny nose," and "anosmia" were strong signals with cross-correlation coefficients >0.8 (r_Cough = 0.825 at lag t-9; r_RunnyNose = 0.816 at lag t-11; r_Anosmia = 0.812 at lag t-3), showing that GT searches for "cough," "runny nose," and "anosmia" correlated with the incidence of COVID-19 and peaked 9, 11, and 3 days earlier than the incidence peak, respectively. For symptom-related and COVID-related Tweet counts, the cross-correlations between Tweet signals and daily cases were r_TweetSymptoms = 0.868 at lag t-11 and r_TweetCOVID = 0.840 at lag t-10, respectively. The LSTM forecasting model achieved its best performance (MSE = 124.78, R² = 0.88, adjusted R² = 0.87) using GT signals with cross-correlation coefficients >0.75. Combining GT and Tweet signals did not improve model performance.
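For readers unfamiliar with the reported metrics, the sketch below computes MSE, R², and adjusted R² from observed and predicted values using their standard definitions. The abstract does not state how adjusted R² was computed or how many predictors the model used, so the function name `regression_metrics` and the `n_features` parameter are assumptions for illustration only.

```python
import numpy as np


def regression_metrics(y_true, y_pred, n_features):
    """Return MSE, R^2, and adjusted R^2 under the standard definitions.

    n_features is the assumed number of predictors (for example, the number
    of input signals); it is not taken from the abstract.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    n = y_true.size
    if n - n_features - 1 <= 0:
        raise ValueError("need more observations than n_features + 1")

    residuals = y_true - y_pred
    ss_res = float(np.sum(residuals ** 2))                  # residual sum of squares
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))   # total sum of squares

    mse = ss_res / n
    r2 = 1.0 - ss_res / ss_tot
    adjusted_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_features - 1)
    return {"mse": mse, "r2": r2, "adjusted_r2": adjusted_r2}
```

Adjusted R² penalizes R² for the number of predictors, which is why it sits slightly below R² in the reported results.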

CONCLUSION:

Internet search engine queries and social media data can be used as early warning signals for creating a real-time surveillance system for COVID-19 forecasting, but challenges remain in modelling.

Full text: Available
Collection: International databases
Database: MEDLINE
Main subject: Communicable Diseases, Emerging / Social Media / COVID-19
Type of study: Diagnostic study / Observational study / Prognostic study / Randomized controlled trials
Topics: Long Covid
Limits: Humans
Language: English
Journal: Stud Health Technol Inform
Journal subject: Medical Informatics / Health Services Research
Year: 2023
Document Type: Article
Affiliation country: Shti230290
