ABSTRACT
Accurate prediction of the 2019 novel coronavirus disease (COVID-19) plays an important role in designing more effective prevention and control policies during a pandemic. This paper develops a stacking-based predictor of cumulative confirmed COVID-19 cases (StackCPPred) that integrates an infectious disease dynamics model with traditional machine learning. Based on population migration characteristics, five feature indicators were first extracted from population flow data from the early stage of the epidemic, which were collected from the National Health Commission of the People's Republic of China. A stacking-based ensemble learning (SEL) model was then established for COVID-19 prediction using traditional machine learning methods, including multiple linear regression (MLR) and tree-based regression models (XGBoost and LightGBM). By introducing the variable "death state", an improved Susceptible-Infected-Recovered (ISIR) model was established. Finally, a hybrid model, StackCPPred, was proposed by incorporating the ISIR model outputs and the five feature indicators into the SEL model. Real data on population movements and the daily cumulative number of newly confirmed cases across the country from January 23 to February 6 were used to validate the model. The results showed that the proposed StackCPPred model outperformed existing models for COVID-19 prediction, as quantified by the root mean square error (RMSE), the root mean square logarithmic error (RMSLE), and the coefficient of determination (R2) (approximately 1841 persons, approximately 0.1, and >0.9, respectively). These results support the validity and usefulness of the StackCPPred model for COVID-19 prediction. © 2022 ACM.
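To make the compartmental component and the three evaluation metrics concrete, the following is a minimal sketch, not the authors' implementation. It assumes a textbook SIRD formulation (SIR plus a death compartment), which may differ from the ISIR equations in the paper, and it uses the common log(1 + x) convention for RMSLE. The function names and the rate parameters `beta`, `gamma`, and `mu` are hypothetical and introduced only for illustration.

```python
import math


def simulate_sird(population, infected0, beta, gamma, mu, days):
    """Forward-Euler simulation of a textbook SIRD model, one step per day.

    beta, gamma, and mu are hypothetical daily infection, recovery, and
    death rates (assumptions, not values from the paper).
    Returns the cumulative confirmed cases (I + R + D) for days 0..days.
    """
    s, i, r, d = float(population - infected0), float(infected0), 0.0, 0.0
    cumulative = [i + r + d]
    for _ in range(days):
        new_infections = beta * s * i / population
        new_recoveries = gamma * i
        new_deaths = mu * i  # the added "death state" outflow
        s -= new_infections
        i += new_infections - new_recoveries - new_deaths
        r += new_recoveries
        d += new_deaths
        cumulative.append(i + r + d)
    return cumulative


def rmse(y_true, y_pred):
    """Root mean square error, in the same units as the data (persons)."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def rmsle(y_true, y_pred):
    """Root mean square logarithmic error, using log(1 + x)."""
    n = len(y_true)
    return math.sqrt(
        sum((math.log1p(t) - math.log1p(p)) ** 2 for t, p in zip(y_true, y_pred)) / n
    )


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

In a hybrid of the kind described, a curve such as the output of `simulate_sird` would be supplied as one more input feature to the stacked regressors, and the three metric functions would then score the final predictions against the reported case counts.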
ABSTRACT
A growing number of studies have evaluated the association between ambient temperature and coronavirus disease 2019 (COVID-19). However, most of these studies were rushed to completion, rendering the quality of their findings questionable. We systematically evaluated 70 relevant peer-reviewed studies published on or before 21 September 2020, conducted at scales ranging from community to global level. Approximately 35 of these reports indicated that temperature was significantly and negatively associated with COVID-19 spread, whereas 12 reports demonstrated a significantly positive association. The remaining studies found no association or merely a piecewise association. Correlation and regression analyses were the most commonly used statistical methods. The main shortcomings of these studies included uncertainties in COVID-19 infection rates, problems with temperature data processing, inappropriate control for confounding parameters, weaknesses in the evaluation of effect modification, inadequate statistical models, short research periods, and the choice of research areal units. In our view, most of the 70 identified studies have significant flaws that prevent them from providing a robust scientific basis for the association between temperature and COVID-19.