ABSTRACT
Multi-model and multi-team ensemble forecasts have become widely used to generate reliable short-term predictions of infectious disease spread. Notably, various public health agencies have used them to leverage academic disease modelling during the COVID-19 pandemic. However, ensemble forecasts are difficult to interpret and require extensive effort from numerous participating groups as well as a coordination team. In other fields, resource usage has been reduced by training simplified models that reproduce some of the observed behaviour of more complex models. Here we used observations of the behaviour of the European COVID-19 Forecast Hub ensemble combined with our own forecasting experience to identify a set of properties present in current ensemble forecasts. We then developed a parsimonious forecast model intending to mirror these properties. We assessed forecasts generated from this model in real time over six months (the 15th of January 2022 to the 19th of July 2022) and for multiple European countries. We focused on forecasts of cases one to four weeks ahead and compared them to those of the European COVID-19 Forecast Hub ensemble. We find that the surrogate model behaves qualitatively similarly to the ensemble in many instances, though with increased uncertainty and poorer performance around periods of peak incidence (as measured by the Weighted Interval Score). The performance differences, however, seem to be partially due to a subset of time points, and the proposed model appears better probabilistically calibrated than the ensemble. We conclude that our simplified forecast model may have captured some of the dynamics of the hub ensemble, but more work is needed to understand the implicit epidemiological model that it represents.
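The Weighted Interval Score mentioned above can be illustrated with a minimal sketch. It assumes the commonly used definition for quantile forecasts (a weighted sum of the absolute error of the median and the interval scores of the central prediction intervals); the function names and the example numbers are hypothetical and are not taken from the study.

```python
def interval_score(lower, upper, y, alpha):
    """Interval score of a central (1 - alpha) prediction interval for observation y."""
    width = upper - lower                          # sharpness: wider intervals score worse
    under = (2.0 / alpha) * max(lower - y, 0.0)    # penalty when y falls below the interval
    over = (2.0 / alpha) * max(y - upper, 0.0)     # penalty when y falls above the interval
    return width + under + over


def weighted_interval_score(median, intervals, y):
    """WIS for a forecast given as a median plus central intervals.

    `intervals` maps alpha -> (lower, upper); e.g. alpha = 0.1 is the 90% interval.
    Lower scores are better.
    """
    total = 0.5 * abs(y - median)
    for alpha, (lower, upper) in intervals.items():
        total += (alpha / 2.0) * interval_score(lower, upper, y, alpha)
    return total / (len(intervals) + 0.5)


# Hypothetical forecast: median 100, 50% interval (90, 110), 90% interval (70, 130).
forecast_intervals = {0.5: (90.0, 110.0), 0.1: (70.0, 130.0)}
score_hit = weighted_interval_score(100.0, forecast_intervals, y=100.0)
score_miss = weighted_interval_score(100.0, forecast_intervals, y=140.0)
print(score_hit, score_miss)
```

An observation outside the intervals is penalised in proportion to how far it falls outside, which is why an overconfident forecast can score poorly around peaks.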
ABSTRACT
Background: Short-term forecasts of infectious disease burden can contribute to situational awareness and aid capacity planning. Based on best practice in other fields and recent insights in infectious disease epidemiology, one can maximise the predictive performance of such forecasts if multiple models are combined into an ensemble. Here we report on the performance of ensembles in predicting COVID-19 cases and deaths across Europe between 08 March 2021 and 07 March 2022. Methods: We used open-source tools to develop a public European COVID-19 Forecast Hub. We invited groups globally to contribute weekly forecasts for COVID-19 cases and deaths reported from a standardised source over the next one to four weeks. Teams submitted forecasts from March 2021 using standardised quantiles of the predictive distribution. Each week we created an ensemble forecast, where each predictive quantile was calculated as the equally-weighted average (initially the mean and then, from 26th July, the median) of all individual models' predictive quantiles. We measured the performance of each model using the relative Weighted Interval Score (WIS), comparing models' forecast accuracy relative to all other models. We retrospectively explored alternative methods for ensemble forecasts, including weighted averages based on models' past predictive performance. Results: Over 52 weeks we collected and combined up to 28 forecast models for 32 countries. We found a weekly ensemble had a consistently strong performance across countries over time. Across all horizons and locations, the ensemble performed better on relative WIS than 84% of participating models' forecasts of incident cases (with a total N=862), and 92% of participating models' forecasts of deaths (N=746). Across a one to four week time horizon, ensemble performance declined with longer forecast periods when forecasting cases, but remained stable over four weeks for incident death forecasts.
In every forecast across 32 countries, the ensemble outperformed most contributing models when forecasting either cases or deaths, frequently outperforming all of its individual component models. Among several choices of ensemble methods we found that the most influential and best choice was to use a median average of models instead of using the mean, regardless of methods of weighting component forecast models. Conclusions: Our results support combining forecasts from individual models into an ensemble in order to improve predictive performance across epidemiological targets and populations during infectious disease epidemics. Our findings further suggest that median ensemble methods yield better predictive performance than ones based on means. Our findings also highlight that forecast consumers should place more weight on incident death forecasts than incident case forecasts at forecast horizons greater than two weeks. Code and data availability: All data and code are publicly available on GitHub: covid19-forecast-hub-europe/euro-hub-ensemble.
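The quantile-wise ensemble described in the Methods can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the Hub's code: the function name `quantile_ensemble`, the array layout (one row per model, one column per quantile level) and the example numbers are all hypothetical.

```python
import numpy as np


def quantile_ensemble(model_quantiles, method="median"):
    """Combine models quantile by quantile with equal weights.

    model_quantiles: array-like of shape (n_models, n_quantiles); each row holds
    one model's predictive quantiles at the same standardised quantile levels.
    """
    arr = np.asarray(model_quantiles, dtype=float)
    if method == "median":
        return np.median(arr, axis=0)
    if method == "mean":
        return np.mean(arr, axis=0)
    raise ValueError(f"unknown method: {method!r}")


# Three hypothetical models forecasting the 25%, 50% and 75% quantiles of weekly cases.
# The third model is an outlier with a very wide, high forecast.
forecasts = [
    [80.0, 100.0, 120.0],
    [90.0, 110.0, 140.0],
    [50.0, 400.0, 900.0],
]
median_ens = quantile_ensemble(forecasts, "median")
mean_ens = quantile_ensemble(forecasts, "mean")
print(median_ens)  # the outlier barely moves the median ensemble
print(mean_ens)    # the mean ensemble is pulled towards the outlier
```

The toy example shows one plausible reason a median can be the more robust choice: a single extreme submission shifts every quantile of the mean ensemble but leaves the median ensemble close to the majority of models.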
ABSTRACT
Background: Early estimates from South Africa indicated that the Omicron COVID-19 variant may be both more transmissible and have greater immune escape than the previously dominant Delta variant. The rapid turnover of the latest epidemic wave in South Africa as well as initial evidence from contact tracing and household infection studies has prompted speculation that the generation time of the Omicron variant may be shorter in comparable settings than the generation time of the Delta variant. Methods: We estimated daily growth rates for the Omicron and Delta variants in each UKHSA region from the 23rd of November to the 23rd of December 2021 using surveillance case counts by date of specimen and S-gene target failure status with an autoregressive model that allowed for time-varying differences in the transmission advantage of the Delta variant where the evidence supported this. By assuming a gamma-distributed generation time we then estimated the generation time distribution and transmission advantage of the Omicron variant that would be required to explain this time-varying advantage. We repeated this estimation process using two different prior estimates for the generation time of the Delta variant, first based on household transmission and then based on its intrinsic generation time. Results: Visualising our growth rate estimates provided initial evidence for a difference in generation time distributions. Assuming a generation time distribution for Delta with a mean of 2.5-4 days (90% credible interval) and a standard deviation of 1.9-3 days we estimated a shorter generation time distribution for Omicron with a mean of 1.5-3.2 days and a standard deviation of 1.3-4.6 days. This implied a transmission advantage for Omicron in this setting of 160%-210% compared to Delta. We found similar relative results using an estimate of the intrinsic generation time for Delta, though all estimates increased in magnitude due to the longer assumed generation time.
Conclusions: We found that a reduction in the generation time of Omicron compared to Delta was able to explain the observed variation over time in the transmission advantage of the Omicron variant. However, this analysis cannot rule out the role of other factors such as differences in the populations the variants were mixing in, differences in immune escape between variants or bias due to using the test-to-test distribution as a proxy for the generation time distribution.
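To see why the assumed generation time matters when interpreting growth rates, the sketch below uses the standard Euler-Lotka relationship for a gamma-distributed generation time, R = (1 + r·scale)^shape. This is an assumption made for illustration and is not claimed to be the exact model fitted in the study; the function names and the example values are hypothetical.

```python
def gamma_shape_scale(mean, sd):
    """Convert a gamma distribution's mean and standard deviation to shape and scale."""
    shape = (mean / sd) ** 2
    scale = sd ** 2 / mean
    return shape, scale


def reproduction_number(growth_rate, gt_mean, gt_sd):
    """Reproduction number implied by a daily exponential growth rate.

    Assumes a gamma-distributed generation time; valid for growth_rate > -1 / scale.
    """
    shape, scale = gamma_shape_scale(gt_mean, gt_sd)
    return (1.0 + growth_rate * scale) ** shape


# Hypothetical illustration: the same observed growth rate of 0.2 per day
# implies a smaller reproduction number when the generation time is shorter.
r = 0.2
R_long = reproduction_number(r, gt_mean=4.0, gt_sd=2.0)   # longer generation time
R_short = reproduction_number(r, gt_mean=2.0, gt_sd=2.0)  # shorter generation time
print(R_long, R_short)
```

Because the mapping from growth rate to reproduction number depends on the generation time, two variants with different generation times can show a growth advantage that changes over time even if their relative transmissibility is fixed.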
ABSTRACT
Background: Forecasting healthcare demand is essential in epidemic settings, both to inform situational awareness and facilitate resource planning. Ideally, forecasts should be robust across time and locations. During the COVID-19 pandemic in England, an ongoing concern has been that demand for hospital care for COVID-19 patients will exceed available resources. Methods: We made weekly forecasts of daily COVID-19 hospital admissions for National Health Service (NHS) Trusts in England between August 2020 and April 2021 using three disease-agnostic forecasting models: a mean ensemble of autoregressive time series models, a linear regression model with 7-day-lagged local cases as a predictor, and a scaled convolution of local cases and a delay distribution. We compared their point and probabilistic accuracy to a mean ensemble of them all, and to a simple baseline model of no change from the last day of admissions. We measured predictive performance using the Weighted Interval Score (WIS) and considered how this changed in different scenarios (the length of the predictive horizon, the date on which the forecast was made, and by location), as well as how much admissions forecasts improved when future cases were known. Results: All models outperformed the baseline in the majority of scenarios. Forecasting accuracy varied by forecast date and location, depending on the trajectory of the outbreak, and all individual models had instances where they were the top- or bottom-ranked model. Forecasts produced by the mean ensemble were both the most accurate and most consistently accurate forecasts amongst all the models considered. Forecasting accuracy was improved when using future observed, rather than forecast, cases, especially at longer forecast horizons. Conclusions: Assuming no change in current admissions is rarely better than including at least a trend.
Using confirmed COVID-19 cases as a predictor can improve admissions forecasts in some scenarios, but this is variable and depends on the ability to make consistently good case forecasts. However, ensemble forecasts can be consistently more accurate across time and locations. Given minimal requirements on data and computation, our admissions forecasting ensemble could be used to anticipate healthcare needs in future epidemic or pandemic settings.
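Two of the simpler approaches named in the Methods, the no-change baseline and the scaled convolution of cases with a delay distribution, can be sketched as below. This is a deterministic illustration under assumptions: the function names, the delay distribution and the scaling factor are hypothetical, and the study's models were probabilistic and fitted to data.

```python
import numpy as np


def baseline_forecast(admissions, horizon):
    """No-change baseline: repeat the last observed day of admissions."""
    return np.full(horizon, float(admissions[-1]))


def convolution_admissions(cases, delay_pmf, scale):
    """Expected admissions as a scaled convolution of cases with a delay distribution.

    delay_pmf[d] is the assumed probability that an admission occurs d days after
    the case is reported; `scale` is the assumed fraction of cases admitted.
    """
    cases = np.asarray(cases, dtype=float)
    delay_pmf = np.asarray(delay_pmf, dtype=float)
    # Full convolution, truncated so that entry t only uses cases up to day t.
    expected = np.convolve(cases, delay_pmf)[: len(cases)]
    return scale * expected


# Hypothetical example: 100 cases on day 0, admitted one or two days later
# with equal probability, and 10% of cases admitted overall.
admissions = convolution_admissions([100, 0, 0, 0], delay_pmf=[0.0, 0.5, 0.5], scale=0.1)
print(admissions)
print(baseline_forecast([3, 4, 6], horizon=7))
```

Forecasting admissions beyond the last observed day with the convolution approach requires future case counts, which is where the dependence on good case forecasts noted in the Conclusions enters.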
ABSTRACT
Short-term probabilistic forecasts of the trajectory of the COVID-19 pandemic in the United States have served as a visible and important communication channel between the scientific modeling community and both the general public and decision-makers. Forecasting models provide specific, quantitative, and evaluable predictions that inform short-term decisions such as healthcare staffing needs, school closures, and allocation of medical supplies. Starting in April 2020, the US COVID-19 Forecast Hub (https://covid19forecasthub.org/) collected, disseminated, and synthesized tens of millions of specific predictions from more than 90 different academic, industry, and independent research groups. A multi-model ensemble forecast that combined predictions from dozens of different research groups every week provided the most consistently accurate probabilistic forecasts of incident deaths due to COVID-19 at the state and national level from April 2020 through October 2021. The performance of 27 individual models that submitted complete forecasts of COVID-19 deaths consistently throughout this year showed high variability in forecast skill across time, geospatial units, and forecast horizons. Two-thirds of the models evaluated showed better accuracy than a naive baseline model. Forecast accuracy degraded as models made predictions further into the future, with probabilistic error at a 20-week horizon 3-5 times larger than when predicting at a 1-week horizon. This project underscores the role that collaboration and active coordination between governmental public health agencies, academic modeling teams, and industry partners can play in developing modern modeling capabilities to support local, state, and federal response to outbreaks. Significance Statement: This paper compares the probabilistic accuracy of short-term forecasts of reported deaths due to COVID-19 during the first year and a half of the pandemic in the US.
Results show high variation in accuracy between and within stand-alone models, and more consistent accuracy from an ensemble model that combined forecasts from all eligible models. This demonstrates that an ensemble model provided a reliable and comparatively accurate means of forecasting deaths during the COVID-19 pandemic that exceeded the performance of all of the models that contributed to it. This work strengthens the evidence base for synthesizing multiple models to support public health action.
ABSTRACT
Slovakia conducted multiple rounds of population-wide rapid antigen testing for SARS-CoV-2 in late 2020, combined with a period of additional contact restrictions. Observed prevalence decreased by 58% (95% CI: 57-58%) within one week in the 45 counties that were subject to two rounds of mass testing, an estimate that remained robust when adjusting for multiple potential confounders. Adjusting for epidemic growth of 4.4% (1.1-6.9%) per day preceding the mass testing campaign, the estimated decrease in prevalence compared to a scenario of unmitigated growth was 70% (67-73%). Modelling suggests that this decrease cannot be explained solely by infection control measures, but requires the additional impact of isolation as well as quarantine of household members of those testing positive.
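The adjustment for epidemic growth can be approximated with a back-of-the-envelope calculation. The sketch assumes that, without intervention, prevalence would have kept growing exponentially at the reported daily rate for the seven days between rounds; this is a simplification for illustration and not the study's estimation method, and the function name is hypothetical.

```python
def growth_adjusted_reduction(observed_reduction, daily_growth, days):
    """Reduction in prevalence relative to a counterfactual of unmitigated growth.

    observed_reduction: observed fractional decrease (e.g. 0.58 for 58%).
    daily_growth: assumed counterfactual growth per day (e.g. 0.044 for 4.4%).
    days: time between the two prevalence measurements.
    """
    counterfactual_ratio = (1.0 + daily_growth) ** days  # growth had nothing changed
    observed_ratio = 1.0 - observed_reduction            # what was actually seen
    return 1.0 - observed_ratio / counterfactual_ratio


# Point estimates from the text: 58% observed decrease, 4.4% daily growth, one week.
adjusted = growth_adjusted_reduction(0.58, 0.044, days=7)
print(round(adjusted, 2))
```

With these point estimates the calculation gives a reduction of roughly 69%, which falls inside the reported interval of 67-73% for the growth-adjusted decrease.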
ABSTRACT
The time-varying reproduction number (Rt: the average number of secondary infections caused by each infected person) may be used to assess changes in transmission potential during an epidemic. While new infections are not usually observed directly, they can be estimated from data. However, data may be delayed and potentially biased. We investigated the sensitivity of Rt estimates to different data sources representing Covid-19 in England, and we explored how this sensitivity could track epidemic dynamics in population sub-groups. We sourced public data on test-positive cases, hospital admissions, and deaths with confirmed Covid-19 in seven regions of England over March through August 2020. We estimated Rt using a model that mapped unobserved infections to each data source. We then compared differences in Rt with the demographic and social context of surveillance data over time. Our estimates of transmission potential varied for each data source, with the relative inconsistency of estimates varying across regions and over time. Rt estimates based on hospital admissions and deaths were more spatio-temporally synchronous than estimates from all test-positives. We found these differences may be linked to biased representations of subpopulations in each data source. These included spatially clustered testing, and where outbreaks in hospitals, care homes, and young age groups reflected the link between age and severity of disease. We highlight that policy makers could better target interventions by considering the source populations of Rt estimates. Further work should clarify the best way to combine and interpret Rt estimates from different data sources based on the desired use.
ABSTRACT
Background: School closures are a well-established non-pharmaceutical intervention in the event of infectious disease outbreaks, and have been implemented in many countries across the world, including the UK, to slow down the spread of SARS-CoV-2. As governments begin to relax restrictions on public life there is a need to understand the potential impact that reopening schools may have on transmission. Methods: We used data provided by the UK Department for Education to construct a network of English schools, connected through pairs of pupils resident at the same address. We used the network to evaluate the potential for transmission between schools, and for long range propagation across the network, under different reopening scenarios. Results: Amongst the options evaluated we found that reopening only Reception, Year 1 and Year 6 (4-6 and 10-11 year olds) resulted in the lowest risk of transmission between schools, with outbreaks within a single school unlikely to result in outbreaks in adjacent schools in the network. The additional reopening of Years 10 and 12 (14-15 and 16-17 year olds) resulted in an increase in the risk of transmission between schools comparable to reopening all primary school years (4-11 year olds). However, the majority of schools presented low risk of initiating widespread transmission through the school system. Reopening all secondary school years (11-18 year olds) resulted in large potential outbreak clusters putting up to 50% of households connected to schools at risk of infection if sustained transmission within schools was possible. Conclusions: Reopening secondary school years is likely to have a greater impact on community transmission than reopening primary schools in England. Keeping transmission within schools limited is essential for reducing the risk of large outbreaks amongst school-aged children and their household members.
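The network construction described in the Methods can be illustrated with a toy sketch: schools become nodes, and two schools are linked whenever pupils from both live at the same address. Connected groups of schools then indicate how far an outbreak could potentially propagate. The data layout, function names and example households are hypothetical, and the sketch ignores the transmission probabilities that the study attached to links.

```python
from collections import defaultdict
from itertools import combinations


def school_network(households):
    """Build an undirected school network from household data.

    households: iterable of sets, each holding the ids of the schools attended
    by pupils living at one address (restricted to year groups that are open).
    """
    edges = defaultdict(set)
    for schools in households:
        for school in schools:
            edges[school]  # register the school even if it has no links
        for a, b in combinations(sorted(schools), 2):
            edges[a].add(b)
            edges[b].add(a)
    return edges


def connected_components(edges):
    """Return groups of schools that are linked directly or indirectly."""
    seen, components = set(), []
    for start in edges:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(edges[node] - component)
        seen |= component
        components.append(component)
    return components


# Hypothetical households: siblings link schools A-B and B-C; school D is isolated.
network = school_network([{"A", "B"}, {"B", "C"}, {"D"}])
clusters = sorted(connected_components(network), key=len, reverse=True)
print(clusters)
```

Restricting the household sets to the year groups open under a given reopening scenario changes which links exist, and therefore how large the potential outbreak clusters can become.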
ABSTRACT
Estimation of the effective reproductive number, Rt, is important for detecting changes in disease transmission over time. During the COVID-19 pandemic, policymakers and public health officials are using Rt to assess the effectiveness of interventions and to inform policy. However, estimation of Rt from available data presents several challenges, with critical implications for the interpretation of the course of the pandemic. The purpose of this document is to summarize these challenges, illustrate them with examples from synthetic data, and, where possible, make recommendations. For near real-time estimation of Rt, we recommend the approach of Cori et al. (2013), which uses data from before time t and empirical estimates of the distribution of time between infections. Methods that require data from after time t, such as Wallinga and Teunis (2004), are conceptually and methodologically less suited for near real-time estimation, but may be appropriate for retrospective analyses of how individuals infected at different time points contributed to spread. We advise against using methods derived from Bettencourt and Ribeiro (2008), as the resulting Rt estimates may be biased if the underlying structural assumptions are not met. Two key challenges common to all approaches are accurate specification of the generation interval and reconstruction of the time series of new infections from observations occurring long after the moment of transmission. Naive approaches for dealing with observation delays, such as subtracting delays sampled from a distribution, can introduce bias. We provide suggestions for how to mitigate this and other technical challenges and highlight open problems in Rt estimation. Author summary: The effective reproductive number, Rt, is a key epidemic parameter used to assess whether an epidemic is growing, shrinking or holding steady. Rt estimates can be used as a near real-time indicator of epidemic growth or to assess the effectiveness of interventions.
But due to delays between infection and case observation, estimating Rt in near real-time and correctly inferring the timing of changes in Rt are challenging. Here, we provide an overview of challenges and best practices for accurate, timely Rt estimation.
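The idea of estimating Rt from data before time t can be illustrated with the renewal-equation ratio that underlies approaches of this kind: new infections at time t divided by the generation-interval-weighted sum of earlier infections. The sketch is a simplified point estimate under assumptions; it omits the smoothing window and the Bayesian uncertainty quantification of the full method of Cori et al. (2013), and the function name and inputs are hypothetical.

```python
import numpy as np


def renewal_rt(infections, generation_pmf):
    """Point estimate of Rt from the renewal equation.

    infections: daily counts of new infections (assumed already reconstructed
    from delayed observations).
    generation_pmf[s - 1]: assumed probability that the generation interval
    is s days, for s = 1, 2, ...
    """
    infections = np.asarray(infections, dtype=float)
    w = np.asarray(generation_pmf, dtype=float)
    rt = np.full(len(infections), np.nan)
    for t in range(1, len(infections)):
        s_max = min(t, len(w))
        past = infections[t - s_max:t][::-1]      # I[t-1], I[t-2], ...
        pressure = float(np.dot(past, w[:s_max])) # weighted sum of past infections
        if pressure > 0:
            rt[t] = infections[t] / pressure
    return rt


# Infections doubling daily with a one-day generation interval imply Rt = 2.
rt_doubling = renewal_rt([1, 2, 4, 8], generation_pmf=[1.0])
# Constant infections imply Rt = 1 once enough history is available; the
# earliest estimates are biased because the history is truncated.
rt_flat = renewal_rt([10, 10, 10, 10, 10], generation_pmf=[0.5, 0.5])
print(rt_doubling, rt_flat)
```

The estimate depends directly on the assumed generation interval and on the reconstructed infection series, the two challenges singled out in the text.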