EpiNow2
option for secondary baseline
Adds a baseline level to the secondary model, i.e. a constant component that is not affected by the primary data stream (somewhat akin to the endemic component in hhh4). This was motivated by observations that the model currently seems quite a long way off the data in some countries (though not all) in the European Forecast Hub submissions. When inspecting some of the raw data on cases and admissions, the relationship seemed reasonably close to linear but with an intercept > 0.
Visually, the updated model connects somewhat better to recent data (if not impressively so, possibly because it is still fitting to a mixture of Delta and Omicron), but this comes at the expense of a less straightforward epidemiological interpretation of the estimated delay and scaling - happy to hear other suggestions that could improve things.
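To make the trade-off concrete, here is a rough sketch of what the baseline does. It assumes the secondary model is a scaled convolution of the primary series with a delay distribution. The notation (S_t for secondary observations, P_t for primary, \xi_\tau for the delay probabilities, \alpha for the scaling, b for the baseline, D for the maximum delay) is illustrative and not taken from the package code.

```latex
% Without baseline: secondary counts are a delayed, scaled copy of the primary series
\mathbb{E}[S_t] = \alpha \sum_{\tau = 0}^{D} \xi_\tau \, P_{t - \tau}

% With baseline: a constant b >= 0 that does not depend on the primary series
\mathbb{E}[S_t] = b + \alpha \sum_{\tau = 0}^{D} \xi_\tau \, P_{t - \tau}
```

Under this sketch, \alpha is no longer the overall ratio of secondary to primary counts once b > 0, which is why the scaling becomes harder to interpret.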
From https://github.com/sophiemeakin/ecdc-covid19-admissions/blob/main/main.R
Without baseline:
With baseline:
Tagging @sophiemeakin
This seems like a good temporary fix and the code changes seem fine - but I wonder why this is happening? Some thoughts: My first question is whether the misalignment between observed and forecast admissions is due to the fit of the secondary model, or the case forecasts. If the former - do we have an idea why this is happening, and so whether it's a temporary or permanent issue? If it's only a temporary change (e.g. change in variant distributions, reporting etc.), then we should look at how adding the baseline will affect other locations where this temporary change isn't present (e.g. the UK forecast looks better without the baseline than with it). If the latter (although I suspect it isn't, as we are using the hub-ensemble case forecast), then I personally wouldn't include the baseline in the model - the issue is with the case forecasts and we should think about how we can correct those instead.
This seems like a good temporary fix and the code changes seem fine - but I wonder why this is happening? Some thoughts: My first question is whether the misalignment between observed and forecast admissions is due to the fit of the secondary model, or the case forecasts. If the former - do we have an idea why this is happening, and so whether it's a temporary or permanent issue?
The fits of the secondary model to past data look poor so I think it's a combination of data and variants that break the underlying assumption of linear scaling over a 12-week period, even before case forecasts come in to complicate things further.
As an illustration, here are cases against 1-week shifted admissions, with linear fits where the intercept is fixed at 0 vs. left free.

Produced by adding this to the end of main.R:
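# Keep the most recent 12 weeks (assuming `week` is a Date) with observed admissions
shifted <- raw_dat %>%
  filter(week >= max(week) - 12 * 7, !is.na(adm)) %>%
  group_by(location, location_name) %>%
  # Shift admissions down one row within each location (same as dplyr::lag(adm));
  # this is a 1-week shift only if rows are weekly and ordered by week
  mutate(adm = c(NA_real_, adm[-length(adm)]))

# Blue: linear fit through the origin; red: linear fit with a free intercept
ggplot(shifted, aes(x = cases, y = adm)) +
  geom_point() +
  theme_bw() +
  facet_wrap(~ location_name, scales = "free") +
  expand_limits(y = 0) +
  scale_x_continuous("Cases", labels = scales::comma) +
  scale_y_continuous("Admissions", labels = scales::comma) +
  geom_smooth(method = lm, formula = y ~ 0 + x, colour = "blue", fill = "blue", alpha = 0.25) +
  geom_smooth(method = lm, formula = y ~ x, colour = "red", fill = "red", alpha = 0.25)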
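For anyone who wants to see the effect without the hub data, the same comparison can be sketched on simulated numbers. Everything here is made up for illustration: the case counts, the baseline of 200 admissions, the slope of 0.02 and the noise level are assumptions, not estimates from any country. It only uses base R.

```r
# Simulated example (hypothetical values, not Forecast Hub data)
set.seed(1)
cases <- seq(1000, 12000, by = 1000)  # 12 weekly case counts
# Admissions with a constant baseline of 200 plus 2% of cases, plus noise
adm <- 200 + 0.02 * cases + rnorm(length(cases), sd = 10)

fit_origin <- lm(adm ~ 0 + cases)  # intercept fixed at 0
fit_free <- lm(adm ~ cases)        # intercept left free

# Estimated coefficients for each fit
print(coef(fit_origin))
print(coef(fit_free))

# Residual standard deviation of each fit
print(sigma(fit_origin))
print(sigma(fit_free))
```

When the data really do have a positive baseline, forcing the fit through the origin has to absorb it into the slope, so the through-origin slope comes out larger than the free-intercept slope and the residuals are systematically off at both ends of the case range.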
Are you still interested in having this merged?
Possibly but it needs to be revisited and your bullet points addressed. I've converted it to a draft for now.
I'm not convinced this is still needed. Closing for now.