EpiNow2

option for secondary baseline

Open sbfnk opened this issue 3 years ago • 5 comments

Adds a baseline level to the secondary model, i.e. a constant component that is not affected by the primary data stream (somewhat akin to the endemic component in hhh4). This was motivated by observations that the model currently seems quite a long way off the data in some countries (though not all) in the European Forecast Hub submissions. When inspecting some of the raw data on cases and admissions, the relationship seemed reasonably close to linear but with an intercept greater than 0.

Visually, the updated model connects somewhat better to recent data (if not impressively so, possibly because it is still fitting to a mixture of Delta and Omicron), but it comes at the expense of a less straightforward epidemiological interpretation of the estimated delay and scaling - happy to hear other suggestions that could improve things.
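As a rough sketch of what the baseline changes (the notation here is assumed for illustration and is not taken from the code): write $P_t$ for the primary observations (cases), $S_t$ for the expected secondary observations (admissions), $\xi_\tau$ for the delay distribution, $\alpha$ for the scaling and $b \ge 0$ for the new baseline.

$$
S_t = b + \alpha \sum_{\tau \ge 0} \xi_\tau \, P_{t-\tau}
$$

Setting $b = 0$ gives the model without a baseline. With $b > 0$, $\alpha$ no longer reads as the overall fraction of cases that are admitted but only as the marginal change in admissions per additional (delayed) case, which is why the delay and scaling estimates become harder to interpret.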

From https://github.com/sophiemeakin/ecdc-covid19-admissions/blob/main/main.R

Without baseline: image

With baseline: image

sbfnk avatar Feb 01 '22 19:02 sbfnk

Tagging @sophiemeakin

sbfnk avatar Feb 01 '22 19:02 sbfnk

This seems like a good temporary fix and the code changes seem fine - but I wonder why this is happening? Some thoughts: my first question is whether the misalignment between observed and forecast admissions is due to the fit of the secondary model or to the case forecasts. If the former - do we have an idea why this is happening, and so whether it's a temporary or permanent issue? If it's only a temporary change (e.g. a change in variant distribution, reporting, etc.), then we should look at how adding the baseline will affect other locations where this temporary change isn't present (e.g. the UK forecast looks better without the baseline than with it). If the latter (although I suspect it isn't, as we are using the hub-ensemble case forecast), then I personally wouldn't include the baseline in the model - the issue is with the case forecasts and we should think about how we can correct those instead.

sophiemeakin avatar Feb 02 '22 08:02 sophiemeakin

This seems like a good temporary fix and the code changes seem fine - but I wonder why this is happening? Some thoughts: my first question is whether the misalignment between observed and forecast admissions is due to the fit of the secondary model or to the case forecasts. If the former - do we have an idea why this is happening, and so whether it's a temporary or permanent issue?

The fits of the secondary model to past data look poor so I think it's a combination of data and variants that break the underlying assumption of linear scaling over a 12-week period, even before case forecasts come in to complicate things further.

As an illustration, here are cases and admissions shifted by 1 week, with a linear fit with the intercept fixed at 0 vs. left free.

image

Generated by adding this to the end of main.R:

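library(dplyr)
library(ggplot2)

# Keep the most recent 12 weeks (assumes `week` is a Date) and shift admissions
# down one row within each location, so each row pairs cases with the
# admissions from the preceding row (assumes rows are ordered by week).
shifted <- raw_dat %>%
  filter(week >= max(week) - 12 * 7, !is.na(adm)) %>%
  group_by(location, location_name) %>%
  mutate(adm = c(NA_real_, adm[-length(adm)]))

# Scatter of cases against shifted admissions, one panel per location, with a
# through-the-origin fit (blue) and a free-intercept fit (red).
ggplot(shifted, aes(x = cases, y = adm)) +
  geom_point() +
  theme_bw() +
  facet_wrap(~ location_name, scales = "free") +
  expand_limits(y = 0) +
  scale_x_continuous("Cases", labels = scales::comma) +
  scale_y_continuous("Admissions", labels = scales::comma) +
  geom_smooth(method = lm, formula = y ~ 0 + x, colour = "blue", fill = "blue", alpha = 0.25) +
  geom_smooth(method = lm, formula = y ~ x, colour = "red", fill = "red", alpha = 0.25)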
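
The same comparison can be made numerically. The following is a minimal sketch on synthetic data only: the case counts, the baseline of 200 and the slope of 0.02 are made-up values for illustration, not Hub data. It shows how forcing the intercept to 0 distorts the fit when the true relationship has a positive intercept.

```r
# Synthetic illustration only: all numbers below are invented.
set.seed(1)
cases <- seq(1000, 13000, by = 1000)  # 13 hypothetical weekly case counts
adm <- 200 + 0.02 * cases + rnorm(length(cases), sd = 20)  # true intercept 200

fit_origin <- lm(adm ~ 0 + cases)  # line forced through the origin
fit_free <- lm(adm ~ cases)        # intercept estimated from the data

# Forcing the intercept to 0 pushes the slope up to compensate for the baseline.
coef(fit_origin)
coef(fit_free)

# Residual standard errors of the two fits, for comparison.
c(origin = sigma(fit_origin), free = sigma(fit_free))
```

In the secondary model, an inflated slope of this kind would show up as a biased scaling estimate, which is the mismatch the baseline is meant to absorb.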

sbfnk avatar Feb 02 '22 11:02 sbfnk

Are you still interested in having this merged?

seabbs avatar Aug 31 '22 09:08 seabbs

Possibly but it needs to be revisited and your bullet points addressed. I've converted it to a draft for now.

sbfnk avatar Aug 31 '22 09:08 sbfnk

I'm not convinced this is still needed. Closing for now.

sbfnk avatar Apr 28 '23 11:04 sbfnk