covid19hub
covid19hub copied to clipboard
Package: bed occupancy forecasting
Package: bed occupancy forecasting
Description
Package existing code to implement forecasting of daily bed occupancy from input on daily admissions and the distribution of duration of stay.
The package should include documentation, simple reproducible examples, unit tests.
Use cases / specs
- predict bed occupancy from existing admission data
- input dates of admissions as
Date
- input numbers of admissions as
integer
- input distribution of the length of stay (LoS) as a
distcrete
object (preferably) or as a function returninginteger
numbers along the lines ofrgeom()
- input dates of admissions as
- do the above, but with several admission trajectories (e.g. from different models)
- input on admissions as a
projections
object - input LoS as in 1.
- input on admissions as a
- outputs should be
projections
objects
Relevant related pacakges
- projections; note the current github version should be used
- distcrete
Starting point
A starting point is provided in the two following gists:
Impact
Some of the main questions for COVID19 response relate to hospital bed capacity. For this, multpile modelling groups are making forecastings of bed occupancy, e.g. our app. However there is no consistent way of doing this, leaving room for potential errors and consistency across results. Having a reliable, computer-efficient, well documented and tested way of forecasting bed occupancy will improve the quality of, and the ease for providing, these predictions.
Proposed timeline
- first viable product: as soon as possible; end of May?
- first stable version (CRAN release): less urgent, the main thing is to have this package, tested, documented, and publicly accessible
Focal point
@thibautjombart
If this is still required I've started packaging it up over at https://github.com/tjtnew/bedoc. A little hectic today but can probably get something reasonable finished by c.o.p tomorrow if it's still required.
Yes, this is still needed :) I'll remove my totally unrealistic deadline, but this is still very topical and timely. Many thanks for getting started on this!
A few questions:
- What are you trying to avoid with the sanity check
n_admissions[1] >= 1
. A user may want to illustrate the ramp up from zero cases over a particular interval. Perhaps a warning instead to highlight if they start with zero cases in case they didn't mean to? - Are you wanting both
simulate_occupancy
andbed_forecasts
to be exposed to the user? - If
simulate_occupancy
is supposed to be exposed to the user then the code restriction to only predictions dates <= the maximum provided admission date is a little confusing. The reason I say this is that "use case 1", "predict bed occupancy from existing admission data" would seem to make this redundant otherwise. Should we instead allow the user to specify the end date (but default to the maximum admission date). This point can be ignored ifsimulate_occupancy
is only an internal function to be used only bybed_forecasts
. Hope these q make sense!
I've thought some more in regards to the second and third bullets above. I think it would make sense to have simulate_occupancy
as an internal function and only export bed_forecasts
. The output of simulate_occupancy
can be recovered from bed_forecasts
by only inputting a single projection so I think this is the cleaner option. We can then also move the sanity checks in to bed_forecasts
and just test that functionality. If you agree I'll modify later and then start on a more detailed use case for a vignette.
Sorry about the slow response on this.
- What are you trying to avoid with the sanity check
n_admissions[1] >= 1
. A user may want to illustrate the ramp up from zero cases over a particular interval. Perhaps a warning instead to highlight if they start with zero cases in case they didn't mean to?
That's a mistake, having it's origin in a technicality, from memory: the function was throwing an error if there was a day of zero admissions due to using rep
(maybe?). At any rate, it should be all(n_admissions >= 1)
rather than the current test. Note that the corner case where all data in n_admissions
is 0 may have to be handled separately: in this case the model will not predict a single bed.
- Are you wanting both
simulate_occupancy
andbed_forecasts
to be exposed to the user?
I named these poorly. I think ideally this should perhaps be a generic with specific S3 methods for different classes of objects? And to be consistent with projections, maybe we could settle for project_beds()
. What do you think?
- If
simulate_occupancy
is supposed to be exposed to the user then the code restriction to only predictions dates <= the maximum provided admission date is a little confusing. The reason I say this is that "use case 1", "predict bed occupancy from existing admission data" would seem to make this redundant otherwise. Should we instead allow the user to specify the end date (but default to the maximum admission date).
Yes (to the last suggestion).
This point can be ignored if
simulate_occupancy
is only an internal function to be used only bybed_forecasts
. Hope these q make sense!
It does. Thank you so much for your help - very appreciated!
I've thought some more in regards to the second and third bullets above. I think it would make sense to have
simulate_occupancy
as an internal function and only exportbed_forecasts
. The output ofsimulate_occupancy
can be recovered frombed_forecasts
by only inputting a single projection so I think this is the cleaner option. We can then also move the sanity checks in tobed_forecasts
and just test that functionality. If you agree I'll modify later and then start on a more detailed use case for a vignette.
OK, it would make sense, but maybe a middle group, using S3 generic / method, would be to provid a method for incidence
objects too, so that users can:
- forecast bed occupancy from existing admission data (stored as
incidence
) - forecast bed occupancy from projected admissions (stored as
projections
)
An easy way to go about this would be adding a as_projections.incidence
method (or as(..., "projections")
) in projections
to convert incidence
objects to projections
. This is what is done effectively at:
https://gist.github.com/thibautjombart/7dacaee1e1ec3e5bf27e522d9312be76#file-simulate_occupancy-r-L87-L89
OK, it would make sense, but maybe a middle group, using S3 generic / method, would be to provid a method for
incidence
objects too, so that users can:* forecast bed occupancy from existing admission data (stored as `incidence`) * forecast bed occupancy from projected admissions (stored as `projections`)
That sounds good.
In regards to a vignette is any of the data/projections used in your paper available in the open? I'm guessing not, but it would make a nice example if they were.
Unfortunately no, no publicly available data for this specific example. But we can try to make a decent example using realistic epi parameters. I paste below code for a length-of-stay distribution publicly available, with some random admissions as incidence
:
## Data from across 23 countries with “many of the included cases from the
## UK”. 290 patients, of which 163 with recorded outcomes. 52 admitted to
## ICU. Data collected on patients with data collection commenced on or before
## 13 March 2020. Report issued 27 March 2020.
## https://media.tghn.org/medialibrary/2020/03/ISARIC_Data_Platform_COVID-19_Report_27MAR20.zip
## Mean hospital LOS to death or discharge (n=163) 5.3 days (SD 4.25)
## Roughly: median 5 days, QIR: 3-7
los_isaric <- distcrete::distcrete("weibull", shape = 2, scale = 6, w = 0.5, interval = 1)
## make toy admission data
library(incidence)
library(magrittr)
admissions <- sample(Sys.Date() - c(0,7), 40, replace = TRUE) %>%
incidence()
## project bed occupancy for current admissions
## project future admissions
[...]
## project bed occupancy for current + future admissions
[...]
Note that the missing bit, beyond the bed occupancy part, is a serial interval distribution (si
in projections::project
). This can be obtained by:
- getting published parameters e.g. from MIDAS
- converting, if needed, these parameters to build a distribution (things like
epitrix::gamma_mucv2shapescale
may help to convert parameters specified in the literature into something we can build a gamma with) - using
distcrete
to build a discretised distribution
I will ask what serial interval is currently used for UK data.
project_beds
is now a generic with methods for projections
and incidence
objects. I've changed the code a bit from your initial gists so it's probably worth you kicking the tyres a bit.
I've included a few simple integrations tests and we have good coverage save for the sanity checks which we could easily check although perhaps a little unnecessary.
CI is set up with GitHub Actions which checks the build on mac (release + devel), windows (release) and linux (release). It also builds the basic pkgdown site and integrates coverage with codecov. Currently these run as separate jobs so this could be streamlined somewhat.
From your last comment I take it you want to implement a predict_admissions
function. Is it worth me starting with a simple exponential growth method to begin with (the doubling approach in the app) and we can build on this going forward.
Can we move future discussion over to https://github.com/tjtnew/bedoc/issues as it will be good to keep the discussion tied to the repo. On a similar note let me know if you want to shift the repo over to Reconhub and for us continue development there.
Have a good weekend.
Impressive! Great, I will take the conversation there.
Package now at https://github.com/reconhub/occupancy