Add caching for HTTP resources
As @sbfnk mentioned, it would be useful (and polite) to cache surveys downloaded from Zenodo instead of re-downloading them each time the code is re-run.
The simplest option is probably to use memoise, but there are also other tools specific to HTTP resources (e.g., https://github.com/sckott/webmiddens), so I'm open to discussion / suggestions.
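For illustration, a minimal sketch of what the memoise route could look like from the user side, assuming `get_survey()` is the function doing the download; the wrapper name and cache path are made up for this example, and `cache_filesystem()` makes the cache persist across R sessions:

```r
library(memoise)
library(socialmixr)

# Sketch only: memoise get_survey() so repeated calls with the same DOI are
# served from a local cache instead of hitting Zenodo again. The cache
# directory is an arbitrary choice for this example.
cached_get_survey <- memoise(
  get_survey,
  cache = cache_filesystem("~/.cache/socialmixr")
)

# first call downloads; later calls (including in new R sessions) reuse the cache
peru_survey <- cached_get_survey("https://doi.org/10.5281/zenodo.1095664")
```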
I now wonder if this is worth the overhead given that one can just do (from the vignette)
```r
peru_survey <- get_survey("https://doi.org/10.5281/zenodo.1095664")
saveRDS(peru_survey, "peru.rds")
```
and later
```r
peru_survey <- readRDS("peru.rds")
```
or alternatively via #61
```r
peru_files <- download_survey("https://doi.org/10.5281/zenodo.1095664", dir = "Surveys")
peru_survey <- load_survey(peru_files)
saveRDS(peru_files, file.path("Surveys", "peru_files.rds"))
```
and later
```r
peru_files <- readRDS(file.path("Surveys", "peru_files.rds"))
peru_survey <- load_survey(peru_files)
```
which also enables inspection/use of the raw csv files in "Surveys".
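To save some boilerplate, this "manual cache" pattern could also be wrapped in a small user-side helper; this is just a sketch, and the function name is made up:

```r
# Sketch of a user-side helper: reuse a saved copy if it exists,
# otherwise download the survey and save it for next time
get_survey_cached <- function(url, rds) {
  if (file.exists(rds)) {
    return(readRDS(rds))
  }
  survey <- get_survey(url)
  saveRDS(survey, rds)
  survey
}

peru_survey <- get_survey_cached(
  "https://doi.org/10.5281/zenodo.1095664",
  rds = "peru.rds"
)
```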
I think it depends on the position you're taking:
- from the user point of view, you're entirely right: they could "manually cache" the results if they wish.
- from the server point of view (zenodo.org), we're hitting them with unnecessary requests to get the same result over and over. It would be more polite to cache repeated requests (a rough sketch of what that could look like at the package level follows below). I believe this is especially important in this case because we're using web scraping rather than an official API, which would usually be better set up to handle automated requests.
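For concreteness, here is a rough sketch of what package-level caching could look like, assuming downloaded files are kept in a per-user cache directory keyed by the survey DOI/URL; all names here are hypothetical (nothing like this is implemented), and it builds on `download_survey()` / `load_survey()` from #61:

```r
# Hypothetical sketch of package-level caching of downloaded survey files.
# Assumes download_survey(survey, dir = ...) returns the downloaded file paths.
cached_download_survey <- function(survey) {
  cache_dir <- tools::R_user_dir("socialmixr", which = "cache")  # needs R >= 4.0
  # one sub-directory per survey, derived from its DOI/URL
  key <- gsub("[^[:alnum:]]+", "_", survey)
  survey_dir <- file.path(cache_dir, key)
  index <- file.path(survey_dir, "files.rds")
  if (file.exists(index)) {
    return(readRDS(index))  # files already cached: no request to zenodo.org
  }
  dir.create(survey_dir, recursive = TRUE, showWarnings = FALSE)
  files <- download_survey(survey, dir = survey_dir)
  saveRDS(files, index)
  files
}

# repeated calls only hit Zenodo the first time
peru_files <- cached_download_survey("https://doi.org/10.5281/zenodo.1095664")
peru_survey <- load_survey(peru_files)
```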