knitr
customizable cache (closes #2176)
This PR allows implementing `knit_cache_hook` methods, which may preprocess objects (e.g., save them to an external file) and define custom loaders.
I will add a NEWS item after we agree on the design.
- refactor(cache): use `saveRDS`/`readRDS` instead of `makeLazyLoadDB`/`lazyLoad`
  - For migration, `cache_save` replaces rdb/rdx files with an rds file
  - For backward compatibility, `cache_load()` attempts `lazyLoad()` if rdb/rdx files are available
- ~~feat(cache): allow pre/postprocessing cache objects~~
  - `knit_cache_preprocess` preprocesses objects being saved
  - `knit_cache_postprocess` postprocesses objects being loaded
- feat!(cache): implement `knit_cache_hook` instead of pre/post-processors
  - Call `knit_cache_hook` methods on saving cache
    - Methods may save extra files under the `${cache_path(h)}__extra` directory
    - Methods may return custom loader functions, which are saved to `${cache_path(h).rds}`
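As a rough illustration of the first commit in the list above, the rds-based save and the backward-compatible load could look like the sketch below. The names `sketch_cache_save` and `sketch_cache_load` are hypothetical stand-ins for illustration only; they are not knitr's actual `cache_save`/`cache_load` internals, whose signatures may differ.

```r
# Hypothetical helpers; names and signatures are illustrative, not knitr's API.
sketch_cache_save <- function(objects, path) {
  # Migration: drop any legacy lazy-load database before writing the rds file
  unlink(paste0(path, c(".rdb", ".rdx")))
  saveRDS(objects, paste0(path, ".rds"))
}

sketch_cache_load <- function(path, envir = parent.frame()) {
  rds <- paste0(path, ".rds")
  if (file.exists(rds)) {
    # New format: a named list of objects stored in a single rds file
    list2env(readRDS(rds), envir = envir)
  } else if (all(file.exists(paste0(path, c(".rdb", ".rdx"))))) {
    # Backward compatibility: fall back to the legacy lazy-load database
    lazyLoad(path, envir = envir)
  }
  invisible(envir)
}

# Usage: round-trip a named list of objects through a temporary cache path
path <- tempfile("chunk_")
sketch_cache_save(list(x = 1:3, y = "foo"), path)
e <- new.env()
sketch_cache_load(path, envir = e)
```

Checking for the rds file first means caches written by the new code never touch the legacy branch, while caches created by an older version still load until they are next saved.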
With this PR, we can add hooks for objects to be cached. For example, we can use `writeLines` to save character objects.
```{r}
library(knitr)

registerS3method(
  "knit_cache_hook",
  "character",
  function(x, nm, path) {
    # Cache x as is if it extends the character class
    if (!identical(class(x), "character")) {
      return(x)
    }

    # Preprocess data (e.g., save data to an external file).
    # Create external files under the directory `paste0(path, "__extra")`
    # if knitr should clean them up on refreshing/cleaning the cache.
    d <- paste0(path, "__extra")
    dir.create(d, showWarnings = FALSE, recursive = TRUE)
    f <- file.path(d, paste0(nm, ".txt"))
    writeLines(x, f)

    # Return a loader function, which accepts an ellipsis for future
    # extensions and has the knit_cache_loader class
    structure(function(...) readLines(f), class = "knit_cache_loader")
  },
  envir = asNamespace("knitr")
)
```
```{r, cache=TRUE}
x <- 'foo bar'
print(x)
```
```{r}
print(x)
```
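To make the mechanics behind the example above more concrete, here is a minimal, self-contained sketch of how a hook generic, its saved loaders, and cache cleanup might fit together. Everything prefixed with `sketch_` is a hypothetical stand-in and does not reflect knitr's real internals; only the `knit_cache_loader` class name and the `__extra` directory convention are taken from the description.

```r
# Hypothetical stand-ins; knitr's real internals may differ.
sketch_hook <- function(x, nm, path) UseMethod("sketch_hook")

# Default: cache the object as is
sketch_hook.default <- function(x, nm, path) x

# Character vectors are written to a text file and replaced by a loader
sketch_hook.character <- function(x, nm, path) {
  d <- paste0(path, "__extra")
  dir.create(d, showWarnings = FALSE, recursive = TRUE)
  f <- file.path(d, paste0(nm, ".txt"))
  writeLines(x, f)
  structure(function(...) readLines(f), class = "knit_cache_loader")
}

# On save: run the hook on every object, then store the results in one rds file
sketch_save <- function(objects, path) {
  hooked <- Map(function(x, nm) sketch_hook(x, nm, path), objects, names(objects))
  saveRDS(hooked, paste0(path, ".rds"))
}

# On load: call loaders to restore their objects, return everything else as is
sketch_load <- function(path) {
  lapply(readRDS(paste0(path, ".rds")), function(obj) {
    if (inherits(obj, "knit_cache_loader")) obj() else obj
  })
}

# On refresh/clean: remove the rds file together with the extra directory
sketch_clean <- function(path) {
  unlink(paste0(path, c(".rds", "__extra")), recursive = TRUE)
}

# Usage
path <- tempfile("chunk_")
sketch_save(list(x = c("foo", "bar"), n = 42), path)
restored <- sketch_load(path)
```

One point worth keeping in mind with this design: a loader is a closure, and `saveRDS` serializes the closure's enclosing environment along with it, so a real hook would probably want to keep that environment small.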
maybe preprocess and postprocess are not good names... :thinking:
I have to fix the tests.
To solve the above problems, I implemented the `knit_cache_hook` generic function in place of `knit_cache_preprocess` and `knit_cache_postprocess`. See the updated description for details.
Thanks for the comment. I do not have a strong opinion, but let me leave some comments below.
I accepted the complexity for the following reasons:
- the feature is mainly for package developers and not for end users
- usage is limited (I guess)
With my implementation, users do not have to care about what happens under the hood when caches are saved or loaded.
For developers, I agree that a chunk option is a good idea, as the implementation becomes simpler. However, it requires end users to understand tricks for edge cases. Can we expect end users to read the documentation carefully before they run into trouble with cache behavior?
Good points, and I agree. Let me think more about it. Thanks!