Use renv cache if available
I'm not entirely sure what this involves, but if you use pak inside a renv-using project, it would be nice if the packages were installed into the renv global cache and then symlinked into the project using the renv system. (So that pak::pak() would be equivalent to install.packages() inside renv projects.)
Yeah, I am not sure what this involves, either. Maybe some lower level API call in renv that lets us put a package in the cache? @kevinushey?
If I understand correctly, there are really two things that we want:
- pak should have a way of using the global renv cache as a source / shortcut when installing a requested package;
- renv should give pak an API for copying packages installed in the current library into the global cache.
renv needs the package DESCRIPTION file in order to figure out the cache key; that's usually straightforward for packages that are current on CRAN, or for packages on GitHub. That becomes more challenging for packages from the CRAN archive though. That said, for the first option I think we want something like:
renv:::renv_cache_find(<description>)
and if that path exists, pak could use that package for installation rather than downloading and installing it itself.
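The lookup described above could look roughly like the following. This is only a sketch in base R, not renv's implementation: cache_find() and its arguments are hypothetical names, and the <root>/<package>/<version>/<hash>/<package> layout is an assumption taken from the renv_cache_list() output shown later in this thread.

```r
# Hypothetical helper: return the cached package directory, or NULL on a miss.
# Assumed layout: <root>/<package>/<version>/<hash>/<package>
cache_find <- function(root, package, version, hash) {
  path <- file.path(root, package, version, hash, package)
  if (dir.exists(path)) path else NULL
}

# Simulate a cache containing a single entry in a temporary directory.
root <- tempfile("cache-")
entry <- file.path(root, "rlang", "0.4.10", "599df23c40a4fce9c7b4764f28c37857", "rlang")
dir.create(entry, recursive = TRUE)

# Same package and hash, but only one of the two versions is in the cache.
hit <- cache_find(root, "rlang", "0.4.10", "599df23c40a4fce9c7b4764f28c37857")
miss <- cache_find(root, "rlang", "0.4.11", "599df23c40a4fce9c7b4764f28c37857")
```

On a hit, an installer could link or copy from the returned directory; on a miss (NULL) it would fall back to its usual download and install.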
For option 2, renv could have a function like:
renv:::renv_cache_synchronize(library, packages)
to copy some set of packages from the requested library to the cache.
I think it can be simpler. E.g.
- have a function that returns the hash of a package from its description (if that's all you use for hashing), and
- have a function that returns the location of the cache.
> have a function that returns the hash of a package from its description (if that's all you use for hashing), and
renv:::renv_hash_description(<description path>)
> have a function that returns the location of the cache.
renv:::renv_cache_path(<description path>)
I don't think I want to expose these as exported renv functions, but perhaps there's some middle ground (e.g. as R functions; renv.hash.function and renv.cache.path or something?)
Do you have an opinion on what the right contract between pak and renv is here?
> Do you have an opinion on what the right contract between pak and renv is here?
IDK, I would have to experiment with this a bit.
Btw. pak modifies DESCRIPTION after installation, so the hash of the installed package will be different. Is that OK?
To use the renv cache as a source, maybe it is better to query the root of the cache for the current platform and R version? And have some convention about enumerating the packages in the cache.
> Btw. pak modifies DESCRIPTION after installation, so the hash of the installed package will be different. Is that OK?
This is probably okay, depending on what changes pak makes. renv uses a subset of the DESCRIPTION fields when building the hash. The implementation is relatively small and lives here:
https://github.com/rstudio/renv/blob/e6aff9f2dc847a80c8c9b6a666bb3f3825fc7c4d/R/hash.R#L10-L68
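To illustrate why edits to DESCRIPTION need not change the cache key, here is a minimal base R sketch of hashing only a subset of fields. It is not renv's algorithm: hash_description(), the field list, and the InstalledBy field are all hypothetical, chosen only to show the idea. The real field list is in the file linked above.

```r
# Hypothetical subset of fields; renv's actual list lives in its R/hash.R.
hash_fields <- c("Package", "Version", "Depends", "Imports", "LinkingTo")

hash_description <- function(path, fields = hash_fields) {
  dcf <- read.dcf(path)
  keep <- intersect(fields, colnames(dcf))
  # Serialise only the selected fields, in a fixed order, then hash that text.
  text <- paste0(keep, ": ", dcf[1, keep], collapse = "\n")
  tmp <- tempfile()
  on.exit(unlink(tmp))
  writeLines(text, tmp)
  unname(tools::md5sum(tmp))
}

desc <- tempfile()
writeLines(c("Package: demo", "Version: 1.0.0", "Imports: rlang"), desc)
before <- hash_description(desc)

# Simulate an installer appending a field that is outside the hashed subset.
cat("InstalledBy: some-tool\n", file = desc, append = TRUE)
after <- hash_description(desc)

# A change to a hashed field does alter the result.
writeLines(c("Package: demo", "Version: 1.0.1", "Imports: rlang"), desc)
bumped <- hash_description(desc)
```

Under these assumptions, before and after are identical while bumped differs, so whether pak's changes matter depends only on whether they touch a hashed field.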
> To use the renv cache as a source, maybe it is better to query the root of the cache for the current platform and R version? And have some convention about enumerating the packages in the cache.
renv_cache_list() might be useful; e.g.
> renv:::renv_cache_list(packages = "rlang")
[1] "/Users/kevinushey/Library/Application Support/renv/cache/v5/macos/R-4.0/x86_64-apple-darwin17.0/rlang/0.4.10.9000/0624dce817c45fb4539360b206afd1e6/rlang"
[2] "/Users/kevinushey/Library/Application Support/renv/cache/v5/macos/R-4.0/x86_64-apple-darwin17.0/rlang/0.4.10.9000/5e85d0584690ab1a57900ec84ff1f3a6/rlang"
[3] "/Users/kevinushey/Library/Application Support/renv/cache/v5/macos/R-4.0/x86_64-apple-darwin17.0/rlang/0.4.10/599df23c40a4fce9c7b4764f28c37857/rlang"
[4] "/Users/kevinushey/Library/Application Support/renv/cache/v5/macos/R-4.0/x86_64-apple-darwin17.0/rlang/0.4.6/aa263e3ce17b177c49e0daade2ee3cdc/rlang"
[5] "/Users/kevinushey/Library/Application Support/renv/cache/v5/macos/R-4.0/x86_64-apple-darwin17.0/rlang/0.4.7/c06d2a6887f4b414f8e927afd9ee976a/rlang"
[6] "/Users/kevinushey/Library/Application Support/renv/cache/v5/macos/R-4.0/x86_64-apple-darwin17.0/rlang/0.4.8/843a6af51414bce7f8a8e372f11d6cd0/rlang"
[7] "/Users/kevinushey/Library/Application Support/renv/cache/v5/macos/R-4.0/x86_64-apple-darwin17.0/rlang/0.4.9/9d7aba7bed9a79e2403b4777428a2b12/rlang"
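If pak were to enumerate the cache from paths like these, one possible convention is to read the package, version and hash from the trailing path components. The sketch below uses base R only; parse_cache_path() is a hypothetical helper, and it assumes every path ends in <package>/<version>/<hash>/<package> as in the listing above.

```r
# Hypothetical helper: split cache paths into package, version and hash.
parse_cache_path <- function(paths) {
  parts <- strsplit(paths, "/", fixed = TRUE)
  # Keep the last four components: <package>/<version>/<hash>/<package>.
  tail4 <- lapply(parts, function(p) utils::tail(p, 4))
  data.frame(
    package = vapply(tail4, `[`, character(1), 1),
    version = vapply(tail4, `[`, character(1), 2),
    hash    = vapply(tail4, `[`, character(1), 3),
    stringsAsFactors = FALSE
  )
}

# Two of the paths from the listing above.
paths <- c(
  "/Users/kevinushey/Library/Application Support/renv/cache/v5/macos/R-4.0/x86_64-apple-darwin17.0/rlang/0.4.10/599df23c40a4fce9c7b4764f28c37857/rlang",
  "/Users/kevinushey/Library/Application Support/renv/cache/v5/macos/R-4.0/x86_64-apple-darwin17.0/rlang/0.4.6/aa263e3ce17b177c49e0daade2ee3cdc/rlang"
)
entries <- parse_cache_path(paths)
```

The resulting data frame has one row per cached copy, which would make it easy to pick the entry matching a requested version.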
Came across this issue. pak is nice because it is much faster than renv's own install functions.
When working in an renv-activated project, with renv.config.pak.enabled=TRUE, pak 0.2.1 downloads source code into its own cache and installs the binary into the renv project.
The former is OK because I don't really mind having a source code cache for pak that is separate from the renv source code cache. The latter is a problem because I don't want package binaries replicated in each renv project. I want them installed in the renv binary cache and linked into the project.
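I fixed this problem by writing a function that wraps renv::install(), followed by:
function() {
  # Record the current library state in the lockfile without prompting
  renv::snapshot(prompt = FALSE)
  lib <- renv::paths$library()
  # Read the package records back from the lockfile in the project root
  lock <- renv:::renv_lockfile_load(".")
  packages <- lock$Packages
  # Copy each package into the renv cache and link it back into the project
  invisible(lapply(
    X = packages,
    FUN = function(x) renv:::renv_cache_synchronize(record = x, linkable = TRUE)
  ))
}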
This goes through everything pak just installed, copies it to the renv cache and links it back into the project. Problem solved.
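For readers unfamiliar with what "copy to the cache and link back" means on disk, here is a toy base R sketch of the idea. It does not call renv and is not how renv implements it: cache_and_link() is a hypothetical helper, the cache is keyed by package name only (the real cache is also keyed by version and hash), and it assumes a platform where file.symlink() works.

```r
# Hypothetical helper: move an installed package into a cache directory and
# leave a symlink to it in the project library.
cache_and_link <- function(library, cache, package) {
  installed <- file.path(library, package)
  cached <- file.path(cache, package)
  # Already a symlink: nothing to do.
  if (nzchar(Sys.readlink(installed))) return(invisible(TRUE))
  if (!dir.exists(cached)) {
    dir.create(cache, recursive = TRUE, showWarnings = FALSE)
    # Copy the whole package directory into the cache.
    file.copy(installed, cache, recursive = TRUE)
  }
  # Replace the library copy with a link to the cached copy.
  unlink(installed, recursive = TRUE)
  file.symlink(cached, installed)
}

# Simulate a project library with one installed package.
library_dir <- tempfile("library-")
cache_dir <- tempfile("cache-")
dir.create(file.path(library_dir, "demo"), recursive = TRUE)
writeLines("Package: demo", file.path(library_dir, "demo", "DESCRIPTION"))

ok <- cache_and_link(library_dir, cache_dir, "demo")
```

After the call, the files live once under cache_dir and the entry in library_dir is only a link, which is the property wanted here so that binaries are not replicated in each project.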
Could be useful for renv::install().