
How to manually add a binary package to the pkg cache so that pak installs it

Open kforner opened this issue 2 years ago • 3 comments

First, let me explain my use case: I repeatedly install many packages (from a package list) when building Docker images. This list may change a bit over time.

To save some time, energy and CO2, I was thinking of doing something like:

  • download the full dependency tree of my package list into a pkgcache
  • build binary packages (once and for all) from the source packages (for BioC packages, since I'm using RSPM binary packages for CRAN)
  • put those binary packages in the pkgcache in a way that they are seen and preferred by pak for installation
  • I will persist the pkgcache, e.g. on S3
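The "build binary packages" step above could be sketched roughly as follows, using `R CMD INSTALL --build`, which installs a source package and also writes a binary archive of it. The helper name `build_binary()` and the scratch library are hypothetical, the package's dependencies are assumed to be installed already, and nothing here touches pak's cache.

```r
# Hypothetical helper: build a binary archive from a source package
# (a source tarball or an unpacked source directory).
build_binary <- function(src, out_dir) {
  src <- normalizePath(src, mustWork = TRUE)
  dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)
  out_dir <- normalizePath(out_dir)

  # Install into a throwaway library so the real library is untouched.
  scratch_lib <- tempfile("scratch-lib-")
  dir.create(scratch_lib)
  on.exit(unlink(scratch_lib, recursive = TRUE), add = TRUE)

  # R CMD INSTALL --build writes the binary archive to the working directory.
  before <- list.files(out_dir)
  old <- setwd(out_dir)
  on.exit(setwd(old), add = TRUE)

  status <- system2(
    file.path(R.home("bin"), "R"),
    c("CMD", "INSTALL", "--build",
      paste0("--library=", shQuote(scratch_lib)),
      shQuote(src))
  )
  if (status != 0) stop("R CMD INSTALL --build failed for ", src)

  # Return the path(s) of the newly created archive(s).
  file.path(out_dir, setdiff(list.files(out_dir), before))
}
```

The resulting file name is platform-specific (on Linux it typically ends in something like `_R_x86_64-pc-linux-gnu.tar.gz`), so the archive is only usable on a matching platform and R version.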

The problem is that I do not know:

  • how to add the binary packages to the cache
  • how to distinguish source packages from binary packages in the cache: cache_list() reports binary packages with platform=='source'

Does this make sense? And if so, how do I add my binary packages?

Thanks

kforner avatar Feb 04 '22 17:02 kforner

One thought, regarding "I will persist the pkgcache, e.g. on S3":

If you're really building binaries that will be used in consistent Docker containers with consistent sets of packages, then you could skip all that and instead just tar up the resulting library directory; subsequent containers then only need to download that tarball and unpack it.

The "elegance" of the binary availability is due to its extensibility, its ability to combine with other repos, to grab more things over time, etc. If you really just want to get the exact same set of packages each time (and are willing to upload an updated archive to the remote), you can basically skip the intermediate step.
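A minimal sketch of the "tar up the library directory" idea, using only base R's `utils::tar()` and `utils::untar()`. The helper names `snapshot_library()` and `restore_library()` are hypothetical, and uploading to or downloading from S3 is left out.

```r
# Hypothetical helper: archive every installed package folder in `lib`.
snapshot_library <- function(lib, tarfile) {
  # Make the archive path absolute before changing directory.
  tarfile <- file.path(normalizePath(dirname(tarfile)), basename(tarfile))
  old <- setwd(lib)
  on.exit(setwd(old), add = TRUE)
  # Paths inside the archive are relative to the library directory.
  utils::tar(tarfile, files = list.files(), compression = "gzip")
  invisible(tarfile)
}

# Hypothetical helper: unpack the archive into a (possibly new) library.
restore_library <- function(tarfile, lib) {
  dir.create(lib, recursive = TRUE, showWarnings = FALSE)
  utils::untar(tarfile, exdir = lib)
  invisible(lib)
}
```

In a later container, something like `restore_library("library.tar.gz", "/opt/r-lib")` followed by `.libPaths("/opt/r-lib")` would make the packages available, assuming the same platform and R version as the one that produced the archive (both paths are placeholders).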

dpastoor avatar Feb 04 '22 17:02 dpastoor

@dpastoor Thanks for this simple and efficient idea. I had a similar implementation using a local Docker build RUN-level cache, which I would then sync to the final R lib. That allowed me to handle installation errors.

Having a repository of binary packages would also:

  • allow missing binary packages to be handled gracefully (pak would then fetch them online)
  • provide a superset of packages that users could install later in an efficient way
  • rely on binary tarballs which, even if platform-specific, I suppose are more stable over time than an installed package folder

Anyway, I'm still interested in knowing how to add a binary package to pak's pkgcache in such a way that it is recognized and used in preference to the source package, if any.

kforner avatar Feb 04 '22 17:02 kforner

While this can be useful, maybe for your actual use case you are better off putting the packages you build in a CRAN-like file:/// repository?
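As a rough sketch of what such a repository could look like, the following uses base R's `tools::write_PACKAGES()` to index tarballs placed in a `src/contrib` directory and returns a `file:///` URL. The helper name `init_local_repo()` and the directory layout are assumptions for illustration; whether pak treats tarballs served this way as binaries is not confirmed in this thread.

```r
# Hypothetical helper: create a CRAN-like repository on disk and index it.
init_local_repo <- function(repo_dir, tarballs = character()) {
  contrib <- file.path(repo_dir, "src", "contrib")
  dir.create(contrib, recursive = TRUE, showWarnings = FALSE)

  # Copy the package tarballs into the repository.
  file.copy(tarballs, contrib, overwrite = TRUE)

  # Write the PACKAGES index files, but only if there is something to index.
  if (length(list.files(contrib, pattern = "\\.tar\\.gz$")) > 0) {
    tools::write_PACKAGES(contrib, type = "source")
  }

  # Build a file:/// URL that also works for Windows drive-letter paths.
  path <- normalizePath(repo_dir, winslash = "/", mustWork = TRUE)
  if (!startsWith(path, "/")) path <- paste0("/", path)
  paste0("file://", path)
}
```

The returned URL can then be put in front of the usual repositories, for example `options(repos = c(local = url, getOption("repos")))`, so that packages missing from the local repository are still fetched online.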

gaborcsardi avatar Feb 04 '22 20:02 gaborcsardi

Yes, I think a local repository would be better for this use case, so I am going to close this issue.

gaborcsardi avatar Nov 01 '23 13:11 gaborcsardi