
How to do fine-grained caching: bulk APIs?

Open huonw opened this issue 2 years ago • 8 comments

We're trying to turbo-charge our builds via fine-grained caching with the Pants build system. Pants recently gained experimental support for using the GitHub Actions Cache as a fine-grained "remote cache". The goal is to get the benefits discussed in https://dev.to/benjyw/better-cicd-caching-with-new-gen-build-systems-3aem: reusing test runs and build artefacts from previous runs, while downloading only what's required.

However, we find it doesn't work well in practice for us, even on a moderately sized repository, because fine-grained caching quickly hits rate limits: thousands of small "files" have to be uploaded and/or downloaded via individual requests (see https://github.com/pantsbuild/pants/issues/20133).

Are there any bulk APIs or other recommendations for how to best do the following:

  1. Check whether several cache entries exist
  2. Upload several new cache entries
  3. Download several cache entries

Alternatively, some other way to use the cache for many small requests.
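
To make the three operations above concrete, here is a rough sketch of the shape a bulk interface could take, compared with per-entry calls. Everything in it is hypothetical: `InMemoryCache`, `contains_many`, `put_many`, and `get_many` are illustrative names, not an existing GitHub Actions Cache API, and the request counter only models round trips rather than real rate limits.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class InMemoryCache:
    """Hypothetical stand-in for a cache service that counts round trips."""

    entries: dict[str, bytes] = field(default_factory=dict)
    requests: int = 0

    # Per-entry operations: one round trip per key.
    def contains(self, key: str) -> bool:
        self.requests += 1
        return key in self.entries

    def put(self, key: str, blob: bytes) -> None:
        self.requests += 1
        self.entries[key] = blob

    # Hypothetical bulk operations: one round trip for many keys.
    def contains_many(self, keys: list[str]) -> set[str]:
        self.requests += 1
        return {k for k in keys if k in self.entries}

    def put_many(self, blobs: dict[str, bytes]) -> None:
        self.requests += 1
        self.entries.update(blobs)

    def get_many(self, keys: list[str]) -> dict[str, bytes]:
        self.requests += 1
        return {k: self.entries[k] for k in keys if k in self.entries}


def compare(n: int) -> tuple[int, int]:
    """Return (per-entry requests, bulk requests) for storing n new blobs."""
    blobs = {f"digest-{i}": b"x" for i in range(n)}

    # Per-entry: check each key, then upload each missing blob.
    per_entry = InMemoryCache()
    for key, blob in blobs.items():
        if not per_entry.contains(key):
            per_entry.put(key, blob)

    # Bulk: one existence check, then one upload of everything missing.
    bulk = InMemoryCache()
    missing = set(blobs) - bulk.contains_many(list(blobs))
    bulk.put_many({k: blobs[k] for k in missing})

    return per_entry.requests, bulk.requests


if __name__ == "__main__":
    print(compare(1000))  # (2000, 2)
```

In this toy model, storing 1000 new entries costs 2000 round trips with per-entry calls and 2 with the bulk calls, which is the kind of reduction that would keep fine-grained clients under the rate limits.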

This might benefit more than just Pants, e.g. https://github.com/mozilla/sccache also has a GHA cache backend, but hits some errors like this (https://github.com/mozilla/sccache/issues/1485).

(I asked this question of support (#2409822), and they told me to ask here instead, even though it's not directly related to the code in this repo.)

Thanks!

huonw avatar Nov 15 '23 00:11 huonw

This issue is stale because it has been open for 200 days with no activity. Leave a comment to avoid closing this issue in 5 days.

github-actions[bot] avatar Jun 02 '24 08:06 github-actions[bot]

Pants still sees users affected by this, e.g.: https://chat.pantsbuild.org/t/18821099/for-the-experimental-gha-remote-caching-new-in-2-20-https-ww#97b93a14-6ea3-4234-9ecb-35a68e1a70f2

huonw avatar Jun 02 '24 10:06 huonw

Looking forward to seeing improvements here, as I'm also planning to adopt the fine-grained GHA cache for pantsbuild: we are continuously hitting the cache size limit (10 GB per repo), and LFS caching degrades too often and too quickly, resulting in additional costs for data packs.

achimnol avatar Jul 14 '24 04:07 achimnol

We are also interested in this for Bazel implementations.

thesayyn avatar Aug 13 '24 23:08 thesayyn

I was hoping to use the GHA cache as a backend for the new Go 1.24 GOCACHEPROG feature. Such an implementation will face the same issues and limitations. It looks like a fine-grained caching API would benefit all of us.

kop avatar Feb 12 '25 20:02 kop

Why is there no interest from GitHub in actually making the platform better? There are definitely many build systems out there that could use this feature.

thesayyn avatar Mar 03 '25 21:03 thesayyn

Interested in this for https://github.com/bazel-contrib/setup-bazel/issues/18

p0deje avatar Mar 16 '25 18:03 p0deje

Can we get some attention here?

thesayyn avatar Sep 16 '25 18:09 thesayyn