setup-ruby icon indicating copy to clipboard operation
setup-ruby copied to clipboard

Provide ability to restore from cache but NOT save cache

Open btrautmann opened this issue 3 months ago • 5 comments

Hello 👋

I am hoping to discuss whether you'd consider adding inputs that would allow a caller to indicate that they'd like to restore bundler cache but NOT save it at the end of the workflow.

Rationale: In some scenarios, it does not make sense to save the cache. One such scenario is when PRs are running in the merge queue. The branch associated with the PRs being checked looks like gh-readonly-queue/main/pr-15190-f5aefc512c3719594eb22d601d2824ad519d07a0. This is scoped to a PR and a commit SHA. Due to the way cache restoration via actions/cache works, this cache will only be used by subsequent runs on the same PR and the same sha only in the queue (which is extremely unlikely, if not impossible--I'm not sure if that SHA is the result of several PRs being combined or what). Anyway, I'd love to avoid adding a bunch of these useless cached objects to the cache and increasing the likelihood of cache thrashing, if at all possible.

Request: I am wondering if it would be as simple as adding a parameter save-cache: true|false to the action:

- name: Setup Ruby and Bundler
  if: ${{ inputs.install_ruby == 'true'}}
  uses: ruby/setup-ruby@44511735964dcb71245e7e55f72539531f7bc0eb # v1.257.0
  with:
    ruby-version: ${{ steps.grab-versions.outputs.ruby_version }}
    bundler-cache: true
    working-directory: ${{ inputs.bundler_working_directory }}
    # Don't cache in the merge queue, it won't be used!
    save-cache: ${{ github.event_name != "merge_group" }}

For ultimate clarity, parameters restore-bundler-cache and save-bundler-cache would be better.

This is akin to the approach you can take when using actions/cache/restore and actions/cache/save directly. We do this in other places within our workflows.

Here's an example screenshot of a cache item in our cache that is using 100MB of space that'll never be used. Depending on the throughput of your repo, this may be significant if you have many PRs:

Image

btrautmann avatar Sep 17 '25 20:09 btrautmann

After doing some research, it may be worthwhile for me to look into whether the queue is even getting cache hits. If they're not (which it sound like maybe they're not), it may make sense to just disable caching altogether in the queue.

btrautmann avatar Sep 17 '25 20:09 btrautmann

OK, I just looked at one of our jobs and DO see cache hits of the Flutter SDK, so I think this issue is still relevant.

Cache restored successfully
Cache restored from key: flutter-linux-stable-3.29.2-x64-c23637390482d4cf9598c3ce3f2be31aa7332daf
Run actions/cache@v4
  with:
    path: /home/runner/.pub-cache
    key: flutter-pub-linux-stable-3.29.2-x64-c23637390482d4cf9598c3ce3f2be31aa7332daf-bfc7f4aacbcf80ddd0adb01ff1c07d0b79b2781df10ca0f5d21e39ce6f77e63c
    enableCrossOsArchive: false
    fail-on-cache-miss: false
    lookup-only: false
    save-always: false
Cache hit for: flutter-pub-linux-stable-3.29.2-x64-c23637390482d4cf9598c3ce3f2be31aa7332daf-bfc7f4aacbcf80ddd0adb01ff1c07d0b79b2781df10ca0f5d21e39ce6f77e63c
Received 33554432 of 136985095 (24.5%), 32.0 MBs/sec
Received 136985095 of 136985095 (100.0%), 69.0 MBs/sec
Cache Size: ~131 MB (136985095 B)
/usr/bin/tar -xf /home/runner/work/_temp/5154d90d-ef9c-48e7-b5d1-663f0eb789a2/cache.tzst -P -C /home/runner/work/mobile/mobile --use-compress-program unzstd
Cache restored successfully
Cache restored from key: flutter-pub-linux-stable-3.29.2-x64-c23637390482d4cf9598c3ce3f2be31aa7332daf-bfc7f4aacbcf80ddd0adb01ff1c07d0b79b2781df10ca0f5d21e39ce6f77e63c

btrautmann avatar Sep 17 '25 20:09 btrautmann

GitHub Action cache by design is per branch, and cache enabled for any PR branch is also by design, so that if a user push multiple times to the same PR brach it will be able to use the cache. Space used for cache usually isn't a concern because GitHub will automatically purge older cache.

If you really don't like this behavior, you can just do bundler-cache: false and manually run bundle install, then you have full control yourself with actions/cache.

ntkme avatar Sep 18 '25 22:09 ntkme

GitHub Action cache by design is per branch, and cache enabled for any PR branch is also by design, so that if a user push multiple times to the same PR brach it will be able to use the cache. Space used for cache usually isn't a concern because GitHub will automatically purge older cache.

If you really don't like this behavior, you can just do bundler-cache: false and manually run bundle install, then you have full control yourself with actions/cache.

I don't think this really touches on the point I'm making. I understand how the caching works, and I understand why it's beneficial. The problem is that merge-queue branches are single-use, and therefore it does not make sense to persist to the cache from them.

Space used for cache usually isn't a concern because GitHub will automatically purge older cache.

This is exactly the problem--if your cache is being bloated by merge-queue runs (ours is), cached items will be purged sooner than necessary. There are times where our cache bloats to 20GB (I've seen it as high as 60!) due to the throughput on our repo. In other cases where we have control over the caching strategy (i.e. where we're caching ourselves via actions/cache/save, we do not save cache in the queue, and this has been helpful in alleviating cache bloat).

btrautmann avatar Sep 19 '25 14:09 btrautmann

In that case the best way to do it would be detect if a run is from merge queue, which can be don by check github.event_name, and then disable writing cache. PR would be welcome.

Also for now maybe you can do something like this as a workaround:

    - uses: ruby/setup-ruby@v1
      with:
        ruby-version: '3.4'
        bundler-cache: ${{ github.event_name != 'merge_group' }}
    - if: github.event_name == 'merge_group'
      run: bundle install

ntkme avatar Sep 19 '25 16:09 ntkme

I wonder if it might make sense to file a similar issue to https://github.com/actions/toolkit/, i.e. if it never makes sense to save cache for merge_group, then maybe cache.saveCache() should noop in that case.

eregon avatar Dec 20 '25 11:12 eregon