Skip download option

Open DaSchTour opened this issue 3 years ago • 4 comments

I have some jobs that act on whether or not there is a cache hit, or that add stuff to the cache, but they do not actually care about the content of the cache. I think it would be convenient to have an option to not download the content of the cache.

DaSchTour avatar Aug 19 '22 11:08 DaSchTour

I would like this as well.

bogdandrutu avatar Sep 01 '22 16:09 bogdandrutu

I am running into an issue that might be related to this. It's unclear to me whether I can just explicitly set the cache.

The docs say key is required :/ In my case, I want to set the cache and not try to download it. Perhaps it would be nice if key became optional and was used to flag explicit set behavior?

setupJob:
    steps:
    - uses: actions/checkout@v2

    - name: Restore Workspace if its there
      id: try-restore
      uses: actions/cache@v3
      with:
        path: ${{ github.workspace }}
        key: restore-${{ hashFiles('package-lock.json', 'Makefile') }}
      
    # ... (other logic) ...
      
    - name: Cache Workspace
      if: steps.try-restore.outputs.cache-hit != 'true'
      uses: actions/cache@v3
      with:
        path: ${{ github.workspace }}
        restore-keys: |
          restore-${{ hashFiles('package-lock.json', 'Makefile', 'src/', 'infra/') }}
          other-job-can-get-my-workspace-with-const-key

Other job:

otherjob:
    steps:
    - name: Get working directory 
      uses: actions/cache@v3
      with:
        path: ${{ github.workspace }}
        key: other-job-can-get-my-workspace-with-const-key

Wambosa avatar Sep 01 '22 19:09 Wambosa

@DaSchTour @bogdandrutu I'm not sure I understand the rationale for just checking whether a cache exists without using it. Could you please share some use cases? Please note that we now have the GitHub REST API and a gh CLI extension to get cache metadata by key. You can use these for such cases instead.

bishal-pdMSFT avatar Sep 11 '22 15:09 bishal-pdMSFT

@bishal-pdMSFT well, it's not really just existence; it's also about "initializing" the cache if needed. Simplified, the workflow looks like this:

  1. actions/cache
  2. if the cache is not hit, do something that adds stuff to the cache

If there is a cache hit, I skip the second step; if not, I add something to the cache. So if the cache is hit, I don't care what's actually in it. But if the cache is not hit, I need the cache step to be able to put something in there.
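As a rough illustration, the two steps described here might look like the following workflow fragment. It is only a sketch: the job name, step id, cached path, key, and install command are placeholders and are not taken from the actual workflow.

```yaml
jobs:
  warm-cache:                      # hypothetical job name
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      # Step 1: on a hit the cache content is downloaded (even though this
      # job never reads it); on a miss a save is scheduled for the end of
      # the job.
      - name: Cache dependencies
        id: deps-cache             # hypothetical step id
        uses: actions/cache@v3
        with:
          path: node_modules       # placeholder path
          key: deps-${{ hashFiles('package-lock.json') }}

      # Step 2: runs only on a miss; whatever it writes to the cached path
      # is what gets saved under the key above.
      - name: Populate cache
        if: steps.deps-cache.outputs.cache-hit != 'true'
        run: npm ci                # placeholder command
```

On a hit, the only useful result of the first step is the `cache-hit` output, which is why the download is wasted work in this pattern.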

DaSchTour avatar Sep 12 '22 08:09 DaSchTour

> Could you please share some use cases.

To expand on what has already been said: in some of the workflows I've seen, the dependency install and the tests happen in separate jobs, i.e.

  • Job 1: Check if cache exists, if not download / build dependencies and create new cache entry
  • Job 2 - X: Different test runners. Restore environment from cache. If restore failed, fail test. Run tests, linters, etc.

If a usable cache already exists for Job 1, there is no point in spending the extra time restoring it, as it won't get used anyway.
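A minimal sketch of that job layout, assuming an npm project: the job names, step ids, cached path, key, and commands are all illustrative placeholders.

```yaml
jobs:
  prepare:                         # "Job 1": create the cache entry if missing
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - id: deps-cache
        uses: actions/cache@v3     # restores on a hit, although nothing here uses it
        with:
          path: node_modules
          key: deps-${{ hashFiles('package-lock.json') }}
      - name: Build dependencies on a cache miss
        if: steps.deps-cache.outputs.cache-hit != 'true'
        run: npm ci

  test:                            # "Job 2 - X": further runners repeat this pattern
    needs: prepare
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - id: deps-cache
        uses: actions/cache@v3
        with:
          path: node_modules
          key: deps-${{ hashFiles('package-lock.json') }}
      - name: Fail if the environment could not be restored
        if: steps.deps-cache.outputs.cache-hit != 'true'
        run: exit 1
      - run: npm test
```

Only the `test` job actually needs the restored files; in `prepare`, a hit means the download is pure overhead.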

> Please note that now we have GitHub REST API and gh CLI extension to get a cache metadata by key. You can use these for such cases instead.

True, although that's probably more complicated. It isn't enough to check whether a specific key exists: because of the way restore works, the entry must also come from the same ref or the base ref. The action already does this check. I imagine it's much easier to add a simple switch to skip the restore.
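For comparison, a lookup through the REST API could be sketched as a workflow step like the one below. The step id, key, and output name are placeholders. It assumes the job's token is allowed to read Actions caches, and it uses the `gh api` command against the "list caches for a repository" endpoint. The sketch only looks at the current ref, so entries from the base or default branch, which a restore could also use, would need additional queries.

```yaml
      - name: Look up cache entries without downloading them
        id: lookup                 # hypothetical step id
        env:
          GH_TOKEN: ${{ github.token }}
          CACHE_KEY: deps-${{ hashFiles('package-lock.json') }}   # placeholder key
        run: |
          # The endpoint treats "key" as an exact key or a prefix, so longer
          # keys starting with CACHE_KEY are counted as well.
          count=$(gh api --method GET "repos/${GITHUB_REPOSITORY}/actions/caches" \
            -f key="$CACHE_KEY" -f ref="$GITHUB_REF" \
            --jq '.actions_caches | length')
          # Expose the number of matching entries to later steps.
          echo "found=$count" >> "$GITHUB_OUTPUT"
```

A later step could then be guarded with `if: steps.lookup.outputs.found == '0'`, but this still does not reproduce everything the action checks on restore.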

-- Note: There is also #831 proposing something similar.

cdce8p avatar Oct 23 '22 17:10 cdce8p

We have the same use case. Since we have a large repo, we want to run tests in multiple jobs, so we cache all Go dependencies initially and then run the tests in separate jobs.

bogdandrutu avatar Oct 23 '22 17:10 bogdandrutu

Hey all, 👋🏽

We have created a discussion with a proposal that we feel will solve this problem. Do let us know your feedback, thanks 😊

kotewar avatar Dec 07 '22 12:12 kotewar

@kotewar unfortunately that discussion completely ignores this use case :)

bogdandrutu avatar Dec 08 '22 20:12 bogdandrutu

> We have the same use case, since we have a large repo, we want to run tests in multiple jobs, hence we cache all go dependencies initially then run tests in separate jobs.

@bogdandrutu, yes, we do this exact same thing. For reference for everyone else here, it looks something like this: (image)

The very first job is responsible for loading all the deps into the cache.

This currently works, but it is somewhat wasteful; it would be better if we could control an explicit set on cache-key invalidation.

Wambosa avatar Dec 08 '22 21:12 Wambosa

I've implemented a possible solution in #1041. From my initial testing, a dry-run: true option would work well. If someone else wants to test it:

      - uses: cdce8p/cache@restore-dry-run
        with:
          ...
          dry-run: true

cdce8p avatar Dec 23 '22 18:12 cdce8p