actions/cache/save does not fail the step on error
actions/cache/save recently (2025-03-26) failed with a timeout, followed by an error on the retry. It looks like the key had already been reserved by the initial attempt:
Attempt 1 of 5 failed with error: Request timeout: /twirp/github.actions.results.api.v1.CacheService/CreateCacheEntry. Retrying request in 3000 ms...
Failed to save: Unable to reserve cache with key Linux-build-workspace-930, another job may be creating this cache.
Warning: Cache save failed.
The action is configured as follows:
- name: Cache workspace
  uses: actions/cache/save@v4
  env:
    cache-name: persist-workspace
  with:
    path: |
      ...
    key: ${{ runner.os }}-build-workspace-${{ github.run_number }}
Current behavior: the step's conclusion is success:
{
  "name": "Persist build workspace",
  "status": "completed",
  "conclusion": "success",
  "number": 19,
  "started_at": "2025-03-26T02:17:23Z",
  "completed_at": "2025-03-26T02:17:40Z"
}
Expected behavior: the action (and job) fails if the cache cannot be created. If this is not generally wanted, having a parameter to fail the step/job on cache creation would be highly welcomed.
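Until such a parameter exists, one possible workaround is to verify the entry right after saving. The following is only a sketch under two assumptions that are not confirmed in this thread: that a failed save leaves no restorable entry under the key, and that a freshly saved entry is visible to an immediate lookup. It runs actions/cache/restore with `lookup-only` and `fail-on-cache-miss`, so the step (and the job) fails when no entry matches the key. The name of the second step is made up for illustration, and `path` should be identical to the one used in the save step.

```yaml
- name: Cache workspace
  uses: actions/cache/save@v4
  with:
    path: |
      ...
    key: ${{ runner.os }}-build-workspace-${{ github.run_number }}

# Hypothetical follow-up step (not part of the original workflow):
# fail the job if the entry was not actually created.
- name: Verify workspace cache exists
  uses: actions/cache/restore@v4
  with:
    path: |
      ...
    key: ${{ runner.os }}-build-workspace-${{ github.run_number }}
    lookup-only: true         # check for the entry without downloading it
    fail-on-cache-miss: true  # fail the step when no entry matches the key
```

This does not address the locked key itself. It only turns the silent warning into a failed step.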
Please note: this is a sporadic error which we cannot reproduce. We had not noticed this behavior before the "Deprecation Notice - Upgrade to latest before February 1st 2025" announcement.
We have a similar problem in https://github.com/kyamagu/skia-python/pull/313: cache creation failed over a span of 3 days around 1 April. It had succeeded only a few days earlier, with no code change in between (just updates to the READMEs and the internal version number for a release).
We still run into this issue from time to time. It would be great if someone could take care of this. 🙏
We just had a big outage in production due to this.
And to make matters worse, it soft-locked the cache key, preventing any further writes to it. This is another screenshot taken 12h after that first one:
It basically made a soft-lock on the cache key, making it impossible to write to it.
cc'ing @nebuk89 @vsvipul @joshmgross @kotewar
We also noticed that the key gets locked. We have a cleanup action for some caches which, in this scenario, fails to clean up the cache because no cache is found. The key nevertheless remains locked. I would expect that if storing the cache fails, the key is cleaned up as well.