Cache saving is best-effort, restoring cache is not
I stumbled upon this bug while using self-hosted runners.
Background
Cache version is derived with the compression tool in mind. So an identical file can be compressed with zstd or gzip, and that would yield two different cache entries (despite the file being identical!). This is expected and well-documented, but:
- When you save the cache, on linux-x86_64, the action optimistically chooses zstd as the compression method. If it can't find zstd on the host, it downgrades to gzip. If it can't find gzip either, it fails to save the cache.
- When you restore the cache, on linux-x86_64, the action optimistically expects zstd as the compression method. If it can't find zstd on the host, it downgrades to gzip. If it can't find gzip either, it fails to restore the cache.
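To make the two behaviors above concrete, here is a minimal sketch in Python. It is a simplified model, not the action's actual code: the function names `pick_compression` and `cache_version`, the `|` separator, and the example path are all assumptions for illustration. The only point it demonstrates is that the compression method is one of the inputs to the version, so the same paths produce different versions.

```python
import hashlib


def pick_compression(available_tools):
    """Prefer zstd, fall back to gzip, otherwise give up (hypothetical helper)."""
    for tool in ("zstd", "gzip"):
        if tool in available_tools:
            return tool
    raise RuntimeError("neither zstd nor gzip found on the host")


def cache_version(paths, compression):
    """Hypothetical version hash: the compression method is a hashed component."""
    components = list(paths) + [compression]
    return hashlib.sha256("|".join(components).encode("utf-8")).hexdigest()


paths = ["~/.cache/pip"]  # example path, an assumption
zstd_version = cache_version(paths, pick_compression({"zstd", "gzip"}))
gzip_version = cache_version(paths, pick_compression({"gzip"}))

# Same paths, different compression tool, different cache version.
print(zstd_version != gzip_version)  # True
```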
The issue
Let's say we have two runners: Saver and Restorer. Saver only has gzip, so it caches the file with gzip. Restorer has both zstd and gzip, BUT it fails to restore the cache, because, seeing that zstd is available, it expects the cache to also be zstd-compressed. Seeing that there is no zstd-compressed cache (because Saver doesn't have zstd), it doesn't even try to check if there's a gzip-compressed cache and says there is no cache at all.
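The Saver/Restorer scenario can be reproduced with a toy in-memory store. This is only a sketch under assumptions: the `(key, method)` tuple stands in for the real key plus version hash, and `save`, `restore`, `store`, and the key `deps-v1` are hypothetical names, not part of the action.

```python
def pick_compression(available_tools):
    """Prefer zstd, fall back to gzip, otherwise give up (hypothetical helper)."""
    for tool in ("zstd", "gzip"):
        if tool in available_tools:
            return tool
    raise RuntimeError("neither zstd nor gzip found on the host")


store = {}  # toy cache backend; (key, method) stands in for key + version


def save(key, tools, payload):
    method = pick_compression(tools)
    store[(key, method)] = payload
    return method


def restore(key, tools):
    # Only the preferred method is looked up; there is no second attempt.
    method = pick_compression(tools)
    return store.get((key, method))


saved_with = save("deps-v1", {"gzip"}, b"archive bytes")  # Saver: gzip only
restored = restore("deps-v1", {"zstd", "gzip"})           # Restorer: both tools

print(saved_with, restored)  # gzip None
```

The entry exists and Restorer is able to decompress it, yet the lookup misses because it is only ever made under the zstd variant.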
Seeing that not all Linux distributions provide zstd out of the box (especially older ones), this little caveat can take a lot of man-hours to debug: it really is not trivial to trace.
Solutions
- Allow specifying the compression method when calling the Action, default to zstd or gzip ONLY
- When restoring the cache, check if there are caches compressed with other methods too; don't just fail if there's no zstd cache
- Decouple the "compressed with" metadata from the cache version: that way, the action will be able to see that there is a matching cache, and it will see the metadata that the cache was compressed with gzip, so it will only fail if gzip is not installed; this will also provide a very nice and clear error message for debugging
This issue is stale because it has been open for 200 days with no activity. Leave a comment to avoid closing this issue in 5 days.
This is still the case.
Hey @KFearsoff, just wanted to say thanks for doing the leg work for this and writing up such an informative bug report. I remember reading this last year, back when I was skimming through tickets prior to planning our team's migration to GitHub Actions, and recall thinking:
Wow, that's bizarre behavior! Also sounds like something that would have driven me nuts to debug.
Well, just yesterday I found I couldn't restore caches between jobs using container and non-container runners. The fact that I could see the paths and keys matching superficially, from the Actions cache web UI on GitHub, made it all seem super inconsistent until I recalled your ticket: while my minimal container runner included tar and gzip, it probably didn't have zstd installed like the default runs-on: ubuntu-latest does. And voilà, one little apt install later via a Dockerfile and all was well:
RUN apt-get install -y zstd
so it will only fail if gzip is not installed; this will also provide a very nice and clear error message for debugging
Yes! Better error handling, transparency, and user feedback on why caches fail to restore would be so much appreciated.
Related:
- https://github.com/actions/cache/issues/1300#issuecomment-2067807941
Thanks for this write-up, I just hit the same issue and the error was useless. Following @ruffsl's workaround resolved it for me.