gradle-build-action

Provide a mechanism to deal with a corrupted cache entry

Open arouel opened this issue 3 years ago • 13 comments

In a previous job run, a corrupt Gradle distribution got cached. Subsequent job executions fail since they use the corrupt cache entry.

Could not unzip /home/runner/.gradle/wrapper/dists/gradle-7.2-all/260hg96vuh6ex27h9vo47iv4d/gradle-7.2-all.zip to /home/runner/.gradle/wrapper/dists/gradle-7.2-all/260hg96vuh6ex27h9vo47iv4d.
Reason: invalid stored block lengths
Exception in thread "main" java.util.zip.ZipException: invalid stored block lengths
	at java.base/java.util.zip.InflaterInputStream.read(InflaterInputStream.java:165)
	at java.base/java.io.FilterInputStream.read(FilterInputStream.java:107)
	at org.gradle.wrapper.Install.copyInputStream(Install.java:266)
	at org.gradle.wrapper.Install.unzip(Install.java:252)
	at org.gradle.wrapper.Install.access$900(Install.java:27)
	at org.gradle.wrapper.Install$1.call(Install.java:81)
	at org.gradle.wrapper.Install$1.call(Install.java:48)
	at org.gradle.wrapper.ExclusiveFileAccessManager.access(ExclusiveFileAccessManager.java:69)
	at org.gradle.wrapper.Install.createDist(Install.java:48)
	at org.gradle.wrapper.WrapperExecutor.execute(WrapperExecutor.java:107)
	at org.gradle.wrapper.GradleWrapperMain.main(GradleWrapperMain.java:63)
Error: Gradle build failed: process exited with status 1

As a developer, I need a way to work around a corrupt cache entry: either an option to ignore the cache so that it can be rebuilt, or a way to ignore the entry when an execution fails.

arouel avatar Nov 02 '21 10:11 arouel

Hmmm. This is a good point: the cache key will never change for this entry, so you'll never recover from this situation.

Options you currently have are:

  • Disable the cache entirely (cache-disabled: true) for a week, until the entry is purged naturally.
  • Add a custom cache-key-prefix (undocumented) so that all of the cache keys are changed. I use this mechanism for testing the action: you can set the prefix to any fixed value.

Ideally in this scenario the action would detect the invalid entry, ignore it, and automatically update with a correct value on save. Alternatively we could have a cache-write-only flag that would not use entries from the cache but would overwrite them. BUT: I'm pretty sure cache keys are write once, meaning there's no way to overwrite the bad entry!

I'll need to think about how to solve this properly. Having an ignored-cache-keys parameter is possible, but pretty ugly...

bigdaz avatar Nov 02 '21 15:11 bigdaz
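
For reference, the first workaround listed above could look roughly like the following in a workflow file. This is a minimal sketch, not taken from the thread: only the cache-disabled: true input comes from the comment above. The workflow name, trigger, action versions, and the arguments value are assumptions chosen for illustration.

```yaml
# Hypothetical workflow; only `cache-disabled: true` comes from the thread.
name: build
on: push

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: gradle/gradle-build-action@v2
        with:
          # Temporarily bypass the cache until the corrupt entry has expired.
          cache-disabled: true
          # Assumed Gradle task; replace with whatever the project normally runs.
          arguments: build
```

Once the bad entry has been purged, removing the cache-disabled line should let the action populate a fresh entry on the next run.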

I've raised #116 as the quick solution to avoid failing the build in this case.

bigdaz avatar Nov 02 '21 15:11 bigdaz

So I've pushed a fix for #116, but looking at your failure more closely I can see this won't help in your case. This is because the cache entry was downloaded and unpacked successfully, but a wrapper zip file contained inside the cache entry could not be unpacked. This type of corrupt-content problem could manifest in myriad ways, and I can't think of a way to automatically detect it.

I've renamed this issue and flagged it as an enhancement (although I'm sure it feels like a bug when you hit this problem!).

bigdaz avatar Nov 05 '21 13:11 bigdaz
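
For the specific failure reported in this issue (a wrapper zip that cannot be unpacked), a project could add its own check between the action's setup step and the Gradle invocation. The sketch below is an assumption-laden illustration, not something proposed in the thread: it assumes the action is used in its setup form so that the Gradle User Home is already restored when later steps run, that the runner provides bash and unzip, and that the wrapper distributions live under the path shown in the stack trace above. It does not address the general case of corrupt cache content, which can manifest in many other ways.

```yaml
# Hypothetical steps; step names and the Gradle task are illustrative only.
steps:
  - uses: actions/checkout@v3
  # Assumed to restore the Gradle User Home from the cache before later steps.
  - uses: gradle/gradle-build-action@v2
  - name: Drop wrapper distributions whose zip fails an integrity test
    shell: bash
    run: |
      shopt -s nullglob
      for zip in "$HOME"/.gradle/wrapper/dists/*/*/*.zip; do
        # `unzip -t` tests the archive without extracting it.
        if ! unzip -tq "$zip" > /dev/null 2>&1; then
          echo "Removing corrupt wrapper distribution: $(dirname "$zip")"
          rm -rf "$(dirname "$zip")"
        fi
      done
  - run: ./gradlew build
```

Because the cache key for the bad entry does not change, a check like this would probably have to re-download the distribution on every run until the entry expires; it only keeps the build from failing in the meantime.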

@bigdaz so we're still missing a retry logic or something similar to deal with corrupt cache entries as mentioned above?

arouel avatar Nov 15 '21 19:11 arouel

> @bigdaz so we're still missing a retry logic or something similar to deal with corrupt cache entries as mentioned above?

Not exactly. If the cache entry is fine, but the content contains something bad in Gradle User Home, the build could/will fail in a bunch of different ways. I can't think of a way to detect this situation automatically, so any fix would require user intervention. It's just not clear to me the best lever to expose to users to do so.

bigdaz avatar Nov 16 '21 02:11 bigdaz

Is there a way to clear the cache? The action works, but the cache has something that provokes a failing build. If I don't use the cache with cache-disabled = true, the build works, but it looks like the previous cache is not completely deleted, so if I run again with cache, it fails.

I am not sure whether running with cache-disabled=true should delete the previous cache. If not, an easy way to clean up all caches without having to play with keys would be great.

Something like cache-deleted=true would delete all cache directories at the start of the run.

  • Run without cache: https://github.com/JavierSegoviaCordoba/compose-resources-kmp/actions/runs/1704228418
  • Run with cache after previous run with --rerun-tasks: https://github.com/JavierSegoviaCordoba/compose-resources-kmp/actions/runs/1704251415
  • Run with cache after previous run with clean: https://github.com/JavierSegoviaCordoba/compose-resources-kmp/actions/runs/1704259574

The last two failed.

JavierSegoviaCordoba avatar Jan 16 '22 12:01 JavierSegoviaCordoba

Running without the cache won't (shouldn't) delete the cache. You can ask GitHub (through the web UI) to delete/clear the cache.

tbroyer avatar Jan 16 '22 16:01 tbroyer

@tbroyer yeah I am subscribed to a few issues (https://github.com/actions/cache/issues/632 and https://github.com/actions/cache/issues/2) but until they are implemented I need a way to delete the cache because, literally, I have a repo where I can only build things by disabling caches.

Another solution could be a global default key: when it is changed, the whole cache is effectively discarded. The key could be set from a secret so that it can be changed without pushing to the repository.

I agree that, hopefully, these workarounds should be unnecessary when GitHub implements it in the UI or via CLI.

JavierSegoviaCordoba avatar Jan 16 '22 16:01 JavierSegoviaCordoba

Currently, the only option you have from gradle-build-action is to specify a new "Cache Key Prefix" for the workflow, which will have the same effect as purging the entire cache.

You do this via an environment variable:

env:
  GRADLE_BUILD_ACTION_CACHE_KEY_PREFIX: custom-prefix

This is very brute-force, but there's no current mechanism to purge/ignore a single cache entry. Until GitHub adds more cache-management features I don't have a good plan to make this work.

bigdaz avatar Jan 16 '22 23:01 bigdaz
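
The environment variable above can be combined with the earlier suggestion of keeping the key in a secret, so that the prefix can be rotated without a commit. The following is a sketch under assumptions: CACHE_KEY_PREFIX is a hypothetical secret name that does not appear in the thread, and how the action treats an empty prefix (for example, when the secret is not defined) is not confirmed here.

```yaml
env:
  # CACHE_KEY_PREFIX is a hypothetical repository secret. Changing its value
  # changes every cache key, which has the same effect as purging the cache.
  GRADLE_BUILD_ACTION_CACHE_KEY_PREFIX: ${{ secrets.CACHE_KEY_PREFIX }}
```

Like the fixed-value prefix, this remains brute-force: all entries are abandoned, not only the corrupt one.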

@tbroyer Can you point me to how to delete/clear the cache via the web UI? I didn't know this was possible.

bigdaz avatar Jan 16 '22 23:01 bigdaz

That option is enough for me, I will try, thank you 😀

About deleting the cache via UI, I think it is not possible yet based on the links I shared.

JavierSegoviaCordoba avatar Jan 17 '22 01:01 JavierSegoviaCordoba

I deleted the cache in this run: https://github.com/JavierSegoviaCordoba/compose-resources-kmp/runs/4846053796. That run passes, but for some reason newer runs still fail: https://github.com/JavierSegoviaCordoba/compose-resources-kmp/actions/runs/1709941710. Locally it works... I will try on another computer.

Curiously, there are a lot of jobs and each of them is failing with a different error. Really weird.

EDIT: tried on another computer and I can't reproduce it :/

JavierSegoviaCordoba avatar Jan 17 '22 21:01 JavierSegoviaCordoba

> @tbroyer Can you point me to how to delete/clear the cache via the web UI? I didn't know this was possible.

Sorry, I misremembered (probably mixed it up with artifacts, and/or with other CI tools: you could clear the caches in Travis for instance, last time I used it at least)

tbroyer avatar Jan 18 '22 12:01 tbroyer