
"Update Wasm Tests" failing frequently

Open gsnedders opened this issue 6 months ago • 2 comments

Across the last 90 days, the "Update Wasm Tests" workflow has been one of the top failures: https://github.com/web-platform-tests/wpt/actions/metrics/performance?dateRangeType=DATE_RANGE_TYPE_CUSTOM&filters=workflow_file_name%3Aupdate-wasm-tests.yml&range=1740960000000-1748736000000

This shows a 21% failure rate.

https://github.com/web-platform-tests/wpt/actions/workflows/update-wasm-tests.yml?query=is%3Afailure shows that failures aren't frequent in absolute terms, because the workflow only runs weekly, but they are still a high percentage of runs, and the latest success only happened because I manually re-ran it.

As far as I can tell, this failure happens whenever the workflow gets a cache miss from GitHub Actions. That isn't super surprising: caches are evicted after a week, so if a weekly run is scheduled even slightly more than a week after the previous one, the cache has already been evicted by the time it starts.

We seem to typically hit 502 or 504 errors when downloading packages:

OpamSolution.Fetch_fail("https://gitlab.inria.fr/fpottier/menhir/-/archive/20240715/archive.tar.gz (curl: code 502 while downloading https://gitlab.inria.fr/fpottier/menhir/-/archive/20240715/archive.tar.gz)")
OpamSolution.Fetch_fail("http://download.camlcity.org/download/findlib-1.9.5.tar.gz (curl: code 504 while downloading http://download.camlcity.org/download/findlib-1.9.5.tar.gz)")
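One possible mitigation for gateway errors like these, if they turn out to be transient, is to retry the failing step with backoff. A minimal sketch follows; `run_with_retries` and its parameters are hypothetical names rather than anything in the existing workflow, and the command to wrap is left to the caller because the exact opam invocation isn't shown here.

```python
import subprocess
import time
from typing import Callable, Sequence


def run_with_retries(
    cmd: Sequence[str],
    attempts: int = 4,
    base_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run cmd, retrying with exponential backoff; return the last exit code.

    Hypothetical helper: retries on any non-zero exit code, since the
    wrapped tool's output is not inspected for specific HTTP errors.
    """
    returncode = 1
    for attempt in range(attempts):
        returncode = subprocess.run(cmd).returncode
        if returncode == 0:
            return 0
        # Wait base_delay, 2*base_delay, 4*base_delay, ... between attempts,
        # but don't sleep after the final attempt.
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return returncode
```

This would only help if the 502/504 responses clear up within the retry window, which is an assumption; a longer outage on the upstream host would still fail the job.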

I wonder if there's some GitHub Actions proxy that isn't playing nice?

gsnedders, Jun 02 '25 23:06

The job failures are very infrequent (months apart), but I did notice that the gateway issue is not as transient as I assumed last week. I don't know where to look for that.

past, Jun 03 '25 17:06

I do wonder to what extent just making it run twice weekly (and thus typically have a valid cache) would make the problem largely go away.
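To make the timing argument concrete, here is a rough sketch comparing the longest gap between scheduled runs against a 7-day eviction window. The function names, the choice of weekdays, and the 30-minute start delay are all assumptions for illustration, and the model is simplified: it treats the start of the previous run as the last time the cache was accessed.

```python
from datetime import timedelta
from typing import Iterable

# Assumed eviction window: cache entries unused for over 7 days are removed.
EVICTION_WINDOW = timedelta(days=7)


def worst_case_gap(run_weekdays: Iterable[int], start_delay: timedelta) -> timedelta:
    """Longest gap between consecutive scheduled runs, plus a start delay.

    run_weekdays uses 0..6 for the days of the week a run is scheduled on.
    """
    days = sorted(set(run_weekdays))
    # Gaps between runs within the same week...
    gaps = [later - earlier for earlier, later in zip(days, days[1:])]
    # ...plus the wrap-around gap from the last run to next week's first run.
    gaps.append(days[0] + 7 - days[-1])
    return timedelta(days=max(gaps)) + start_delay


def cache_survives(gap: timedelta, window: timedelta = EVICTION_WINDOW) -> bool:
    """True if the cache would still be present after the given gap."""
    return gap <= window


delay = timedelta(minutes=30)  # hypothetical scheduling delay
weekly = worst_case_gap([0], delay)           # one run per week
twice_weekly = worst_case_gap([0, 3], delay)  # e.g. two runs per week

print(weekly, cache_survives(weekly))
print(twice_weekly, cache_survives(twice_weekly))
```

Under these assumptions, a weekly schedule sits right at the edge of the window, so any delay tips it into a miss, while a twice-weekly schedule leaves roughly three days of slack.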

gsnedders, Jun 03 '25 19:06