Sporadic cypress binary cache failure of example-basic-pnpm workflow on Ubuntu
Issue
The workflow .github/workflows/example-basic-pnpm.yml fails sporadically. See the following action logs:
When the workflow fails, the log section "Cypress tests" shows:
/usr/local/bin/npx cypress cache list
No cached binary versions were found.
/usr/local/bin/npx cypress verify
The cypress npm package is installed, but the Cypress binary is missing.
and the failures are on one of the jobs:
basic-pnpm-ubuntu-20basic-pnpm-ubuntu-22
| Failure date | ubuntu-20 | ubuntu-22 |
|---|---|---|
| May 6, 2024 (1649) | - | job 24636015853 |
| Apr 24, 2024 (1645) | job 24204489584 | - |
| Apr 19, 2024 (1641) | job 24018300090 | - |
It is not possible to analyze earlier failures, since the corresponding logs have been automatically purged by GitHub settings.
- See PR #1180 for possible resolution
The cache race condition which led to this issue is resolved in https://github.com/cypress-io/github-action/pull/1180. If however the cache of the Cypress binary cache and the cache of the pnpm store cache are out of sync, the workflow does not recover automatically. It is necessary in this situation to manually delete the cache of the pnpm store cache and re-run the workflow.
I'll leave this issue open for the moment and monitor if there are additional failures.
This issue is still happening.
See https://github.com/cypress-io/github-action/actions/runs/9186993814/job/25263739457
Using the failed workflow https://github.com/cypress-io/github-action/actions/runs/9186993814/job/25263739457 as an example ...
The problem occurs when the pnpm store is cached in Linux-pnpm-store-* and the pnpm store contains a copy of the correct version of Cypress, whereas at the same time the cypress-linux-x64-* is missing the corresponding cache Cypress binary.
There is no simple solution to coordinating these two caches. Manual caching of the pnpm store cache in the example workflows uses the GitHub JavaScript action actions/cache. The Cypress GitHub action uses the npm module @actions/cache to cache the Cypress binary cache.
When there are parallel jobs running there is no guarantee that a cache can be saved if the jobs are using the same cache key. One job may block the other and lead to inconsistency.
Package managers, including pnpm, only run the postinstall script when Cypress is not installed. Cypress has no automatic fallback to attempt to (re-)install the Cypress binary if it is missing. If pnpm thinks that Cypress is installed, because it was found in the pnpm store cache, then it doesn't attempt to install the Cypress binary. When it comes time to verify Cypress, the verification fails because the Cypress binary cache wasn't populated.
The short-term mitigation is to remove the Linux store caching from the documentation and example workflows. The longer-term solution would be to implement https://github.com/cypress-io/github-action/issues/1044 to cache pnpm dependencies directly under the control of the Cypress GitHub action.
- PR #1188 submitted to resolve this issue by removing pnpm store caching