setup-emsdk
Emscripten sanity check deletes cache during build
Hi and thanks for this action.
We are encountering a weird issue that is hard both to track down and to reproduce:
Our use case: We use the setup-emsdk action to install and cache emscripten in a "pre-cache" job, and then restore and activate it from the cache in a following job, to avoid multiple downloads and installations with matrix builds. Our setup looks as follows:
env:
  EMSCRIPTEN_VERSION: latest
  EMSCRIPTEN_CACHE_FOLDER: emsdk-cache
<snip>
jobs:
  setup_emscripten:
    name: Set up and cache emscripten
    runs-on: ubuntu-20.04
    steps:
      - name: Set up cache
        uses: actions/cache@v2
        id: cache
        with:
          path: ${{ env.EMSCRIPTEN_CACHE_FOLDER }}-${{ github.run_id }}
          key: ${{ runner.os }}-emsdk-${{ env.EMSCRIPTEN_VERSION }}-${{ github.run_id }}
      - name: Set up emsdk
        uses: mymindstorm/setup-emsdk@v7
        with:
          version: ${{ env.EMSCRIPTEN_VERSION }}
          actions-cache-folder: ${{ env.EMSCRIPTEN_CACHE_FOLDER }}-${{ github.run_id }}
          no-cache: true
      <snip>
  build_js:
    runs-on: ubuntu-20.04
    needs: [setup_emscripten]
    strategy:
      matrix:
        toolkit:
          <4 build options>
    steps:
      - name: Checkout main repo
        uses: actions/checkout@v2
      - name: Restore cache
        id: restore_cache
        uses: actions/cache@v2
        with:
          path: ${{ env.EMSCRIPTEN_CACHE_FOLDER }}-${{ github.run_id }}
          key: ${{ runner.os }}-emsdk-${{ env.EMSCRIPTEN_VERSION }}-${{ github.run_id }}
      - name: Set up emsdk (cache not found)
        uses: mymindstorm/setup-emsdk@v7
        if: steps.restore_cache.outputs.cache-hit != 'true'
        with:
          version: ${{ env.EMSCRIPTEN_VERSION }}
          no-cache: true
      - name: Set up emsdk (cache found)
        if: steps.restore_cache.outputs.cache-hit == 'true'
        uses: mymindstorm/setup-emsdk@v7
        with:
          version: ${{ env.EMSCRIPTEN_VERSION }}
          actions-cache-folder: ${{ env.EMSCRIPTEN_CACHE_FOLDER }}-${{ github.run_id }}
          no-cache: true
The issue: In some build runs, the emscripten config seems to change during the run, and emscripten's sanity check then clears the cache. This happens unpredictably, i.e. sometimes it works without problems, sometimes the issue appears.
You can find a failing run at https://github.com/musicEnfanthen/verovio/actions/runs/322146695 . The sanity check message appears in: https://github.com/musicEnfanthen/verovio/runs/1292826609?check_suite_focus=true#step:8:55
Is this a known issue? Is there something wrong in our setup?
Thanks in advance.
Hi, thanks for the detailed report! This is not a known issue with the action. I think this is being caused by using latest and caching together. Preferably, you should declare a specific version for builds that use the cache, and skip caching for builds that use latest. Combining the two can be problematic in some situations, e.g. if a cache is detected, the "latest" version will not be downloaded. I should probably have the action error if that type of config is specified.
Other things I noticed:
- This may be caused by appending run_id to the cache folder. The changing folder name may be confusing emscripten between caches.
- You don't need the if statement; the action handles a missing cache folder just fine.
- actions-cache-folder implies no-cache
This example config from the README should help:
env:
  EM_VERSION: 1.39.18
  EM_CACHE_FOLDER: 'emsdk-cache'
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Setup cache
        id: cache-system-libraries
        uses: actions/cache@v2
        with:
          path: ${{env.EM_CACHE_FOLDER}}
          key: ${{env.EM_VERSION}}-${{ runner.os }}
      - uses: mymindstorm/setup-emsdk@v7
        with:
          version: ${{env.EM_VERSION}}
          actions-cache-folder: ${{env.EM_CACHE_FOLDER}}
      - name: Build library
        run: make -j2
      - name: Run unit tests
        run: make check
Thanks a lot for looking into this and your support!
I set a specific version and removed the run_id. Although emscripten still seems to clear the cache, the job no longer fails: https://github.com/musicEnfanthen/verovio/runs/1295567024?check_suite_focus=true#step:7:53
The idea behind appending run_id to the cache folder was to have a fresh emscripten build for every new run of the complete workflow. In the scenario from your README, we will install the specified emscripten version with the first workflow run, and then use it once and for all from cache until we manually change the version number (or cache-folder), right?
For latest: Thanks for the hint; setting a specific version does indeed seem to solve the issue (though it is hard to say, since it occurs so unpredictably). Just one thought: Before GitHub Actions, we would have installed and activated emscripten from bash like so:
# Fetch the latest registry of available tools.
./emsdk update
# Download and install the latest SDK tools.
./emsdk install latest
# Set up the compiler configuration to point to the "latest" SDK.
./emsdk activate latest
which seems to be what emscripten recommends: https://emscripten.org/docs/tools_reference/emsdk.html#how-do-i-just-get-the-latest-sdk.
Do you think it would be feasible to support latest? Otherwise you would always have to update the emscripten version manually, wouldn't you? But I see your point that it could be hard to detect whether the version coming from the cache corresponds to latest or not.
Thanks again.
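One way to keep latest while still caching can be sketched as follows. This is an illustration under assumptions, not something the action does: the helper emsdk_cache_key is hypothetical, and it assumes that the commit hash of the emsdk repository's HEAD is an acceptable stand-in for "what latest currently points to". That may invalidate the cache more often than strictly needed, but it avoids reusing a cached SDK after latest has moved.

```shell
#!/bin/sh
# Hypothetical helper, not part of setup-emsdk or emsdk.
# Usage: emsdk_cache_key <os> <version> [fingerprint]
# For a pinned version the key depends only on OS and version.
# For "latest" a caller-supplied fingerprint is appended, so the key
# changes whenever the fingerprint changes.
emsdk_cache_key() {
  os="$1"
  version="$2"
  fingerprint="${3:-unknown}"
  if [ "$version" = "latest" ]; then
    printf '%s-emsdk-latest-%s\n' "$os" "$fingerprint"
  else
    printf '%s-emsdk-%s\n' "$os" "$version"
  fi
}

# Example (needs network access, so it is left commented out).
# Assumption: the emsdk HEAD commit is used as the fingerprint.
#   fingerprint=$(git ls-remote https://github.com/emscripten-core/emsdk.git HEAD | cut -f1)
#   emsdk_cache_key Linux latest "$fingerprint"
```

The resulting string could be written to a step output and used as the key of the actions/cache step, so that a pinned version keeps a stable key while latest gets a fresh cache when the fingerprint changes.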
> The idea behind appending run_id to the cache folder was to have a fresh emscripten build for every new run of the complete workflow. In the scenario from your README, we will install the specified emscripten version with the first workflow run, and then use it once and for all from cache until we manually change the version number (or cache-folder), right?
Sorry, I misread your config and didn't notice that the run ID was in the cache key. Your original config should have been fine.
> Do you think it would be feasible to support latest? Otherwise you would always have to update the emscripten version manually, wouldn't you? But I see your point that it could be hard to detect whether the version coming from the cache corresponds to latest or not.
This action is essentially a glorified version of that. Manually using a bash script and caching like you did should give a similar result.
- Is the emcc -v in setup_emscripten generating a sanity check file that gets saved to the cache?
- You could disable the sanity checks using EMCC_SKIP_SANITY_CHECK/EM_IGNORE_SANITY or use FROZEN_CACHE in the emscripten config (sets the cache to read-only, might be too much trouble)
- Run emcc verbosely; it prints the sanity check results to the debug output
- https://github.com/emscripten-core/emscripten/blob/6e2c28717380839dc5e7fdaebe122fca6d1120bb/tools/shared.py#L542-L551
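The two environment-based suggestions above can be sketched as small shell helpers for a run step. The function names skip_sanity_checks and emcc_verbose are hypothetical. The sketch assumes that setting EMCC_SKIP_SANITY_CHECK to 1 is enough to bypass the check and that EMCC_DEBUG=1 turns on the debug logging in which the sanity check result shows up; both should be verified against the emscripten version in use.

```shell
#!/bin/sh
# Hypothetical helpers for a workflow "run:" step; not part of setup-emsdk.

# Skip emcc's sanity check in this shell and, on GitHub Actions, in later steps.
skip_sanity_checks() {
  export EMCC_SKIP_SANITY_CHECK=1
  # On GitHub Actions, GITHUB_ENV names a file; NAME=value lines appended
  # to it become environment variables in subsequent steps.
  if [ -n "${GITHUB_ENV:-}" ]; then
    echo "EMCC_SKIP_SANITY_CHECK=1" >> "$GITHUB_ENV"
  fi
}

# Run "emcc -v" with debug logging enabled (assumption: this surfaces the
# sanity check messages). Returns 127 if emcc is not on PATH.
emcc_verbose() {
  if command -v emcc >/dev/null 2>&1; then
    EMCC_DEBUG=1 emcc -v
  else
    echo "emcc not found on PATH" >&2
    return 127
  fi
}
```

Running emcc_verbose once in the pre-cache job and once in a build job would show whether the two jobs see a different configuration. skip_sanity_checks is the blunter option and hides the symptom rather than explaining it.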
Hi, sorry for the late follow-up and thanks for your pointers.
We tried various things, but finally stuck with the latest version while removing the run ID from the cache key. Since then, the issue has not occurred anymore.
Thanks again for your great support!
Can be closed.
I'm glad you were able to figure out a solution! I'll keep this open for the time being just in case someone else has the same problem / a definite solution.