Experiment with cache settings across CI workflows

Open ScottTodd opened this issue 1 year ago • 2 comments

I've switched several CI workflows from using remote ccache storage in a GCS bucket to using "local" ccache storage in GitHub Actions caches, mainly by using https://github.com/hendrikmuhs/ccache-action. A few adjustments to the caching strategy could save minutes per workflow run.

Note that Bazel uses a global mutable cache (i.e. shared across many commits): https://bazel.build/remote/caching. By contrast, the default ccache-action setup on GitHub Actions creates immutable cache entries, so cache hits within a build are unlikely if the restored cache entry came from a substantially different commit (such as whenever the LLVM commit pin changes).

Experiments to try:

  • [ ] Evaluate https://github.com/mozilla/sccache. ccache-action has a "variant: sccache" mode, but that might not be enough
  • [ ] Change when the cache is saved vs restored. I set new jobs to only save on push events, but pkgci_build_packages always saves the cache, even on pull_request events. Saving the full compiler build cache takes 2m30s to upload 800MB, which can be a bottleneck before other jobs are queued up
  • [ ] Use different cache keys - maybe mix the LLVM commit hash into the key (instead of the commit ref itself?), to limit the number of cache entries generated and increase the likelihood of a cache hit actually being useful
  • [ ] Check cache hit rate across platforms. Linux x64 clang generally has high cache hit rates, but Windows x64 MSVC and Linux arm64 clang may have more cache misses due to compiler options / unsupported features in ccache / cmake / the compiler.
  • [ ] Prepopulate caches on (persistent) runners - ccache, git submodules / a partial git checkout, docker images, python deps, etc.
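
A minimal sketch of two of the ideas above (keying the cache on the LLVM pin, and only saving on push events), written as shell helpers that a workflow step could call. The function names, the key layout, and the third_party/llvm-project submodule path are assumptions for illustration, not what the workflows currently do.

```shell
# Hypothetical helper: build a cache key from a job name and the pinned LLVM commit.
# The "ccache-<job>-llvm-<hash>" layout is an assumption, not an existing convention.
make_cache_key() {
  job_name="$1"
  llvm_commit="$2"
  # Keep only the first 12 characters of the hash so the key stays short.
  short_hash=$(printf '%s' "$llvm_commit" | cut -c1-12)
  printf 'ccache-%s-llvm-%s\n' "$job_name" "$short_hash"
}

# Hypothetical helper: print "true" only for push events, so pull_request runs
# can restore the cache but skip the slow upload.
should_save_cache() {
  case "$1" in
    push) echo true ;;
    *) echo false ;;
  esac
}

# Possible use inside a workflow step (submodule path assumed):
#   llvm_commit=$(git rev-parse HEAD:third_party/llvm-project)
#   echo "key=$(make_cache_key linux_x64_clang "$llvm_commit")" >> "$GITHUB_OUTPUT"
#   echo "save=$(should_save_cache "$GITHUB_EVENT_NAME")" >> "$GITHUB_OUTPUT"
```

With a key like this, runs that share an LLVM pin would map to the same cache entry, which should limit the number of entries. Because entries are immutable, an entry saved under such a key would not pick up later object files until the pin changes, so the key may need one more varying component.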

ScottTodd, Aug 09 '24 23:08

I'll add my own idea here: somehow orchestrate runners to run distcc/icecream.

makslevental, Aug 29 '24 21:08

Just jotting this down for posterity/future reference; these settings seem to be universally beneficial for increasing hits:

echo "CCACHE_SLOPPINESS=include_file_ctime,include_file_mtime,time_macros" >> "$GITHUB_ENV"

while this setting

echo "CCACHE_COMPILERCHECK=content" >> "$GITHUB_ENV"

only helps on Linux and Windows; on macOS it has the opposite effect (increasing misses), even though it's supposed to work there as well.
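
One way to apply that platform split in a single step is sketched below. The helper name ccache_env_lines is hypothetical; it assumes the step passes in RUNNER_OS, which GitHub Actions sets to Linux, Windows, or macOS.

```shell
# Hypothetical helper: print the ccache settings to export for a given runner OS.
ccache_env_lines() {
  runner_os="$1"
  # Sloppiness settings that seemed to help on every platform.
  echo "CCACHE_SLOPPINESS=include_file_ctime,include_file_mtime,time_macros"
  # Content-based compiler check only where it was observed to help.
  case "$runner_os" in
    Linux|Windows) echo "CCACHE_COMPILERCHECK=content" ;;
  esac
}

# Possible use inside a workflow step:
#   ccache_env_lines "$RUNNER_OS" >> "$GITHUB_ENV"
```

On macOS this leaves CCACHE_COMPILERCHECK unset, so ccache falls back to its default compiler check.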

In general, you can set CCACHE_COMPILERCHECK to string:<IDSTRING> (see here), and a good candidate for <IDSTRING> is probably the output of clang -v (or the equivalent for whatever compiler you use).
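
A rough sketch of building such a value follows. It assumes the first line of the compiler's version text is a good enough identifier, and uses clang --version rather than clang -v in the usage comment because -v writes to stderr. The helper name compiler_check_value is made up for this example.

```shell
# Hypothetical helper: turn compiler version text into a "string:<IDSTRING>" value.
compiler_check_value() {
  version_text="$1"
  # Keep only the first line so the value fits on one line of $GITHUB_ENV.
  id_string=$(printf '%s\n' "$version_text" | head -n 1)
  printf 'string:%s\n' "$id_string"
}

# Possible use inside a workflow step:
#   echo "CCACHE_COMPILERCHECK=$(compiler_check_value "$(clang --version)")" >> "$GITHUB_ENV"
```

Using only the first line is a simplification: two compiler builds that report the same version line would be treated as identical.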

makslevental, Aug 29 '24 23:08