High usage of GitHub-hosted runners
Our current matrix of configuration options has increased our GitHub-hosted runner usage from about 24 minutes (https://github.com/SCOREC/core/actions/runs/13558913948/usage) three months ago to 2 hours and 10 minutes (https://github.com/SCOREC/core/actions/runs/15589036256/usage).
A couple of thoughts:
- do we need all these combinations?
- are we caching dependencies?
- would switching to ninja (from make) help? (thx @bobpaw)
- should we move the majority of these tests to a self-hosted runner with a manual trigger (i.e., /runtests)? If we run on the fast filesystem and use more CPU cores for the builds and tests, we may come out ahead (vs. the elapsed GitHub-hosted time of ~10 mins). If we take this approach, the automatic tests could be reduced to a small subset.
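A rough sketch of what a comment-triggered run on a self-hosted runner could look like. Everything here is an assumption for illustration: the workflow name, the bare `self-hosted` label, the list of trusted author associations, and the build commands are all hypothetical and would need to match the real setup.

```yaml
# Sketch only; names, labels, and commands below are hypothetical.
name: full-tests

on:
  issue_comment:
    types: [created]

jobs:
  full-tests:
    # Run only for pull request comments starting with /runtests,
    # and only when the commenter is trusted (assumed policy).
    if: >-
      github.event.issue.pull_request &&
      startsWith(github.event.comment.body, '/runtests') &&
      contains(fromJSON('["OWNER", "MEMBER", "COLLABORATOR"]'), github.event.comment.author_association)
    runs-on: self-hosted
    steps:
      - uses: actions/checkout@v4
        with:
          # issue_comment workflows run against the default branch,
          # so the PR head has to be checked out explicitly.
          ref: refs/pull/${{ github.event.issue.number }}/head
      - name: Configure, build, and test
        # ctest --test-dir needs CMake 3.20 or newer.
        run: |
          cmake -S . -B build
          cmake --build build --parallel
          ctest --test-dir build --output-on-failure
```

The author-association check is there because a self-hosted runner executes whatever code the PR contains, so it is probably worth restricting who can trigger it.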
Using ccache will also help
cool! https://ccache.dev/ Assuming each matrix entry/combination is running in its own vm instance, I wonder if a 'central' ccache could be used.
It's been a few years since I used the GitHub actions ccache features, but I think the usual approach is to load the cache as an artifact before building and write it afterward. I'm sure there's a good guide out there.
We can also make better use of the on.*.paths/on.*.paths-ignore keys to avoid running CI for documentation-only updates. The only thing there is to make sure that .github/workflows is in there so that changes to the action itself trigger runs.
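For illustration, a path filter along those lines might look like the following. The `pull_request` trigger and the specific globs are assumptions, not taken from the current cmake.yml; the real list would have to cover every directory that affects the build.

```yaml
# Sketch only; the event and the globs below are hypothetical.
on:
  pull_request:
    paths:
      - '.github/workflows/**'   # changes to the workflow itself still trigger CI
      - 'CMakeLists.txt'
      - '**/CMakeLists.txt'
      - 'cmake/**'
      - '**.cc'
      - '**.h'
```

With `paths-ignore` instead, the workflow directory triggers runs by default as long as it is not listed among the ignored patterns. The two keys cannot be combined for the same event.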
There is a guide on ccache here: https://cristianadam.eu/20200113/speeding-up-c-plus-plus-github-actions-using-ccache/
There might be a storage tradeoff/consideration, especially since our builds tend to be large.
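A minimal sketch of per-configuration ccache storage using `actions/cache`, with a size cap to keep storage in check. This is not the content of #500; the cache directory, the 500M limit, the key layout, and the apt-based install are assumptions. `matrix.compiler.name` and `matrix.build_type` are taken from the existing matrix.

```yaml
# Sketch only; fragment of a job's steps. Paths, sizes, and keys are hypothetical.
steps:
  - uses: actions/checkout@v4
  - name: Install ccache
    run: sudo apt-get update && sudo apt-get install -y ccache
  - name: Point ccache at a directory inside the workspace
    run: |
      echo "CCACHE_DIR=$GITHUB_WORKSPACE/.ccache" >> "$GITHUB_ENV"
      echo "CCACHE_MAXSIZE=500M" >> "$GITHUB_ENV"   # assumed cap per cache entry
  - name: Restore and save ccache
    uses: actions/cache@v4
    with:
      path: .ccache
      # A unique key per commit forces a fresh save after each run;
      # restore-keys falls back to the newest cache for the same config.
      key: ccache-${{ matrix.compiler.name }}-${{ matrix.build_type }}-${{ github.sha }}
      restore-keys: |
        ccache-${{ matrix.compiler.name }}-${{ matrix.build_type }}-
  - name: Configure with ccache as the compiler launcher
    run: >
      cmake -S . -B build
      -DCMAKE_BUILD_TYPE=${{ matrix.build_type }}
      -DCMAKE_C_COMPILER_LAUNCHER=ccache
      -DCMAKE_CXX_COMPILER_LAUNCHER=ccache
```

Keying by compiler and build type keeps each cache small and relevant, but it multiplies the number of stored caches by the number of keyed combinations, which is where the storage tradeoff shows up.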
Working on #500 to add ccache, but we should also look at pruning the run matrix. Specifically:
- Do we need to run clang/gcc for every combination? Or only for debugging to make use of different warnings?
- Do we need to test C++20 with all configs?
Thank you; much appreciated.
No, we don't need clang/gcc and C++20 for all combinations.
Is this about what we want? Not sure when we do want them.
exclude:
  - compiler: { name: LLVM }
    build_type: Release
  - cxx_standard: 20
    build_type: Release
develop currently has:
matrix:
  compiler:
    - { name: GNU, CC: gcc-10, CXX: g++-10 }
    - { name: LLVM, CC: clang, CXX: clang++ }
  build_type: [Debug, Release]
  no_mpi: [OFF, ON]
  cxx_standard: [11, 20]
  metis: [OFF, ON]
From these docs: https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/running-variations-of-jobs-in-a-workflow#excluding-matrix-configurations my understanding is that your exclude statement would remove any LLVM+Release and any C++20+Release combinations. IIUC, that would bring us from 32 configs to 20: each rule matches 8 configs, 4 of which (LLVM+Release+C++20) match both, so 12 are removed. That seems like a good start and with your ccache work I suspect we'll be in good shape. 👍
The current .github/workflows/cmake.yml takes 31 minutes to run: https://github.com/SCOREC/core/actions/runs/16089922393/usage. That is low enough that we don't need to be concerned about it at this point.