bevy
bevy copied to clipboard
Use Leafwing-Studios/cargo-cache in CI
Objective
- Our CI caches the
~/.cargoandtargetfolders, to take advantage of faster, incremental, compilation. - The use of caches is quite inconsistent, with different formats for keys. It also does not take advantage of cache fallback (using an old cache if the current one does not exist).
- Caches are being created in the merge queue, but can never be used.
- To verify this, go to any validation job and check the
actions/cache@v4step. It will always sayCache not found for input keys: ....
- To verify this, go to any validation job and check the
Solution
- Switch to using Leafwing-Studios/cargo-cache, which automatically handles cache keys, fallback, dependency updates, and toolchain updates.
- Caches are no longer used in the merge queue.
- Github Actions has something called cache isolation. Long story short, caches cannot be used if they were not created from the same branch or base branch.
- For instance, a cache created on
maincan be used by any feature branch, but a cache created byfeature-abranch cannot be used by the unrelatedfeature-bbranch.
Changelog
- All CI caching is now done with Leafwing-Studios/cargo-cache.
- The CI no longer creates inaccessible caches for validation jobs.
You might be interested in this: https://github.com/Leafwing-Studios/cargo-cache
It uses a the hash of the updated lock file instead of the current date to calculate the hash key, also with several levels of fallbacks.
Honestly, debugging caching actions and figuring out why cargo keeps recompiling dependencies is super annoying, I hope that you'll get it to work!
You might be interested in this: https://github.com/Leafwing-Studios/cargo-cache
Thank you! I'll definitely take inspiration from that. I chose not to hash the lockfile because I figured that the amount of dependency change in 1 day cannot be so great that it would hurt performance, but that assumption will definitely require testing. It's good to see an alternate approach, though!
So far things are working well. I'm having some issues with the cache not being found, so I may have to move my testing over to my fork, where I can actually delete caches and cancel runs.
Building on MacOS and docs are the two longest jobs. I'm going to see if I can shorten their duration, if possible.
Following @TimJentzsch's interesting link, I've decided to switch this PR to use Leafwing-Studios/cargo-cache for everything. I think it's a lot more maintainable than my custom solution (and it's easy to use, too!).
Performance seems roughly the same, except for run-examples-macos-metal which didn't use caching before. It's hard to tell, since Github has two different MacOS runners with 3 and 4 CPU cores.
Previously attempted in #7627, but the action has also improved since then
Previously attempted in #7627, but the action has also improved since then
Is there a reason the original PR was closed?
Previously attempted in #7627, but the action has also improved since then
Is there a reason the original PR was closed?
Mainly because it didn't get any more reviews and I was closing out stale PRs that I had still open. I also found it hard to measure the impact of the change, it's hard to debug if a recompile is needed or unnecessary IMO
But I'm quite confident that the current caching solution is flawed, causing the cache to be outdated quite quickly, and that this PR will be an improvement.
I'm not entirely sure how well the action will work with as many different workflows and branches as we have in Bevy, the maximum cache side could be an issue. But at least in smaller projects (e.g. Emergence) we have used this action successfully and reduced CI times significantly.
I feel like the risk is acceptable. Leafwing Studios is a trusted third-party, and the argument against using it is the same as any other Github Action we depend on. What if dtolnay/rust-toolchain, Super Linter, Typos, cargo-release, or any other action we use started downloading all tokens and force-pushing to main?
While we cannot guarantee that our dependencies will never run untrusted code, we can ensure that that untrusted code will not be able to do anything harmful by hardening our security and restricting permissions.
This is not an argument to remove those dependencies. I think they provide too much value to our build system to remove them entirely. If security is a worry we can fork the actions, use them instead in CI, and occasionally merge upstream changes.
the maximum cache side could be an issue.
This maximum?
We almost never go below 50GB, sometimes over 100GB, GitHub seems... ok with that? That means we never have a cache that is more than one day old.
I think I would prefer an approach where we create cache only from the main branch, those should be available to every job
I think I would prefer an approach where we create cache only from the main branch, those should be available to every job
What would you think about handling this in a separate issue/PR? I think it could be tackled independently of the way we generate the cache and has its own up- and downsides that we could discuss seperately
What would you think about handling this in a separate issue/PR? I think it could be tackled independently of the way we generate the cache and has its own up- and downsides that we could discuss seperately
I agree. Let's get this merged, see the benefits / downsides, then work on an alternative if necessary. I think the cache should be purged directly after this is merged too, just to start fresh.
If / when this gets merged, it would be good to wipe all of the existing caches here. This will make the migration a little quicker, and hopefully show the performance improvements sooner.
Swapping to Controversial; I want to get Francois's sign-off here.
Update to the newly released v2 before merging.
I think I would prefer an approach where we create cache only from the main branch, those should be available to every job
We can now do this via the save-if option, like this:
- uses: Leafwing-Studios/cargo-cache@v2
with:
save-if: ${{ github.ref == format('refs/heads/{0}', github.event.repository.default_branch) }}
Done!
@mockersf, could you please review this? You're the last step before this can be merged :)
@mockersf, could you please review this? You're the last step before this can be merged :)
I was waiting for my previous review to be addressed! Now unless there's an urgent need I'll do it after the Jam 🙂