[rush] feat(cobuilds): allow orchestration without using the build cache
Summary
From my thread in Zulip, here.
Currently, cobuilds re-run uncacheable operations on every machine. That is the expected behavior for uncacheable core nodes, since it ensures that projects further up the tree can find the artifacts they depend on. For operations such as uploading or processing coverage, or building images, that approach is a poor fit: the work is repeated once per machine, so N machines do N times the work. This PR adds a new experiment that lets users use the cobuild orchestration engine without relying on the build cache. This carries risk for projects that enable it without understanding what it does, because build cache restoration is explicitly skipped, but I think the benefits outweigh the drawbacks here.
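To make the cost concrete, here is a minimal sketch of how often a single operation would execute per build under each mode. The `OperationModel` type and the `executionsPerBuild` function are hypothetical names used only for illustration; they are not Rush APIs and do not reflect how Rush schedules work internally.

```typescript
// Hypothetical model for illustration only; not a Rush type.
interface OperationModel {
  name: string;
  cacheable: boolean;
}

// Returns how many times the operation executes across all agents in one build.
function executionsPerBuild(
  op: OperationModel,
  agentCount: number,
  orchestrationOnly: boolean
): number {
  // Cacheable operations run on one agent; the others restore from the build cache.
  if (op.cacheable) {
    return 1;
  }
  // Proposed behavior: one agent runs the operation and cache restoration is skipped.
  if (orchestrationOnly) {
    return 1;
  }
  // Current behavior: every agent re-executes the uncacheable operation.
  return agentCount;
}

const uploadCoverage: OperationModel = { name: 'upload-coverage', cacheable: false };
const currentRuns: number = executionsPerBuild(uploadCoverage, 4, false);
const proposedRuns: number = executionsPerBuild(uploadCoverage, 4, true);
console.log(`current: ${currentRuns}, proposed: ${proposedRuns}`);
```

With four agents, the uncacheable `upload-coverage` operation in this model goes from four executions to one.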
Details
I tried to add this in the same way that cobuild leaf-only projects are enabled. It ends up being a fairly small change with a decent impact. Shards are an interesting case that should probably be explicitly restricted, since their output needs to be shared across all agents if you have a collate step, but they were an easy way to get many events to test with.
Before
After
How it was tested
Tested against the sharded-repo using
rm -rf common/temp/build-cache && RUSH_COBUILD_CONTEXT_ID=foo REDIS_PASS=redis123 RUSH_COBUILD_RUNNER_ID=runner1 RUSH_COBUILD_ORCHESTRATION_ONLY_ALLOWED=1 node ../../lib/runRush.js cobuild -p 10 --timeline
to validate that operations were being shared across the 2 machines, and, for the negative cases, the same command without the RUSH_COBUILD_ORCHESTRATION_ONLY_ALLOWED env var set:
rm -rf common/temp/build-cache && RUSH_COBUILD_CONTEXT_ID=foo REDIS_PASS=redis123 RUSH_COBUILD_RUNNER_ID=runner1 node ../../lib/runRush.js cobuild -p 10 --timeline
Impacted documentation
This adds a new experiment, which will likely need to be documented, and updates the rush-project JSON schema.
The terminology in here needs a lot of clarification. You are trying to identify operations that have side effects that need to occur exactly once in an orchestrated build, but that have no outputs that impact downstream operations, correct?
@dmichon-msft 😅 I was confusing myself while writing the original post. Yes, that sounds right. Rephrasing a bit for my own understanding:
I'm looking to use cobuilds (orchestrated builds) on operations that
- do things (upload coverage, write to S3), which I think is what you call side effects
- do not write to the build cache, but may consume artifacts from previous stages, e.g. coverage or build artifacts
- need to run exactly once per build; we cannot tolerate the operation running once per agent
- need to run on specific commits, e.g. the merge queue HEAD, because the side effect must be executed on exactly those commits
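The "exactly once per build" requirement can be sketched as a claim on a shared lock. This is a simplified illustration, not the cobuild implementation: `InMemoryLockStore` and `runOnce` are hypothetical names, and the in-memory map stands in for the shared store (Redis in my test setup) that all agents would talk to.

```typescript
// Hypothetical in-memory stand-in for a lock store shared by all agents.
class InMemoryLockStore {
  private readonly owners: Map<string, string> = new Map<string, string>();

  // Returns true only for the first runner that claims the key.
  public tryAcquire(key: string, runnerId: string): boolean {
    if (this.owners.has(key)) {
      return false;
    }
    this.owners.set(key, runnerId);
    return true;
  }
}

// Runs the side effect only if this runner wins the claim for the operation.
// Returns true if the side effect was executed by this runner.
function runOnce(
  store: InMemoryLockStore,
  contextId: string,
  operationName: string,
  runnerId: string,
  sideEffect: () => void
): boolean {
  // Scope the claim to one build (context) and one operation.
  const key: string = `${contextId}:${operationName}`;
  if (!store.tryAcquire(key, runnerId)) {
    return false; // Another agent already owns this operation; skip it.
  }
  sideEffect();
  return true;
}

// Two agents in the same build attempt the same side-effecting operation.
const store: InMemoryLockStore = new InMemoryLockStore();
const executed: string[] = [];
for (const runnerId of ['runner1', 'runner2']) {
  runOnce(store, 'foo', 'upload-coverage', runnerId, () => {
    executed.push(runnerId);
  });
}
console.log(executed);
```

Only the first agent to claim the key runs the upload; nothing is written to or restored from the build cache at any point in this sketch.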
@iclanton Gentle bump here 🙏