
[rush] feat(cobuilds): allow orchestration without using the build cache

Open aramissennyeydd opened this issue 1 year ago • 2 comments

Summary

From my thread in Zulip, here.

Currently, cobuilds re-run un-cacheable operations on every machine. That is the expected behavior for un-cacheable core nodes, because it ensures that projects further up the tree can find the artifacts they need. For operations such as uploading or processing coverage, or building images, that approach does not fit well and results in (number of machines)x as much work being done. This PR adds a new experiment/feature that lets users use the cobuild orchestration engine without relying on the build cache. This introduces risk for projects that enable it without understanding what it does, since they will be explicitly skipping build cache restoration, but I think the benefits outweigh the drawbacks.

Details

I tried to add this in the same way that cobuild leaf-only projects are enabled. It ends up being a fairly small change with decent impact. Shards are an interesting case that should probably be explicitly restricted, since the output needs to be shared across all agents if you have a collate step, but they were an easy way to get many events to test with.

Before

Screenshot 2024-08-08 at 7 05 57 PM

After

Screenshot 2024-08-08 at 7 03 46 PM

How it was tested

Tested against the sharded-repo using

rm -rf common/temp/build-cache && RUSH_COBUILD_CONTEXT_ID=foo REDIS_PASS=redis123 RUSH_COBUILD_RUNNER_ID=runner1 RUSH_COBUILD_ORCHESTRATION_ONLY_ALLOWED=1 node ../../lib/runRush.js cobuild -p 10 --timeline

to validate that operations were being shared across the 2 machines, and then with the same command without the RUSH_COBUILD_ORCHESTRATION_ONLY_ALLOWED env var set,

rm -rf common/temp/build-cache && RUSH_COBUILD_CONTEXT_ID=foo REDIS_PASS=redis123 RUSH_COBUILD_RUNNER_ID=runner1 node ../../lib/runRush.js cobuild -p 10 --timeline

for the negative cases.

Impacted documentation

It adds a new experiment that will likely need to be documented, and it updates the rush-project JSON schema.

aramissennyeydd avatar Aug 08 '24 23:08 aramissennyeydd

The terminology in here needs a lot of clarification. You are trying to identify operations that have side effects that need to occur exactly once in an orchestrated build, but that have no outputs that impact downstream operations, correct?

dmichon-msft avatar Aug 13 '24 17:08 dmichon-msft

@dmichon-msft 😅 I was confusing myself while writing the original post. Yes, that sounds right. Rephrasing a bit for my own understanding:

I'm looking to use cobuilds (orchestrated builds) on operations that

  1. do things (upload coverage, write to S3), which I think is what you call side effects
  2. do not write to the build cache, but may consume artifacts from previous stages, i.e. coverage or build artifacts
  3. need to run exactly once per build; we cannot tolerate each operation running once per agent
  4. need to run on specific commits, i.e. merge queue HEAD, since the side effect needs to be executed on those specific commits
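
To make requirement 3 concrete, here is a minimal sketch of what "exactly once per build" could look like across agents. It is not Rush's actual cobuild implementation: `InMemoryLockStore` and `runOnce` are hypothetical names, and the in-memory map only stands in for whatever shared lock store the agents really use (Redis, in the test commands above).

```typescript
// Hypothetical sketch: NOT Rush's actual cobuild implementation.
// An in-memory map stands in for the shared lock store used by all agents.
type OperationStatus = 'running' | 'done';

class InMemoryLockStore {
  private readonly _state: Map<string, OperationStatus> = new Map();

  // Returns true only for the first caller to claim the key.
  public tryAcquire(key: string): boolean {
    if (this._state.has(key)) {
      return false;
    }
    this._state.set(key, 'running');
    return true;
  }

  public markDone(key: string): void {
    this._state.set(key, 'done');
  }
}

// Runs sideEffect on the first agent to claim the operation; other agents skip it.
// Nothing is read from or written to a build cache.
function runOnce(
  store: InMemoryLockStore,
  contextId: string,
  operationName: string,
  sideEffect: () => void
): boolean {
  const key: string = `${contextId}:${operationName}`;
  if (!store.tryAcquire(key)) {
    return false; // another agent owns (or has finished) this operation
  }
  sideEffect();
  store.markDone(key);
  return true;
}

// Two simulated agents share one store and one context id.
const store: InMemoryLockStore = new InMemoryLockStore();
let uploads: number = 0;
const ranOnAgent1: boolean = runOnce(store, 'foo', 'upload-coverage', () => {
  uploads++;
});
const ranOnAgent2: boolean = runOnce(store, 'foo', 'upload-coverage', () => {
  uploads++;
});
```

In this sketch the second agent sees the key already claimed and skips the side effect, so the upload happens once no matter how many agents join the build. A real implementation would also need to handle the owning agent failing partway through, which this sketch ignores.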

aramissennyeydd avatar Aug 13 '24 19:08 aramissennyeydd

@iclanton Gentle bump here 🙏

aramissennyeydd avatar Sep 10 '24 15:09 aramissennyeydd