
Provide preflight command to precalculate fingerprints and determine cache hit/miss

Open sppatel opened this issue 3 years ago • 5 comments

Describe the feature you'd like to request

We've got a repo with 200+ packages, and each package has a large number of tests. On CI (GHA) we run a matrix configuration that splits test execution across N shards. Packages are distributed across the shards evenly in terms of expected execution time, using a jest timings report that we publish.

This guarantees a ceiling for each chunk's execution time, but it does not guarantee equal execution time across all chunks, because during incremental builds one chunk's execution can outweigh another's due to cache hits. (I.e., you only get equal execution time when a large number of packages need to be executed; when you have cache hits, the distribution remains skewed because ALL packages are considered during distribution.)

What we would like is to take our logic and distribute the packages to each shard based on expected cache misses.

This in turn would get us both a guaranteed ceiling and equal execution time across all jobs for each shard.
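
To make the idea concrete, here is a minimal sketch of that distribution, assuming each package already has an expected duration (for example from the jest timings report) and a predicted cache result. The `PackageEstimate` shape and the function name are hypothetical, and the greedy "longest first, lightest shard next" strategy is just one reasonable choice, not necessarily what we run today.

```typescript
// Hypothetical input shape: one entry per package, combining a timing
// estimate with a predicted cache result.
interface PackageEstimate {
  name: string;
  expectedSeconds: number;
  cacheHit: boolean;
}

interface Shard {
  packages: string[];
  totalSeconds: number;
}

// Greedy balancing: drop predicted cache hits, sort the remaining packages
// by expected duration (longest first), then always assign the next package
// to the shard with the smallest running total.
function distributeByCacheMisses(
  estimates: PackageEstimate[],
  shardCount: number
): Shard[] {
  if (!Number.isInteger(shardCount) || shardCount < 1) {
    throw new RangeError("shardCount must be a positive integer");
  }

  const shards: Shard[] = [];
  for (let i = 0; i < shardCount; i++) {
    shards.push({ packages: [], totalSeconds: 0 });
  }

  const misses = estimates
    .filter((e) => !e.cacheHit)
    .sort((a, b) => b.expectedSeconds - a.expectedSeconds);

  for (const pkg of misses) {
    // Find the currently lightest shard; ties go to the lowest index.
    let lightest = shards[0];
    for (const shard of shards) {
      if (shard.totalSeconds < lightest.totalSeconds) {
        lightest = shard;
      }
    }
    lightest.packages.push(pkg.name);
    lightest.totalSeconds += pkg.expectedSeconds;
  }

  return shards;
}
```

Because packages predicted to be cache hits never enter the distribution, they no longer skew the per-shard totals.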

Describe the solution you'd like

Extend dry run, or provide a different command altogether, that runs a given task but only reports the expected cache hit or miss without actually executing the task. The JSON output could then be used to accomplish true balancing when sharding.
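
As an illustration of how such output might be consumed, the sketch below extracts the packages that would actually need to run. The report shape (`tasks`, `package`, `cache.status` with `"HIT"`/`"MISS"`) is an assumption made up for this example; it describes what the requested feature could emit, not an existing format.

```typescript
// Hypothetical report shape for the requested feature.
// These field names are assumptions for illustration only.
interface DryRunTask {
  taskId: string;
  package: string;
  cache: { status: "HIT" | "MISS" };
}

interface DryRunReport {
  tasks: DryRunTask[];
}

// Return the unique package names whose task is predicted to miss the
// cache, preserving the order in which they first appear in the report.
function packagesNeedingExecution(reportJson: string): string[] {
  const report = JSON.parse(reportJson) as DryRunReport;
  const misses = report.tasks
    .filter((task) => task.cache.status === "MISS")
    .map((task) => task.package);
  // De-duplicate without relying on Set iteration support.
  return misses.filter((name, index) => misses.indexOf(name) === index);
}
```

The resulting list could then be fed into the shard distribution step, so that only predicted cache misses are weighed when balancing.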

Describe alternatives you've considered

No alternative.

sppatel avatar Jun 24 '22 17:06 sppatel

Extending --dry-run=json seems like a good idea for this. Curious about your approach of sharding it though. Sounds interesting, any plans to share snippets of it? I think this could also be used in Gitlab CI

weyert avatar Jun 27 '22 18:06 weyert

Agreed re: --dry-run. I think that would be a good way to tackle this.

gsoltis avatar Jun 27 '22 19:06 gsoltis

Extending --dry-run=json seems like a good idea for this. Curious about your approach of sharding it though. Sounds interesting, any plans to share snippets of it? I think this could also be used in Gitlab CI

Let me get back to you on this.

sppatel avatar Jun 29 '22 18:06 sppatel

fyi - I have received approval to author a blog and examples. Stay tuned!

sppatel avatar Jul 21 '22 11:07 sppatel

@weyert / @gsoltis here you go.. https://medium.com/@sppatel/maximizing-job-parallelization-in-ci-workflows-with-jest-and-turborepo-da86b9be0ee6

Hopefully this gives the request a little bit more meaning and substance.

sppatel avatar Aug 05 '22 16:08 sppatel