
Chunks

Open jamiebuilds opened this issue 7 years ago • 7 comments

There are generally a couple of different ways to run tools in a Bolt repo:

  • Globally (across all workspaces at once)
  • By Workspace
    • Parallel
    • Chunks by dependency tree
    • Serial

When you have a tool that could either run globally or by workspace, you have to make a tradeoff decision:

  • Does the tool have a long startup time? Running by workspace is probably slower, since it has to start up over and over.
  • Does the tool have a long processing time? Running globally is probably slower, and the tool would benefit from parallelization.

Right now we don't offer anything in between these two extremes. The different "by workspace" modes don't help at all because they all spawn a new process for every workspace.

(./) $ tool "./packages/*"
(./packages/a) $ tool "."
(./packages/b) $ tool "."
(./packages/c) $ tool "."
(./packages/d) $ tool "."

But what if we could break workspaces into "chunks" and spawn a smaller number of processes?

(./) $ tool "./{packages/a,packages/b}"
(./) $ tool "./{packages/c,packages/d}"

This could allow us to increase parallelism of a tool while also reducing the number of processes that it starts up.
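
To make the tradeoff concrete, here is a rough back-of-the-envelope model in TypeScript. Everything in it is an assumption for illustration, not something Bolt implements: the `RunPlan` fields, the `estimateWallTimeMs` name, and the premise that each process costs a fixed startup time plus a fixed time per workspace. It treats every chunk as full, so read the result as an upper bound rather than a measurement.

```typescript
// Hypothetical cost model; all names and fields are assumptions for illustration.
interface RunPlan {
  workspaces: number; // total workspaces in the repo
  chunkSize: number; // workspaces handled by each spawned process
  concurrency: number; // processes allowed to run at the same time
  startupMs: number; // fixed startup cost paid by every process
  perWorkspaceMs: number; // processing cost for one workspace
}

function estimateWallTimeMs(plan: RunPlan): number {
  // One process per chunk.
  const processes = Math.ceil(plan.workspaces / plan.chunkSize);
  // Processes run in "waves" of at most `concurrency` at a time.
  const waves = Math.ceil(processes / plan.concurrency);
  // Each process pays startup once, then works through its chunk serially.
  const perProcessMs = plan.startupMs + plan.chunkSize * plan.perWorkspaceMs;
  return waves * perProcessMs;
}
```

With 4 workspaces, 2 processes at a time, 2000 ms of startup, and 1000 ms per workspace, this model gives 6000 ms for a single global process (chunk size 4), 6000 ms for one process per workspace (chunk size 1), and 4000 ms for two chunks of 2.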

CLI

Initially, I would focus only on the case where the --parallel flag is on.

bolt workspaces exec --parallel --chunks -- tool "()"

If you wanted to limit the number of items to place in a chunk at a time:

bolt workspaces exec --parallel --chunks --max-chunk-size 10 -- tool "()"
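
A minimal sketch of what a maximum chunk size could mean in practice, assuming workspaces are simply split in order into groups of at most that many entries and each group is handed to the tool as one brace glob like the ones shown earlier. The function names `chunkWorkspaces` and `chunkToGlob` are hypothetical; this is not the `--max-chunk-size` implementation.

```typescript
// Split workspace directories into consecutive chunks of at most maxChunkSize entries.
function chunkWorkspaces(dirs: string[], maxChunkSize: number): string[][] {
  if (!(maxChunkSize >= 1) || Math.floor(maxChunkSize) !== maxChunkSize) {
    throw new Error("maxChunkSize must be a positive integer");
  }
  const chunks: string[][] = [];
  for (let i = 0; i < dirs.length; i += maxChunkSize) {
    chunks.push(dirs.slice(i, i + maxChunkSize));
  }
  return chunks;
}

// Turn one chunk into a single glob argument for the tool.
function chunkToGlob(chunk: string[]): string {
  if (chunk.length === 0) {
    throw new Error("cannot build a glob from an empty chunk");
  }
  // Bash, for one, leaves a single-item brace group such as {a} unexpanded,
  // so a chunk of one is emitted as a plain path.
  if (chunk.length === 1) {
    return "./" + chunk[0];
  }
  return "./{" + chunk.join(",") + "}";
}
```

Calling `chunkWorkspaces(["packages/a", "packages/b", "packages/c", "packages/d"], 2).map(chunkToGlob)` yields `./{packages/a,packages/b}` and `./{packages/c,packages/d}`, matching the two commands in the example above.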

jamiebuilds, Oct 08 '18 19:10

I already have a tool for most of the logic here too: https://github.com/jamiebuilds/chunkd

jamiebuilds, Oct 08 '18 19:10

Note: We're going to need some syntax for passing the chunks into tools. We can't use env variables because the shell evaluates them before we see them.

jamiebuilds, Oct 08 '18 19:10

What do you mean by "before we can see them"?

Also, am I right then in assuming this would require custom logic in each tool (or config of tool) to support it? Or do you have something else in mind?

lukebatchelor, Oct 08 '18 20:10

bolt workspaces exec --parallel --chunks -- tool $CHUNK

$CHUNK is evaluated to "" outside of the bolt process, so we just see ""

jamiebuilds, Oct 08 '18 20:10

> Also, am I right then in assuming this would require custom logic in each tool (or config of tool) to support it? Or do you have something else in mind?

Any tool that supports passing globs would support this. ESLint, for example.

jamiebuilds, Oct 08 '18 20:10

> $CHUNK is evaluated to "" outside of the bolt process, so we just see ""

Ah yeah, I was thinking about tools with a config.js, which would mean bolt could set env vars when it spawns them.

lukebatchelor, Oct 08 '18 20:10

GNU parallel uses {} and other replacement strings: https://www.gnu.org/software/parallel/parallel_tutorial.html#Replacement-strings
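
A replacement string in that style could be sketched as below, assuming Bolt substitutes the placeholder itself just before spawning each process. The thread does not settle on a syntax, so the `{}` token and the `fillChunkPlaceholder` name are assumptions. Unlike `$CHUNK`, Bash leaves a bare `{}` alone, so the token would reach the bolt process intact.

```typescript
// Replace every "{}" in each argument with the glob for one chunk.
// Hypothetical helper; "{}" is borrowed from GNU parallel's default replacement string.
function fillChunkPlaceholder(args: string[], chunkGlob: string): string[] {
  return args.map((arg) => arg.split("{}").join(chunkGlob));
}
```

For instance, `fillChunkPlaceholder(["tool", "{}"], "./{packages/a,packages/b}")` returns `["tool", "./{packages/a,packages/b}"]`. The split-and-join is a single pass, so braces inside the substituted glob are not processed again.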

jamiebuilds, Oct 08 '18 22:10