lefthook
Using Parallel and Piped options at once
I ran into a situation where I wanted to run parallel and piped commands.
So essentially on a post-merge, I'm hoping to capture any dependency updates and any migrations.
post-merge:
  parallel: true
  piped: true
  commands:
    yarn:
      files: git diff --name-only HEAD master
      glob: '{package.json,yarn.lock}'
      run: yarn
      tags: frontend
    1_gem:
      files: git diff --name-only HEAD master
      glob: '{GEMFILE,GEMFILE.lock}'
      run: bundle exec bundle check || bundle install
      tags: backend
    2_migrate:
      files: git diff --name-only HEAD master
      glob: '{db/migrate/*}'
      run: bundle exec rails db:migrate && bundle exec rails db:test:prepare
      tags: backend
So, I want to run the FE and BE commands in parallel, because neither of them depends on the output of the other.
I can't run a migration unless my bundle check passes, so I want it piped after a bundle check || bundle install. I could run them all piped, but they don't need to be, and I figure lefthook could be smart enough to handle both: basically, any commands without the prepended numbering system would run in parallel, and commands with the numbering system would be piped.
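For comparison, the "run them all piped" fallback mentioned above might look like the sketch below. It reuses only options that already appear in the original config; the 1_, 2_, 3_ prefixes on every command are an assumption made here so that the whole chain has an explicit order.

```yaml
# Sketch of the all-piped fallback (not the desired behavior).
# Assumption: every command gets a numeric prefix to fix the order.
post-merge:
  piped: true   # run one after another, stop at the first failure
  commands:
    1_yarn:
      files: git diff --name-only HEAD master
      glob: '{package.json,yarn.lock}'
      run: yarn
    2_gem:
      files: git diff --name-only HEAD master
      glob: '{GEMFILE,GEMFILE.lock}'
      run: bundle exec bundle check || bundle install
    3_migrate:
      files: git diff --name-only HEAD master
      glob: '{db/migrate/*}'
      run: bundle exec rails db:migrate && bundle exec rails db:test:prepare
```

The drawback is the one described above: yarn and the backend steps are serialized even though they are independent, and a failing yarn step would also block the backend steps.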
Hello @dlinch, thank you for the issue!
> any commands without the prepended numbering system are run in parallel and commands with the numbering system are piped.
I think it's a good point. The main question is how we can separate different piped groups. I have something like this in mind:
post-merge:
  parallel: true
  commands:
    yarn:
      run: yarn
    1_gem:
      run: bundle exec bundle check || bundle install
      pipe_group: 1 # <-- how we separate piped groups
    2_migrate:
      run: bundle exec rails db:migrate && bundle exec rails db:test:prepare
      pipe_group: 1 # <-- how we separate piped groups
    1_another_gem:
      run: bundle exec bundle check || bundle install
      pipe_group: 2 # <-- how we separate piped groups
    2_another_migrate:
      run: bundle exec rails db:migrate && bundle exec rails db:test:prepare
      pipe_group: 2 # <-- how we separate piped groups
So we have 3 parallel groups of commands
- yarn
- 1_gem >> 2_migrate
- 1_another_gem >> 2_another_migrate
Thanks @Arkweid!
I would say that almost seems like two different features. One allows for parallel commands with a single piped group, and the other is to handle multiple pipe groups, correct?
There's nothing I can find in the full guide that shows multiple piped groups as a feature.
> I would say that almost seems like two different features. One allows for parallel commands with a single piped group, and the other is to handle multiple pipe groups, correct?
Yep.
> There's nothing I can find in the full guide that shows multiple piped groups as a feature.
It is not implemented yet. I am just thinking about it.
@Arkweid I suggest generalizing your proposal by considering three different behaviors: sequential, parallel, and piped. sequential would be the default and would run all tasks in series; if one fails, the next ones would be run anyway. parallel would run tasks in parallel (of course). piped would behave like sequential, but would stop at the first failing task.
Then it should be possible to separately specify a global behavior and group behaviors. As for the global behavior, without any specification (default) it would be sequential, otherwise parallel or piped.
Group behavior would customize the global specification for the given group, using (following your proposal): sequential_group = <key>, parallel_group = <key>, piped_group = <key>.
So for instance:
post-merge:
  parallel: true
  commands:
    yarn:
      run: yarn
    1_gem:
      run: bundle exec bundle check || bundle install
      piped_group: something
    2_migrate:
      run: bundle exec rails db:migrate && bundle exec rails db:test:prepare
      piped_group: something
    1_another_gem:
      run: bundle exec bundle check || bundle install
      sequential_group: something_else
    2_another_migrate:
      run: bundle exec rails db:migrate && bundle exec rails db:test:prepare
      sequential_group: something_else
With this example, lefthook would first start the yarn, gem, and another_gem tasks in parallel. Task migrate would then be run only if gem is successful, while task another_migrate would always be run after another_gem.
This would also close issue #113.
A better option would be to allow the definition of command groups, each one with its own behavior (sequential by default, or otherwise parallel or piped). Apart from fixing this issue and #113, this would also allow reusing defined groups among different git hooks.
The example above would become something like:
post-merge:
  parallel: true
  commands:
    yarn:
      run: yarn
    do_something:
      run_group: a_group
    do_something_else:
      run_group: another_group

a_group:
  piped: true
  commands:
    1_gem:
      run: bundle exec bundle check || bundle install
    2_migrate:
      run: bundle exec rails db:migrate && bundle exec rails db:test:prepare

another_group:
  commands:
    1_another_gem:
      run: bundle exec bundle check || bundle install
    2_another_migrate:
      run: bundle exec rails db:migrate && bundle exec rails db:test:prepare
For this to work for both commands and scripts, allowing control over relative order and parallelization, the scripts variable should also be removed. Scripts would simply be listed under commands; the only difference would be the runner field instead of run.
My team would also love this feature. It'd be very useful for the common case to "do a bunch of style fix-ups and checks to make sure generated files are up to date, some of which need to be done in sequence but most of which can be done in parallel".
As far as I can tell, what we're basically trying to create here is a dependency graph that lets each command know when it can safely run. CircleCI does this in a YAML file quite elegantly, I feel, with the following syntax:
workflows:
  version: 2
  build-and-test:
    jobs:
      - warm_backend_cache
      - warm_frontend_cache
      - run_cypress_tests:
          requires:
            - warm_backend_cache
            - warm_frontend_cache
      - run_other_tests:
          requires:
            - warm_backend_cache
            - warm_frontend_cache
      - lighthouseci:
          requires:
            - warm_backend_cache
            - warm_frontend_cache
      - percy/finalize_all:
          requires:
            - run_cypress_tests
Basically:
- Each job has a unique name and appears in a list of jobs to run
- By default, each job is assumed to be completely parallel-safe
- Every job can add a requires key, which lists all other jobs that must complete successfully before this job can safely run
The things I like about this are:
- It's very readable
- It's very easy to explain and keep in one's head ("all jobs run by default in parallel; if some job depends on another, list it in the requires key")
- It's very flexible
- It has a reasonable default behavior if you omit the requires key altogether
IMO, this approach is lighter weight and easier to grok than command groups or piped groups. I think that this approach could implement most of what's possible above with a little more readable syntax.
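To make the comparison concrete, here is a rough sketch of how the original post-merge hook might read if lefthook adopted a CircleCI-style requires key. This is purely hypothetical syntax: nothing in this thread indicates that lefthook supports requires, and the key name and its semantics are assumptions borrowed from the CircleCI example above.

```yaml
# HYPOTHETICAL syntax: `requires` is a proposal sketch, not a documented lefthook option.
post-merge:
  commands:
    yarn:                # no requires: free to run in parallel
      run: yarn
    gem:                 # no requires: free to run in parallel
      run: bundle exec bundle check || bundle install
    migrate:
      requires:          # hypothetical key: wait for gem to succeed first
        - gem
      run: bundle exec rails db:migrate && bundle exec rails db:test:prepare
```

Under this assumed design, the numeric prefixes would no longer be needed, since ordering would come from the dependency list rather than from command names.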
Hey! Did you know it is possible to use lefthook inside lefthook? Just like in the following example:
pre-commit:
  parallel: true
  commands:
    pipe1:
      run: LEFTHOOK_QUIET=meta,success lefthook run pipe1
    pipe2:
      run: LEFTHOOK_QUIET=meta,success lefthook run pipe2

pipe1:
  piped: true
  commands:
    1_run:
      run: echo pipe1_1
    2_run:
      run: echo pipe1_2

pipe2:
  piped: true
  commands:
    1_run:
      run: echo pipe2_1
    2_run:
      run: echo pipe2_2
Of course it may not look very cool (but I think I can add tweaks to the output settings). But it is possible :)
I don't want to bring more complexity to the lefthook configuration, because it is already quite complicated. I think this approach is the most convenient, and I don't think that loading the lefthook executable several times adds too much latency.
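Applied to the post-merge case from the top of this issue, the nested approach might look roughly like the sketch below. The hook name backend-pipe and the command name backend are made up for illustration; everything else reuses options already shown in this thread.

```yaml
# Sketch: frontend and backend run in parallel, backend steps are piped.
# `backend-pipe` is a hypothetical custom hook name chosen for this example.
post-merge:
  parallel: true
  commands:
    yarn:
      files: git diff --name-only HEAD master
      glob: '{package.json,yarn.lock}'
      run: yarn
    backend:
      # Nested lefthook call runs the piped group as a single parallel task
      run: LEFTHOOK_QUIET=meta,success lefthook run backend-pipe

backend-pipe:
  piped: true
  commands:
    1_gem:
      files: git diff --name-only HEAD master
      glob: '{GEMFILE,GEMFILE.lock}'
      run: bundle exec bundle check || bundle install
    2_migrate:
      files: git diff --name-only HEAD master
      glob: '{db/migrate/*}'
      run: bundle exec rails db:migrate && bundle exec rails db:test:prepare
```

The migration still waits for the bundle check, while yarn no longer has to wait for either backend step.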
Since this issue is rather stale, I am closing it. But feel free to open a new discussion, and maybe we will end up with a new feature or at least a solution to the problem :)
I am the original opener of this issue! I think this is a perfectly fine solution. It seems obvious now that you've posted it, which is often true of the best solutions. As long as the output is readable, I think this is totally fair.
Thank you!
Very helpful, thank you! 🙏
Maybe it is worth adding this to the docs?