cucumber-js Improve execution time by executing on multiple machines

🤔 What's the problem you're trying to solve?

We love cucumber and use it for end-to-end testing our UI. As the tests include the UI, backend and database, they are quite "heavy" and are taking more and more time. We would like to decrease the execution time, especially as they are used in our pull-request CI/CD.

✨ What's your proposed solution?

One quick way to speed up tests is to split them across multiple machines. I believe cucumber-js currently does not support this, correct? Or is there a third-party package that I did not find?

We would love that feature. Hardware keeps getting cheaper, so you can easily speed up the workflow by paying a little more money.

⛏ Have you considered any alternatives or workarounds?

Cucumber-js supports the parallel parameter, but our pull-request provider (Atlassian Bitbucket) only supports a limited number of (not very strong) machines, so we currently cannot go beyond 2 for the parallel parameter.

Jul 25 '23 13:07 MiladSadinam

@MiladSadinam thanks for raising this, you're right that it's currently not supported. parallel just uses multiple worker threads on the same machine, which has limitations for the reasons you've noted.

From what you've written, I think the kind of thing you're looking for is more like Playwright's sharding feature, where an independent run would execute a subset of test cases based on an index and count, e.g.:

# machine 1
npx cucumber-js --shard 1/3

# machine 2
npx cucumber-js --shard 2/3

# machine 3
npx cucumber-js --shard 3/3

I'd definitely be supportive of doing this as a relatively low-effort way to give people another lever for performance.

Some thoughts about how this should work (more for internal audience):

- When do we run the logic to get the subset of test cases for this shard? We should be able to do it once we have derived the pickles. Presumably then we wouldn't emit testCase messages for the ones we aren't going to run. I think we probably couldn't avoid emitting the pickle message if we wanted to (without a significant rework), since that is done by Gherkin before any of the Cucumber processing kicks in.
- Once we release this, people are (reasonably) going to want to be able to e.g. concatenate the messages from the multiple sharded runs and produce a single HTML report. I doubt we would support this right away, but we should avoid doing anything that would block it later. We've already taken some small steps towards this with e.g. making testRunStarted messages have a unique id, so it's not totally unrealistic.
- The parallel option should still work in the context of one shard.
- The usual filtering mechanisms (files, lines, tags, names) should all happen as normal before the sharding logic.
- Plugins that e.g. influence the pickle order or do further filtering should also happen before the sharding logic.
- This should be an option on the API, not just the CLI.
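
To make the idea concrete, here is a minimal sketch of what the shard-selection step could look like once filtering and ordering are done. It is only an illustration: `Shard`, `parseShard` and `selectShard` are hypothetical names, not part of the cucumber-js API, and the round-robin assignment is just one possible strategy. The input is assumed to be the already filtered and ordered list of pickles, represented here as plain strings.

```typescript
// Hypothetical sketch: none of these names are part of the cucumber-js API.
interface Shard {
  index: number // 1-based, as in "--shard 1/3"
  count: number
}

// Parse an "index/count" string such as "2/3".
function parseShard(value: string): Shard {
  const match = /^(\d+)\/(\d+)$/.exec(value)
  if (match === null) {
    throw new Error(`Invalid shard "${value}", expected e.g. "1/3"`)
  }
  const index = Number(match[1])
  const count = Number(match[2])
  if (count < 1 || index < 1 || index > count) {
    throw new Error(`Invalid shard "${value}", index must be in 1..count`)
  }
  return { index, count }
}

// Round-robin assignment: the item at position i goes to shard (i % count) + 1.
// The input is assumed to be filtered and ordered already, so every machine
// derives the same list and the shards do not overlap.
function selectShard<T>(items: readonly T[], shard: Shard): T[] {
  return items.filter((_, i) => i % shard.count === shard.index - 1)
}

// Stand-ins for pickles, made up for illustration.
const pickles = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
console.log(selectShard(pickles, parseShard('1/3'))) // [ 'a', 'd', 'g' ]
```

Round-robin keeps the shard sizes within one item of each other, but it does not account for scenario duration, so the shards may still take different amounts of time. It also relies on every machine producing the pickles in the same order.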

Jul 30 '23 15:07 davidjgoss

Thank you for your feedback. Yes, the sharding feature is exactly what I am looking for. For what it's worth, we would not need the reports to be merged.

I was wondering how I could achieve a similar result until you guys hopefully implement this. My idea would be something like:

# machine 1
cucumber-js --name "^[a-m].*$"

# machine 2
cucumber-js --name "^[n-z].*$"

I would need to tweak the boundary between the two letter ranges to get similar execution times, but otherwise it should work.
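
One thing worth checking with this workaround is that the patterns together cover every scenario exactly once. The sketch below assumes the patterns behave like ordinary case-sensitive JavaScript regular expressions; `checkPartition` is a hypothetical helper, not part of cucumber-js, and the scenario names are made up.

```typescript
// Hypothetical helper, not part of cucumber-js: for each scenario name,
// count how many of the given patterns match it.
function checkPartition(
  names: readonly string[],
  patterns: readonly RegExp[]
): { unmatched: string[]; overlapping: string[] } {
  const unmatched: string[] = []
  const overlapping: string[] = []
  for (const name of names) {
    const hits = patterns.filter((p) => p.test(name)).length
    if (hits === 0) unmatched.push(name) // would run on no machine
    if (hits > 1) overlapping.push(name) // would run on several machines
  }
  return { unmatched, overlapping }
}

// Example scenario names (made up for illustration).
const names = ['add item to cart', 'remove item', 'Login as admin', '2FA prompt']
const patterns = [/^[a-m].*$/, /^[n-z].*$/]

console.log(checkPartition(names, patterns))
// 'Login as admin' and '2FA prompt' end up in `unmatched`, because the two
// patterns only cover names starting with a lowercase letter.
```

If names can start with uppercase letters, digits or other characters, one option is to give the last machine a catch-all pattern such as `^[^a-mA-M].*$` and make the first pattern `^[a-mA-M].*$`.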

Jul 31 '23 08:07 MiladSadinam

Yep, that might be worth a try.

Jul 31 '23 18:07 davidjgoss

Just noting we could also explore doing this as a plugin, with the new plugins concept.

Aug 01 '23 10:08 davidjgoss
