cucumber-js icon indicating copy to clipboard operation
cucumber-js copied to clipboard

BeforeAll hook is executed as many workers exist (in case of -parallel option)

Open znevrly opened this issue 6 years ago • 28 comments

Hi, using Cucumber 5.0.2 with --parallel 3 option

in BeforeAll hook I have simple console.log() command.

Actual behavior console.log() is executed so many as slave exists, in this case 3x

Expected behavior
BeforeAll hook should be executed only once, so console.log has just one output

or I am missing something?

znevrly avatar Nov 09 '18 16:11 znevrly

Each child process runs in its own node process and is basically an instance of cucumber executing only a subset of the total scenarios. Thus the slaves don't have access to anything the others do so each needs to call the before all hooks.

If the child processes were all running within the same node process, then I agree before all would only need to be called once.

While on this subject, do you think the docs need to be updated to mention any part of this conversation: https://github.com/cucumber/cucumber-js/blob/master/docs/cli.md#parallel-experimental

charlierudolph avatar Nov 10 '18 23:11 charlierudolph

So with current implementation is not possible to achieve it...can be this as option as beforeAll hook is executed by first/fastest child process and then all processes wait for finishing beforeAll and then continue? This will force actual name "beforeAll", imho should not care if scenarios are executed parallely or not.

Thanks.

znevrly avatar Nov 11 '18 07:11 znevrly

Do you have a use case where it executing on all slaves causes an issue?

charlierudolph avatar Nov 15 '18 17:11 charlierudolph

IMHO there are many, our case:

  1. We have important info console output in BeforeAll hook. It just looks odd to have it multiple times on console.
  2. We store some variables used later in scenarios in BeforeAll hook and they are fetched from endpoints with heavy logic so it takes some time. Really be glad to execute that logic only once.

I still think that BeforeAll implies that is executed before executing scenarios, no matter if they are executed sequentially or parallelly. It would be really nice if it's possible to implement that way.

Thanks.

znevrly avatar Nov 15 '18 17:11 znevrly

I have the same problem: I need to execute some API calls to prepare some data in hook that I want to reuse for all tests that I run in parallel. But with current implementation each node execute the same hook that I really need to avoid.

alexandrchumakin avatar Aug 14 '19 07:08 alexandrchumakin

It would be possible to have the master run the beforeAll but then nothing thats set up can be shared with any of the slaves as they currently run in another process.

We could potentially do something like the following: the master executes beforeAll and the upon spawning the slaves passes it some user configured data that are made available in some way. Would that solve the issue?

I'm not sure how we could do this in a way though that it works the same in multi-process vs single process.


For a current workaround, you could do your own syncing where the beforeAll content only runs on one slave (use the cucumber env vars to see the slave number) and other slaves have some way of knowing when that work is done (outside of cucumber, ie cache data in some file)

charlierudolph avatar Aug 17 '19 21:08 charlierudolph

"Would that solve the issue?" Yes, that could be solution. It would be great to see it in next major version. Thank you.

znevrly avatar Oct 15 '19 10:10 znevrly

Thinking on solution design for this, I don't want the interface to be parallel specific. Thoughts on something like this:

A new interface that allows users to provide a function that returns data that is passed to world constructors. This can be a nice way to setup data you want available to all scenarios either in serial or parallel mode?

import {setWorldSharedDataBuilder} from 'cucumber'

setWorldSharedDataBuilder(() =>. {
  // can be an async function too
  // return value will be passed to World constructors as "sharedData"
  return {'my': 'data'}; 
})

In parallel mode this would only be executed on the master node and sent to each slave.

Think this needs to be separate from BeforeAll / AfterAll in case you need to setup shared resources per slave (like webdrivers)

charlierudolph avatar Mar 12 '20 03:03 charlierudolph

@charlierudolph has this been implemented at all(perhaps on a different branch?).

Specifically I would like to spin up a server in the BeforeAll that runs only on master, waits for all the children to return, then shuts down.

I've tried specifying child 0 to spin up the server but there seems to be no guarantee as tests will still be running when the server shuts down.

sethtomy avatar May 06 '20 20:05 sethtomy

Hi @sethtomy, this has not been started to my knowledge

charlierudolph avatar May 07 '20 03:05 charlierudolph

Our use case is similar but slightly different, something like @charlierudolph 's setWorldSharedDataBuilder() proposal would solve it, but there would need to be a counterpart for after all child processes have finished.

I am using AfterAll() to do some custom processing of total results and notifications. E.g. 1 failed and 12 passed. This doesn't work with --parallel. Parallelization in cucumber-js is really important for our test suite because we use Ghost Inspector's on-demand API to execute steps in the After() pseudo-step (after each scenario). (This means individual steps never fail. Scenarios can only fail in the after-pseudo-step, as far as cucumber-js can tell.) Ghost Inspector supports very high parallelization. So I run our cucumber-js test suite with --parallel=99. (We don't have that many scenarios, so each scenario effectively gets its own process.) This works wonderfully, other than the notifications bug and incompatibility with --parallel for notifications.

AfterAll and BeforeAll are good names for the callback in the master process. I'd argue it is a bug that they are invoked for each child process. However if we can't change/fix that for backwards compatibility reasons, we could take advantage of the functions' existing options parameter. E.g.

AfterAll({ afterEachChild: false }, function () {
  // This is executed once, after all child processes have finished.
}}

AfterAll({ afterEachChild: true }, function () {
  // This is executed after EACH child process has finished. This is the current behaviour.
}}

BevanR avatar May 19 '20 02:05 BevanR

@charlierudolph Any thoughts on the above interface?

We are close to needing this enough to either contribute a fix, hack/fork to fix for ourselves, or put a bounty out. We'd much prefer to contribute an acceptable pull request.

BevanR avatar Jun 24 '20 02:06 BevanR

For custom processing of results, I'd suggest that be done in a separate process instead of being a part of cucumber. Otherwise, I think your proposal of adding an option makes sense. I would prefer something more like the following though:

import { AfterAll, HookParallelMode } from 'cucumber';

AfterAll(function () {
  // In parallel mode, executed on master and each slave (for backwards compatibility for now, good to change the default down the line or require parallel mode to be set when executing with parallel option)
});

AfterAll({ parallelMode: HookParallelMode.MASTER_ONLY  }, function () {
  // In parallel mode, executed only on master
});

AfterAll({ parallelMode: HookParallelMode.SLAVE_ONLY  }, function () {
  // In parallel mode, executed only on each slave
});

AfterAll({ parallelMode: HookParallelMode.MASTER_AND_SLAVE }, function () {
  // In parallel mode, executed on master and each slave
});

charlierudolph avatar Jun 24 '20 16:06 charlierudolph

In light of recent events I'd like to propose we adopt a different terminology than master/slave. My suggestion: primary/secondary.

It's all over the Internet, but here is a place to start: https://www.inputmag.com/culture/github-others-are-replacing-racist-terms-like-master-slave

aslakhellesoy avatar Jun 24 '20 16:06 aslakhellesoy

I quite like hive/worker for the parallel thing

davidjgoss avatar Jun 24 '20 17:06 davidjgoss

Many of the alternatives are also use socially charged terms. I think my favourite is minion!

There is merit when you replace a negatively charged term with something that involves a bit of levity.

Or just worker if we wanted something more neutral. hive doesn't make sense to me for the coordinator. queenbee would make more sense, but I don't think it is a good idea. coordinator is a good neutral term that describes the responsibilities clearly.

BevanR avatar Jun 24 '20 22:06 BevanR

@charlierudolph Would you mind elaborating a little more on this idea please?

I'd suggest that be done in a separate process instead of being a part of cucumber

BevanR avatar Jun 24 '20 22:06 BevanR

I like coordinator / worker. I'll make updates to the code for that.

I'd suggest that be done in a separate process instead of being a part of cucumber

For this, I was thinking essentially something like a shell script that runs cucumber and after it exits, use another program for parsing the results (not totally sure what you are processing)

charlierudolph avatar Jun 25 '20 04:06 charlierudolph

The details we need in our results (URLs for Ghost Inspector result pages) are not available in Cucumber results objects. (Is the Cucumber result schema is extensible?) These URLs become available in the After step.

The approach you describe would require storing the URLs somewhere in each after step, then reading it elsewhere once cucumber is complete. I was hoping to avoid that. But it is the workaround we have in mind if we can't easily contribute a solution to this in Cucumber JS.

BevanR avatar Jun 25 '20 04:06 BevanR

@charlierudolph Would you (in principle) accept a pull request that implements your proposal from a few comments ago, but with coordinator/worker instead of master/slave?

I.e.

import { AfterAll, HookParallelMode } from 'cucumber';

AfterAll(function () {
  // In parallel mode, executed on master and each slave (for backwards compatibility for now, good to change the default down the line or require parallel mode to be set when executing with parallel option)
});

AfterAll({ parallelMode: HookParallelMode.COORDINATOR_ONLY  }, function () {
  // In parallel mode, executed only on the coordinator, after all workers have completed.
});

AfterAll({ parallelMode: HookParallelMode.WORKER_ONLY  }, function () {
  // In parallel mode, executed only on each worker after it has completed.
});

AfterAll({ parallelMode: HookParallelMode.COORDINATOR_AND_WORKER }, function () {
  // In parallel mode, executed on the coordinator after all workers have completed, and on each worker after it has completed.
});

// In serial mode, these are all equivalent; each callback is executed once all scenarios have completed.

BevanR avatar Jul 14 '20 00:07 BevanR

@charlierudolph Would you (in principle) accept a pull request that implements your proposal from a few comments ago, but with coordinator/worker instead of master/slave?

Yep

charlierudolph avatar Jul 14 '20 02:07 charlierudolph

Hi guys @charlierudolph There is an update about this?

fescobar avatar Jul 22 '21 09:07 fescobar

Hi guys @charlierudolph There is an update about this?

As far as I know, no. It seems no PR has been submitted yet :(

aurelien-reeves avatar Jul 22 '21 09:07 aurelien-reeves

Hi guys. I have the same issue - clean up some folders with test artifacts in BeforeAll / AfterAll hooks and got the following error - unlinking / deletion of the same folders / files occurred in different workers in the same time:

VError: Unexpected error on worker.receiveMessage: a BeforeAll hook errored on worker 0, process exiting: e2e\step_definitions\_playwright\support\common-hooks.ts:65: EPERM: operation not permitted, unlink 'C:\testautomation\e2e\test_artifacts\logs\TestAutomation.log'
    at exit (C:\testautomation\node_modules\@cucumber\cucumber\src\runtime\parallel\run_worker.ts:8:38)
    at C:\testautomation\node_modules\@cucumber\cucumber\src\runtime\parallel\run_worker.ts:22:9
caused by: VError: a BeforeAll hook errored on worker 0, process exiting: e2e\step_definitions\_playwright\support\common-hooks.ts:65: EPERM: operation not permitted, unlink 'C:\testautomation\e2e\test_artifacts\logs\TestAutomation.log'
    at Worker.runTestRunHooks (C:\testautomation\node_modules\@cucumber\cucumber\src\runtime\run_test_run_hooks.ts:32:19)
    at Worker.initialize (C:\testautomation\node_modules\@cucumber\cucumber\src\runtime\parallel\worker.ts:93:5)
    at Worker.receiveMessage (C:\testautomation\node_modules\@cucumber\cucumber\src\runtime\parallel\worker.ts:113:7)

Do you have any updates? From my point of view in some cases it makes no sense to run BeforeAll / AfterAll hooks for every parallel run of the same test suite :)

RockMinsk avatar Jul 08 '22 17:07 RockMinsk

I'm using if (process.env.CUCUMBER_WORKER_ID === '0') { ... } condition for actions in BeforeAll / AfterAll hooks that should be executed once (documentation - https://github.com/cucumber/cucumber-js/blob/main/docs/parallel.md) Thanks )

P.S. or even better to use if (!+process.env.CUCUMBER_WORKER_ID) { ... } condition because if you don't use parallel runs process.env.CUCUMBER_WORKER_ID = undefined

RockMinsk avatar Jul 08 '22 19:07 RockMinsk

I'm using if (process.env.CUCUMBER_WORKER_ID === '0') { ... } condition for actions in BeforeAll / AfterAll hooks that should be executed once

Note, this is not guaranteed to work (ie other workers won't necessarily wait for it to finish if its a async process).

charlierudolph avatar Jul 13 '22 03:07 charlierudolph

Agreed, I also use pause for other workers while actions that are common to all workers are completed in BeforeAll hook.

RockMinsk avatar Jul 13 '22 06:07 RockMinsk

As a workaround, when the worker is 0 I create a specific file with a specific name (empty file). And when the worker is different than 0 I wait for that file to be created. In that way, the rest of the workers will have to wait until worker 0 finished creating that file.

fescobar avatar Jul 13 '22 22:07 fescobar

Hi @RockMinsk and @fescobar,

How have you stopped or paused the other worker process while the first worker is run in BeforeAll hook?

Thank you in advance.

enespekkaya avatar Apr 27 '23 12:04 enespekkaya

@enespekkaya In my case, the first worker process is responsible for creating the file. Meanwhile, other processes check if that file exists, if the file doesn't exist, that process keeps checking until the first worker creates that file. That verification is made every X seconds.

fescobar avatar Apr 27 '23 15:04 fescobar