ginkgo
ginkgo copied to clipboard
Add support for `BeforeAll` in unordered containers
In certain cases, there is some setup which needs to be done for a container and is used for all specs in that container.
Right now the options are either marking that container as Ordered
, which prevents parallelizing or randomizing the container specs, or running the setup in BeforeEach
which will cause it to run once for every spec. In some cases BeforeEach
is not a good option, e.g. when the setup takes non-trivial time to run or requires some resources from external systems.
Please add support for defining BeforeAll
/AfterAll
blocks for non-ordered containers.
hey @tbrisker - this is something i'm considering. Ordered
containers came about in part because Ginkgo-users were asking for a BeforeAll
-like behavior and in part because folks wanted to break large It
s into smaller pieces.
I appreciate there are some cases where a set of specs are independent save for some shared reusable setup and teardown that is too expensive to perform for each spec. Such specs can be randomized and parallelized.
There is some degree of complexity here - akin to the complexity that led to the development of SynchronizedBeforeSuite
/SynchronizedAfterSuite
- if information must flow from the process that performs the singleton setup to all the other processes that consume it (e.g. the id of the resource that was created) then a mechanism for communicating and synchronizing that information needs to be provided to the user. There's also complexity around keeping track of the lifecycle of this class of specs. Only the first spec should run the setup (that's easy enough) but the setup should only be torn down when all associated specs have completed. In a context of parallelization and randomization - this can happen at the very end of the suite. In fact, this can begin to open up some potentially tricky-to-reason about resource contention issues. Imagine a suite comprised of many such singleton resources. There will exist some parallel+random orders that could see a small fraction of these singleton resources at play at once... and some that would see a large fraction of them at play. That could make the overall resource usage a bit trickier to reason about for the end-user (today it's relatively straightforward: the number of parallel processes as a decent proxy for the amount of parallel load on the system under test. in a world where some resources survive across multiple unrelated specs while waiting for the last associated spec to run the load can be harder to reason about).
Requiring Ordered
specs was as simplifying assumption that solved all these issues. There's no need for synchronization and no concern of wildly varying loads as the related set of specs are guaranteed to run on the same Ginkgo process in series.
None of this is to say I'm not open to solving for this problem. Just that there is some complexity here and I can imagine some of the future issues that users will open when they bump into these things!
Another consideration is whether the current imperative model (do this setup BeforeAll
and then this teardown AfterAll
) is the best approach for this burgeoning complexity. I can imagine other models (though I admit I have not fleshed them out - so they may well be non-starters). For example, a spec can specify which resource it needs and a separate resource manager could be assigned that can ensure that resource exists for the lifecycle of the spec. Such managers could then be configured in different ways - there could be Recreate
managers that recreate the resource for each spec that needs it (equivalent to BeforeEach/AfterEach
) or Reuse
managers that lazily create the resource but then only destroy it when all associated specs are finished (or, more simply, when the suite ends) - this would be more equivalent to BeforeAll/AfterAll
.
I'm happy to continue the conversation here with anyone who's interested. To be transparent: with the kids out of school soon I'm unlikely to get to this until the fall - so there's time to think about approaches.
A BeforeAll
/AfterAll
block for unordered would be super helpful for people to reason their test execution, but for our use cases we were able to move that logic to the SynchronizedBeforeSuite/SynchronizedAfterSuite
functions with some little test refactor. Luckily we didn't need nested BeforeAll
blocks.
@tbrisker have you looked at SynchronizedBeforeSuite
and would that help your use case?
@onsi I appreciate the deep explanation, I'm sure it isn't a trivial thing to implement. Unfortunately I'm not familiar enough with the internals of ginkgo to be able to suggest how to approach it or propose a fix myself.
@ebabani thanks, we have test suites which run hundreds of tests grouped in multiple containers according to domains, we do have BeforeSuite
/AfterSuite
for global setup but there are some things which are only relevant for a specific group of test cases.
Unfortunately I'm not familiar enough with the internals of ginkgo to be able to suggest how to approach it or propose a fix myself.
no worries, @tbrisker - rather than dive too deep into solutions together I think it would be most helpful for me to continue to explore the problem so that I can wrap my head around if/what would make sense to implement.
we have test suites which run hundreds of tests grouped ... there are some things which are only relevant for a specific group of test cases.
Those are interesting facts. Some questions:
- With what degree of parallelism do you run (i.e. what do you set
ginkgo -procs=N
to?)? - How many of these groups of specs could/do use a
BeforeAll
but are in fact internally parallelizable? (50%? all of them? 7 of them?) - Are you using
Ordered
containers for all of these today? Or do some of them use aBeforeEach
and pay the performance hit of repeated setup and teardown of the same resource for each spec? - How important is it that the resources used by a group are cleaned up when all the specs in the group complete? What would happen if cleanup was, instead, deferred to the end of the suite?
- Is part of the problem you are trying to solve improved performance (e.g. are you actually experiencing this: "we are running with
N=20
but end up with just five processes running some bigOrdered
Describes at the end. It would be great if we could get all 20 workers to participate on those specs since we know they're actually parallelizable")?
No doubt I'll come up with more - but it would be helpful to understand the contours of the problem in some more detail so I can wrap my head around it. There may be solutions that are in reach today.
With what degree of parallelism do you run (i.e. what do you set ginkgo -procs=N to?)?
At the moment we don't run them in parallel at all, as they weren't written with parallelism in mind (e.g. some global state that is not thread-safe). It's a long term goal to get there in the future.
How many of these groups of specs could/do use a BeforeAll but are in fact internally parallelizable? (50%? all of them? 7 of them?)
The majority of the specs that use Ordered
only do it to enable using BeforeAll
, probably around 80% of them could be parallelized.
Are you using Ordered containers for all of these today? Or do some of them use a BeforeEach and pay the performance hit of repeated setup and teardown of the same resource for each spec?
In some cases we're using Ordered+BeforeAll, in others we take the hit of Before/AfterEach for every spec.
How important is it that the resources used by a group are cleaned up when all the specs in the group complete? What would happen if cleanup was, instead, deferred to the end of the suite?
This might cause unexpected results in other groups, I think it depends on the specific case. Also, for some it would mean longer time some resources exists could cause quota or cost issues (e.g. cloud resources)
Is part of the problem you are trying to solve improved performance (e.g. are you actually experiencing this: "we are running with N=20 but end up with just five processes running some big Ordered Describes at the end. It would be great if we could get all 20 workers to participate on those specs since we know they're actually parallelizable")?
Long-term, that would be beneficial, though we still have quite a bit of work before we get there. Short-term, this is more about enabling some common setup to be reused for all relevant specs instead of repeated creation+deletion for every spec without requiring them to be ordered.
We can live (and have been) with ordered tests, but it's causing confusion wrt the intent (i.e. the group being marked as ordered despite the order not actually being relevant), and also prevents randomizing the spec order and caused the whole group to fail if one of the specs fail (although this has been solved by ContinueOnFailure
, thanks for adding that!).
We're facing similar problems as described where we currently use k8s.io/kubernetes package in our tests which uses SynchronizedBeforeSuite
, so we need to rely on BeforeAll which makes our tests serial. Having BeforeAll functionality for Unordered containers would certainly be helpful.
In case there's interest, please take a look at the new proposal for managing shared resources and having finer-grained control over parallelism here: https://github.com/onsi/ginkgo/issues/1292