Custom function for focused specs
Feature Request: Is there a way to pass a custom function for determining focused specs over just the regex filter? We'd like to run a custom set of specs depending on certain parameters, and regex matching isn't sophisticated enough for those conditions. I noticed that there is the programmatic focus, but that seems to only be determined at compile time rather than runtime.
There is programmatic `Skip()`, for instance:
```go
BeforeEach(func() {
	if !precondition() {
		Skip("precondition not met")
	}
})
```
Is that any good?
I apologize in advance for the novella, but I wanted to provide some context
The e2e tests in kubernetes/kubernetes use ginkgo and have evolved organically over time. I think the kubernetes project has generally struggled with how to specify and use test metadata. The solution that is used most pervasively throughout the codebase is to use well-known tags within the test name (either at the context or individual It level).
Tags are used primarily in two ways:
- Categorize the qualities of the test (e.g. `[Slow]`, `[Serial]`, `[Disruptive]`, `[Flaky]`)
- Categorize the suite of the test (e.g. `[Feature:Foo]` to test some feature that isn't usually enabled by default, `[Conformance]` to be included in the conformance suite, `[sig-node]` to be included in the suite of all tests that the “sig-node” group is responsible for, etc.)
e.g. [k8s.io] [sig-node] NoExecuteTaintManager Single Pod [Serial] removing taint cancels eviction [Disruptive] [Conformance]
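To make the pattern concrete, here is a minimal sketch of how these stringly-typed tags end up embedded in spec definitions (the package name and test body are placeholders; only the tag convention is taken from the example name above):

```go
package e2e

import (
	. "github.com/onsi/ginkgo"
)

// The metadata lives inside the Describe/It strings, so the only way to
// select or exclude this spec is a regex over the concatenated name,
// e.g. --ginkgo.skip='\[Serial\]'.
var _ = Describe("[k8s.io] [sig-node] NoExecuteTaintManager Single Pod [Serial]", func() {
	It("removing taint cancels eviction [Disruptive] [Conformance]", func() {
		// ... test body ...
	})
})
```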
I have come to refer to this solution as “stringly-typed” test names.
In practice this means:
- updates to test metadata are equivalent to renaming the tests. Our supporting test infrastructure doesn't understand the concept of tests getting renamed, so we lose history anytime we change metadata
- test names become increasingly illegible as more metadata is added, so we are discouraged from adding arbitrary metadata
- the regexes used for focus/skip can get pretty gnarly (but we haven’t hit a length limit thus far)
- there are always skipped tests, which leads to uncertainty over whether we ran everything we expected to (e.g. auto-detection vs. improperly configured system under test)
We’ve lived with this so far, but a new requirement is making us reconsider this approach: the ability to categorize a test (a single `It` block) as covering up to N different behaviors. Our goal is to be able to take a set of behaviors and a set of tests and compute things like the following (a rough sketch of the computation appears after the list):
- Which behaviors are covered by a given set of tests
- Which set of tests to execute to exercise a given set of behaviors
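Assuming we had behavior metadata available as plain data, the computation we have in mind is roughly this (all names here are hypothetical; none of this is ginkgo API):

```go
package coverage

// BehaviorsCoveredBy returns the union of behaviors exercised by the given
// tests, given a hypothetical map from test name to the behaviors it covers.
func BehaviorsCoveredBy(testBehaviors map[string][]string, tests []string) map[string]bool {
	covered := map[string]bool{}
	for _, t := range tests {
		for _, b := range testBehaviors[t] {
			covered[b] = true
		}
	}
	return covered
}

// TestsExercising returns the tests that exercise at least one of the wanted
// behaviors.
func TestsExercising(testBehaviors map[string][]string, wanted map[string]bool) []string {
	var selected []string
	for test, behaviors := range testBehaviors {
		for _, b := range behaviors {
			if wanted[b] {
				selected = append(selected, test)
				break
			}
		}
	}
	return selected
}
```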
As I said above, we’re using ginkgo for e2e/integration style tests. For example, an e2e test may take the approach of creating a resource in a kubernetes cluster, updating it N different ways, and deleting it, all within a single `It` block. A number of behaviors are exercised throughout the flow of this test, but the description in the `It` block can’t adequately express this. Ginkgo documentation suggests using `By` blocks for this, but I can’t tell that they do anything more than log, so they don’t provide info to us prior to running tests.
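A minimal, hypothetical sketch of that shape of test; the `By` annotations describe the steps but only surface in output while the test runs, so they aren't available as metadata beforehand:

```go
package e2e

import (
	. "github.com/onsi/ginkgo"
)

var _ = Describe("Widget lifecycle", func() {
	It("creates, updates, and deletes a widget", func() {
		By("creating the widget")
		// create the resource in the cluster ...

		By("updating the widget N different ways")
		// exercise several update behaviors ...

		By("deleting the widget")
		// clean up ...
	})
})
```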
At present, I’m not entirely sure how to make this new requirement fit with the existing corpus of tests we have, and ginkgo as it’s currently written. It’s making me question the validity of the requirement and the validity of how we’re using ginkgo.
I think the “Skip in BeforeEach” approach doesn’t work for us because we would want to be able to skip/focus depending on which `It` we’re about to call.
Some options we have considered:
1. Embed the list of behaviors in the `It` block as another set of tags. This will push test names well past the point of legibility.
2. Embed the list of behaviors in structured comments above our `It` blocks, generate a metadata file from those comments at build time that associates them with the full Context+It test name, and generate `ginkgo.skip`/`ginkgo.focus` regexes at runtime based on the metadata. This is a rube-goldberg contraption that will further complicate how we run e2e tests.
3. Wrap the ginkgo DSL with functions that take a metadata struct (including behaviors), and defer calling the ginkgo DSL until runtime, after metadata criteria have been evaluated (a rough sketch of this follows the list). This requires touching a lot of code, and we lose line info.
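For illustration, a rough sketch of what option 3 might look like, under the assumption that we write our own wrapper; `SpecMetadata`, `shouldRun`, and `ItWithMetadata` are all hypothetical, not ginkgo API:

```go
package e2e

import (
	. "github.com/onsi/ginkgo"
)

// SpecMetadata is a hypothetical struct carrying the behaviors (and any other
// tags) a spec covers.
type SpecMetadata struct {
	Behaviors []string
	Serial    bool
	Slow      bool
}

// shouldRun would evaluate the metadata against whatever runtime criteria we
// care about (requested behaviors, suite configuration, etc.).
func shouldRun(md SpecMetadata) bool {
	// ... evaluate md against runtime configuration ...
	return true
}

// ItWithMetadata only hands the spec to ginkgo if the metadata criteria are
// met. The downside noted above: reported line info points here, not at the
// caller.
func ItWithMetadata(text string, md SpecMetadata, body func()) bool {
	if !shouldRun(md) {
		return false // skip registering the spec entirely
	}
	return It(text, body)
}
```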
None of these approaches seem great, though FWIW it looks like we're headed towards 2. Any suggestions?
hey @spiffxp thanks for the detailed description. I know this has been a long-standing issue for kubernetes and I can't imagine the amount of tech debt it's generated. I'd like to invest some time designing a more ideal interface to help with this categorization/organization problem. What would it look like to get some time (I'm thinking ~an hour on zoom sometime in the next week - I think a synchronous conversation would be super productive) with a group of k8s maintainers who experience this pain most acutely (e.g. is there a particular sig that owns e2e) so we can better characterize the problem - and then consider options for (potentially deeper) changes to Ginkgo that would help?
I expect first class support for tags is the answer but I want to make sure I understand the problem more deeply before proposing a solution. In particular, it sounds like y'all would benefit from having a lot of programmatic support for manipulating the set of tags and filters vs. ginkgo trying to do too much on the command-line?
For the immediate term, one thing to note is that `CurrentGinkgoTestDescription()` (which you can call in a `BeforeEach`) includes the `FullTestText`, which will have all the strings you need to make a runtime `Skip()` decision.
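A small sketch of that suggestion, assuming a hypothetical `wantSerial` flag as the runtime condition:

```go
package e2e

import (
	"strings"

	. "github.com/onsi/ginkgo"
)

var wantSerial = false // hypothetical runtime configuration

var _ = Describe("taint manager", func() {
	BeforeEach(func() {
		// FullTestText contains the concatenated Describe/Context/It strings,
		// so any tag embedded in the name can drive a runtime Skip.
		text := CurrentGinkgoTestDescription().FullTestText
		if strings.Contains(text, "[Serial]") && !wantSerial {
			Skip("skipping [Serial] specs in this run")
		}
	})

	It("removing taint cancels eviction [Serial]", func() {
		// ... test body ...
	})
})
```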
Longer term, it might be valuable for me to take a look at the e2e tests (which, I assume, are gargantuan at this point) and see if there are other patterns/best practices that Ginkgo could provide to help.
Thoughts? Let me know if there's a related issue over in the k8s repos that this conversation would make sense in.
gentle bump on this @spiffxp - any thoughts on a deeper dive?
Apologies for not responding sooner. This drowned in github notifications.
First class support for tags sounds like the answer to me. There are some other gripes with ginkgo I could solicit if you're interested, but they may be tangled up with the organic blob of code that is our e2e testing "framework".
Kubernetes SIG Testing has a subproject called "testing commons" which is where most people interested in the e2e framework show up, their next meeting would be Friday Jul 10 @ 1pm PT. If you're interested in showing up there I can let people know. If that doesn't work my e-mail is in my github profile and we can try to sort something out.
I'm going through old issues and noting that a beta for Ginkgo V2 was just released. It includes support for test labels which y'all can try out.
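For reference, a minimal sketch of what V2 labels look like (label names here are illustrative; details may have shifted since the beta):

```go
package e2e

import (
	. "github.com/onsi/ginkgo/v2"
)

// Labels are first-class metadata rather than strings baked into the spec
// name, and can be selected on the command line, e.g.:
//   ginkgo --label-filter="sig-node && !serial"
var _ = Describe("NoExecuteTaintManager Single Pod", Label("sig-node", "serial"), func() {
	It("removing taint cancels eviction", Label("disruptive", "conformance"), func() {
		// ... test body ...
	})
})
```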