
Introduction of testCaseEvaluation subject

Open thompson-tomo opened this issue 7 months ago • 11 comments

Looking at the testing events, I am having a hard time mapping them to the events I would expect to see from my test projects. It seems that a level is missing.

  • Test Suite: my project of tests which is run
  • Test Case: a test I have written in my test suite.

My difficulty:

  • how can I describe a test case which is called multiple times with different input arguments? Maybe testCaseRun.Id plays a role here?
  • how can I describe a test case which is called multiple times with the same input arguments? Maybe testCaseRun.Id plays a role here?
  • how can I describe a test case which is called multiple times with the same input arguments but runs occur in different config? Maybe testCaseRun.Id plays a role here?

Yes, on the surface it might be possible to jam a lot of context into testCaseRun.Id, however I think it would be more efficient & reliable if we were to introduce a testCaseEvaluation subject with the corresponding started/finished events.

These events should include

  • Id
  • source
  • Arguments (dictionary)
  • testCaseRun
  • IterationCount
  • InvocationCount

With these additional events we would easily have insight into scenarios where a test case is invoked many times as part of a single test case run.

thompson-tomo avatar Jun 08 '25 16:06 thompson-tomo

Looking at the testing events, I am having a hard time mapping them to the events I would expect to see from my test projects. It seems that a level is missing.

  • Test Suite: my project of tests which is run
  • Test Case: a test I have written in my test suite.

My difficulty:

  • how can I describe a test case which is called multiple times with different input arguments? Maybe testCaseRun.Id plays a role here?

We don't model test input arguments today. A parameterised test executed with a different set of inputs is, in fact, a different test. It's up to you if you want to track it in your events as a different test or the same one.

  • how can I describe a test case which is called multiple times with the same input arguments?

Each testCaseRun includes the testCase object. A testCase corresponds to your test definition, while a testCaseRun corresponds to one execution of that testCase. You may have as many executions of the same testCase as you'd like, and they may belong to the same testSuiteRun.
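As an illustration, here is a minimal sketch of a testcaserun.started subject for one execution of a testCase inside a testSuiteRun (context fields omitted; all IDs and names are made up, and the exact field set should be checked against the spec version you use):

    {
      "subject": {
        "id": "login-test-run-1",
        "source": "/my/test/runner",
        "type": "testCaseRun",
        "content": {
          "testCase": { "id": "login-test", "name": "Login test", "type": "functional" },
          "testSuiteRun": { "id": "suite-run-42", "source": "/my/test/runner" }
        }
      }
    }

A second execution of the same test would look identical except for the subject id (e.g. login-test-run-2), while testCase and testSuiteRun stay the same.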

Note that a testCase object does not have events associated with it. Events for a testCase could be something like defined, updated, deleted... However, so far we have found no one interested in tracking the lifecycle of the test definitions themselves; what people usually want to track is the execution of the test cases.

Maybe testCaseRun.Id plays a role here?

The id of the testCaseRun identifies a specific execution of a test case. It must be unique within your source of test events, i.e. if the same test is executed twice, it will generate testCaseRun events with different IDs.

  • how can I describe a test case which is called multiple times with the same input arguments but runs occur in different config?

Assuming "different config" means different target environments (or the same one configured differently, which at the end can be seen as a different one), the answer is that testCaseRun includes an environment field that you can use to specify which environment the run was executed against.

Maybe testCaseRun.Id plays a role here?

Yes, on the surface it might be possible to jam a lot of context into testCaseRun.Id, however I think it would be more efficient & reliable if we were to introduce a testCaseEvaluation subject with the corresponding started/finished events.

These events should include

  • Id
  • source
  • Arguments (dictionary)
  • testCaseRun
  • IterationCount
  • InvocationCount

With these additional events we would easily have insight into scenarios where a test case is invoked many times as part of a single test case run.

In the current model, a testCaseRun is the execution of a single test. If you want to execute the same test multiple times, you can express this via events with multiple testCaseRuns. If you want to group your testCaseRuns you can use the same testSuiteRun for all of them.

If you want to have more levels of grouping, you could model nested testSuites, and generate events as follows:

parent testSuiteRun start
  -> child 1 testSuiteRun start -> test 1...N start/end -> child 1 testSuiteRun end
  -> child 2 testSuiteRun start -> test 1...N start/end -> child 2 testSuiteRun end
parent testSuiteRun end

To better express the parent/child relationship on the CDEvents side, you could either use links, a chain ID, or we could introduce a parentTestSuiteRun field in the testSuiteRun events.
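To make the last option concrete, this is a hedged sketch of what a child testsuiterun.started subject could look like if such a field were introduced; parentTestSuiteRun does not exist in the spec today, it only illustrates the proposal (JSONC-style comment added as annotation):

    {
      "subject": {
        "id": "child-1-suite-run",
        "type": "testSuiteRun",
        "content": {
          "testSuite": { "id": "integration-tests", "name": "Integration tests" },
          "parentTestSuiteRun": { "id": "parent-suite-run-7", "source": "/my/test/runner" }  // hypothetical field, not in the current spec
        }
      }
    }

Links or a chain ID would achieve the same grouping without changing the subject schema.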

afrittoli avatar Jun 09 '25 14:06 afrittoli

Thanks for the reply @afrittoli. It seems there is ambiguity in what the different terms mean, which isn't helping the scenario, and that includes ambiguity on my part as well.

A parameterised test executed with a different set of inputs is, in fact, a different test.

Note that this appears to differ from NUnit/JUnit, where the definition of a test is the method being called.

How about the following as a solution:

  • Test cases gain an optional tests object. This enables all test cases, and by extension all test case runs, to be associated with the test that was written.
  • Test cases gain the ability to specify a dictionary of parameters which describe the conditions provided to the test.
  • Test case runs gain both an IterationCount and an InvocationCount property (see the sketch after this list).
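A hedged sketch of what a testCaseRun subject could look like under this proposal; everything beyond the existing id/name/testCase fields is hypothetical and only illustrates the idea, it is not part of the current spec (JSONC-style comments added as annotation):

    {
      "subject": {
        "id": "login-test-admin-run-3",
        "type": "testCaseRun",
        "content": {
          "testCase": {
            "id": "login-succeeds-for-admin",
            "name": "Login succeeds for admin",
            "test": { "id": "LoginTests.Login_Succeeds" },        // hypothetical: the test (method) that was written
            "parameters": { "role": "admin", "mfa": "enabled" }   // hypothetical: dictionary of inputs/conditions
          },
          "iterationCount": 3,                                    // hypothetical counter from the proposal above
          "invocationCount": 1                                    // hypothetical counter from the proposal above
        }
      }
    }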

We would also work on the definitions of the common objects to include examples, i.e.:

  • Test suite: a collection of test cases
  • Test case: the inputs which are being used against the specified test
  • Test: a method or functionality which is being checked

With this, no additional events are created, the scope of a test case is clarified, and we cater for test cases which are repeated by design.

thompson-tomo avatar Jun 10 '25 00:06 thompson-tomo

@thompson-tomo The Testkube team introduced, IIRC, the TestCaseRun & TestSuiteRun. In Testkube the granularity is different: a test case != a unit test case (a function), but rather a run of tests (like make test or a Postman collection suite, ...). A test suite helps to group tests of various natures ("unit, then integration"), like a suite of blocks in your CI tool. See https://docs.testkube.io/articles/defining-tests

To get the granular results of individual tests, you have to look at the report/log of the test execution.

@afrittoli , maybe the granularity should be clarified in the specs?

davidB avatar Jun 10 '25 15:06 davidB

@davidB it is interesting to see that testkube has deprecated/EOL'd the concept of test cases/suites, as per testkube.

Looking at the Postman docs, I see the collection as your test suite, which is what Postman describes via:

Create test suites

Organizing your requests into Postman Collections enables you to run and automate a series of requests.

I think a key thing should be what the events are being used for. In this case I would say the purpose of the testing events is to enable platforms to transfer the status of tests and either visualise it or possibly trigger the raising of incident tickets for failing tests to be investigated.

thompson-tomo avatar Jun 11 '25 04:06 thompson-tomo

I agree; the key thing is the possible usage of the events. The granularity by test/test case is too small, IMO. I can't imagine a "reaction" for each failing or successful test (reporting a ticket per failing test will generate noise, and often during investigation, having information about other tests is helpful).

So maybe, we should remove the TestCaseRun event and only keep TestSuiteRun (maybe with some additional information: SUT, link to details & result, summary):

  • To be able to use it to trigger the next step, for example:
    • On success: next test suite, publishing, deployment, promotion, ...
    • On failure: raise an alert, open a (or more) ticket (with complementary information, capture logs, ...)
  • To have metrics (date+time, number of success / failure / skip / disabled / ...)
  • To let a dedicated system manage details (per test result, log,...)

davidB avatar Jun 16 '25 07:06 davidB

I believe there is value in having a hierarchy of some kind, and the current TestCase/TestSuite serves that purpose. If there is a need for a deeper hierarchy, we can make it possible to have nested TestSuites.

Test events can be used to build evidence needed to trigger other steps in the workflow or for auditing purposes - how this information is structured may vary depending on the system under test and the test cases. A go or python based microservice may have hundreds of unit tests, and we're unlikely to send events for each of them - there are other tools better suited for that level of granularity. However, unit tests, linting, type checks, and integration tests could each be seen as an individual test within the "CI test suite". You could have a security test suite, etc., and you could have rules based on which tests pass out of the various test suites.
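For instance, a hedged sketch of how a linting step could be reported as one testCaseRun finishing inside a "CI" testSuiteRun (IDs, names, and the type/outcome values are illustrative and should be checked against the spec version in use):

    {
      "subject": {
        "id": "lint-run-1",
        "type": "testCaseRun",
        "content": {
          "testCase": { "id": "lint", "name": "Static analysis / linting", "type": "other" },
          "testSuiteRun": { "id": "ci-suite-run-7", "source": "/my/ci" },
          "outcome": "pass"
        }
      }
    }

A rule engine could then react to the finished events of the suite, or of individual testCaseRuns, to decide whether to proceed with publishing or deployment.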

Tests may look very different from the python unit tests, some systems require manual tests, and some type of tests may require hours or days to perform. Some tests may be performed in production over several days, for example, by enabling feature X for 10% of our users and tracking the results.

afrittoli avatar Jun 16 '25 08:06 afrittoli

chiming in with @afrittoli - this was originally modeled for any kind of test - not just unit-tests.. so for example a performance test with k6 could be modeled as a single testcaserun - while it could be included in a testsuiterun for a complete system test that also runs security tests, e2e tests, infrastructure tests, etc..

olensmar avatar Jun 16 '25 08:06 olensmar

I agree with @afrittoli about needing some hierarchy and maintaining the current level of events. However, the idea of nesting suites seems counter-intuitive to me. This is what led to the new proposal of optionally adding the definition of the test into the test case.

I also agree that a CI/Code quality test suite could exist which runs multiple linting/quality checks.

I think a key part of our definitions should be that:

  • Test case: a collection of checks which produces a pass/fail/skipped result based upon inputs.
  • Test suite: a collection of test cases which are to be performed.

I agree that an incident ticket might not be created; however, it could produce a list of failed tests in the CI/CD tool.

thompson-tomo avatar Jun 16 '25 10:06 thompson-tomo

For me, it's the opposite. What is confusing is to say that a test case is a collection. A TestSuite is a collection of one or n tests (related to unit, performance, lint, ...) or maybe of other collections. IMHO, TestSuite aggregates/summarizes the results of its "children" to allow the consumers to process it. Aggregating at the consumer level is a complex pain (it needs more information, ...).

We seem to agree on the usage/reaction for those events and that the "test/rule" granularity is too low. I'm in favor of NOT overriding the usual definition of "test" and "test case", which is why I suggested removing the "TestCaseRun" event.

Having a hierarchy introduces complexity (for automatic consumers). When it's only a two-level hierarchy (issue & sub-issue, channel & thread, ...), it often introduces frustration in usage or confusion about "what is the right granularity?", like here, so it ends up requiring more conventions (per tenant, team, company), which is not good for interoperability. Maybe in the future we could support TestSuite as a composable structure, and if each level provides an aggregation of its direct children, it should not be a problem.

davidB avatar Jun 16 '25 13:06 davidB

only having testsuite-related events doesn't make sense to me though - I would expect to know something about the parts of a suite if that concept is given.. one option could be to just rename testcaseX events to just testX, i.e. testcaserun -> testrun, etc.. which is a more generic naming that doesn't associate as closely with actual testing tools!?

I agree that having an hierarchy introduces complexity, but I also think that most modern (testing) pipelines are ultimately complex (or evolve to be over time..), so having the possibility to model some kind of relationship between tests and their composition is valuable - we could be more clear on how we envision these concepts being mapped to tests and their orchestration!?

olensmar avatar Jun 17 '25 08:06 olensmar

Let me explain the below with an example

Test case: a collection of checks which produces a pass/fail/skipped result based upon inputs.

These checks could be the individual assert statements in a unit test case (when the suite is the project), or, if the project is the test case, then they would be all the unit tests. For a linting test, the checks would correspond to each file being checked.

In both cases I see the additional info about the test being performed as useful.

thompson-tomo avatar Jun 18 '25 11:06 thompson-tomo