tmt Make the result of an empty plan configurable

When there is no test discovered (and executed) in any of the enabled plans which tmt run detects, it reports 3 as the exit code. This is then handled as an error when executed in Testing Farm. Some users would like to make it configurable how the empty plan scenario is handled.

The way proposed during the last discussion was to introduce a new option under the plan. It could look like this:

/plan:
    discover:
        how: fmf
    execute:
        how: tmt
    empty-plan-result: pass / warn / fail / error / skip

Possibly the new empty-plan-result key could be stored under some of the steps, perhaps execute?

/plan:
    discover:
        how: fmf
    execute:
        how: tmt
        empty-plan-result: pass / warn / fail / error / skip

The thing is that actually we don't have any plan result implemented. There are only results.yaml and based on them the appropriate exit code is reported. So should it be rather something like:

empty-plan-exit-code: 0

Another approach could be that the execute step would create a dummy result with the provided value. This would also suggest that the new option should belong under the execute step. Thoughts?

Related issue: https://issues.redhat.com/browse/TFT-2065

Jun 24 '24 16:06 psss

@fhrdina1, @kkaarreell, @lukaszachy, @thrix, @happz, @Athwale, @psklenar, here's summary from our discussion. Please have a look.

Jun 24 '24 16:06 psss

The thing is that actually we don't have any plan result implemented. There are only results.yaml and based on them the appropriate exit code is reported.

That is slightly inaccurate, I'd say: results.yaml is created per plan, so in this sense, we do have "plan result" - we don't have "plan exit code" because the eventual exit code is decided based on results collected from all plans of a run. And the "plan result" aka "plan exit code" derived from per-plan results never materializes.

This is mostly fine in TF, where tmt run is created for each discovered non-empty plan, and then "all results" are suddenly "plan results". So here a flag would work seamlessly. But then runs with multiple plans arrive on the scene. How about turning the piece of code that decided the exit code (tmt.base.Run.finish()) into something like this:

        results = [
            result
            for plan in self.plans
            for result in plan.execute.results()]
        ...
        raise SystemExit(tmt.result.results_to_exit_code(results))

# could become:
        exit_codes = [
            tmt.result.results_to_exit_code(plan.execute.results())
            for plan in self.plans]
        ...
        raise SystemExit(max(exit_codes))

And then we can change the loop and start involving the plan's flag to override that plan's evaluation of its results. This would materialize "plan result", in the sense of "an exit code a plan would produce if there were no other plans that might make the result even worse".

So should it be rather something like:
empty-plan-exit-code: 0

All in all, both approaches, a flag to turn "empty" into "success" or a flag to turn "empty" into a given exit code, seem fine to me.

What happens when there are two plans in a run, one empty and one with failing tests - the usual "the worst wins" approach?

Jun 24 '24 17:06 happz

I can also see this fitting in the discover definition. I guess there would be some ambiguity when defining multiple deiscover steps.

For reference, ctest uses language like ctest --no-tests={ignore,error}

Jun 24 '24 22:06 LecrisUT

That is slightly inaccurate, I'd say: results.yaml is created per plan, so in this sense, we do have "plan result" - we don't have "plan exit code" because the eventual exit code is decided based on results collected from all plans of a run. And the "plan result" aka "plan exit code" derived from per-plan results never materializes.

I meant that currently we even do not define any final result value like pass, fail, warn for the plan, results.yaml is just a pile of individual test results.

And then we can change the loop and start involving the plan's flag to override that plan's evaluation of its results. This would materialize "plan result", in the sense of "an exit code a plan would produce if there were no other plans that might make the result even worse".

Yes, "counting" the final exit code for each plan makes sense. On the other hand, I'm thinking, whether it would not be more natural for user to state the expected empty plan result as pass, fail... instead of specifying the exit code which usually is not that interesting from the user point of view.

What happens when there are two plans in a run, one empty and one with failing tests - the usual "the worst wins" approach?

If we provide overal result for each plan then we can do this worst-wins math, but for exit codes it's hard to say what is actually "worst". For example, how do we compare what's better, 3 (no test results) or 4 (all tests skipped)?

I can also see this fitting in the discover definition. I guess there would be some ambiguity when defining multiple deiscover steps.

Interesting thought, storing the empty-plan-result flag in the discover phase could also make sense. But do we really need that level of granularity? Any tangible/specific use case on your mind?

Jun 26 '24 08:06 psss

I can also see this fitting in the discover definition. I guess there would be some ambiguity when defining multiple deiscover steps.

Interesting thought, storing the empty-plan-result flag in the discover phase could also make sense. But do we really need that level of granularity? Any tangible/specific use case on your mind?

I can see it go either way:

it could help early exit if the discover phase returns empty so that prepare is not executed

could be a more natural flow when you consider plugins for discover, e.g.:

/plan:
  discover:
    - how: cmake
      no_tests: pass
    - how: fmf
      path: path/to/additional_static_tests
      no_tests: fail

but then if there are multiple discover steps, then each item must define their parameter. https://github.com/teemtee/fmf/issues/228 might be useful for this though. Although, why not have it in both, where execute would treat all discover phase as a whole.

Jun 26 '24 09:06 LecrisUT

On the other hand, I'm thinking, whether it would not be more natural for user to state the expected empty plan result as pass, fail... instead of specifying the exit code which usually is not that interesting from the user point of view.

+1. It would follow the approach used by result in test specification, using words instead of numbers.

What happens when there are two plans in a run, one empty and one with failing tests - the usual "the worst wins" approach?

If we provide overal result for each plan then we can do this worst-wins math, but for exit codes it's hard to say what is actually "worst". For example, how do we compare what's better, 3 (no test results) or 4 (all tests skipped)?

Yep, that's the fun part :) We have 1 exit code and N plans, so eventually, we will need to reduce them somehow into a single number. Maybe both 3 and 4 have the same level of "badness", who knows...

Jun 26 '24 09:06 happz

This issue has been escalated as part of TFT-2065. Not blocking the gating efforts as for them missing test coverage should not be ignored but still useful for some use cases like tier testing. Proposing as a should-have for 1.36.

Aug 09 '24 11:08 psss

moved to 1.43 as requested by @psss , enotime for 1.42 unfortunately

Jan 28 '25 14:01 thrix