Add new SKIP macro for skipping tests at runtime
This adds a new `SKIP` macro for dynamically skipping tests at runtime. The "skipped" status of a test case is treated as a first-class citizen, like "succeeded" or "failed", and is reported with a new color on the console.
This is useful when the conditions for skipping a test are only known at runtime. For example:
TEST_CASE( "some feature does something on GPUs" ) {
if ( get_gpu_count() == 0 ) {
SKIP( "no GPUs are available" );
}
CHECK( do_something_on_gpu() );
}
Like other non-assertion-type macros (such as `INFO` and `WARN`), skipping is still implemented as an "assertion" internally. I've added a new result type `ExplicitSkip` (which is currently not treated as a failure) as well as a new `skipped` counter in `Catch::Totals`. While this initial implementation was surprisingly simple, I've marked the PR as a draft, as there are still several outstanding design questions.
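For reference, the shape of the change is roughly as follows (a simplified sketch, not the actual diff):

```cpp
#include <cstdint>

namespace Catch {
    // Simplified sketch: a dedicated result type for explicit skips,
    // next to the existing success/failure result types.
    struct ResultWas {
        enum OfType {
            Ok = 0,
            ExplicitSkip, // new: recorded by SKIP(), not treated as a failure
            // ... existing info/warning/failure types elided ...
        };
    };

    // The counts gain a `skipped` counter next to the existing ones;
    // Totals holds one Counts for assertions and one for test cases.
    struct Counts {
        std::uint64_t passed = 0;
        std::uint64_t failed = 0;
        std::uint64_t failedButOk = 0;
        std::uint64_t skipped = 0; // new counter introduced by this PR
    };

    struct Totals {
        Counts assertions;
        Counts testCases;
    };
}
```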
One such open question: the result type is called an *explicit* skip because there already exists a notion of a "skipped" test in the current codebase. In fact, there are at least 3 conditions under which tests could be considered skipped:

- Tests that are hidden (`[.]`)
- Tests that are not matched by the provided filter
- Remaining tests that are not executed after the `--abortx` limit has been reached

The last one is currently explicitly referred to as "skipped" within the `IStreamingReporter::skipTest` hook. As far as I can tell, only the Automake reporter actually handles this one, though. I'm wondering whether there is an opportunity here for unifying (some of) these concepts, both in how they are handled internally and in the way they are reported.
Open TODOs:
- [ ] Determine whether/how the notion of "skipping" should be unified within the codebase and in reporting (see above)
- [x] How should skipping some sections / generators affect the overall result of a test case?
- [x] Does having a skipped test mean a non-zero exit code?
- [ ] Add a hook for reporters to handle explicitly skipped tests (figure out what to do with the existing `IStreamingReporter::skipTest`)
- [ ] Properly handle skipped tests in the various built-in reporters
- [ ] Consider adding a CLI option for whether skipped tests should be reported or not (similar to `-s`)
- [x] Should "skipped assertions" even be reported? The fact that these even exist is a result of how test results are communicated throughout the system (i.e., a "skipped assertion" only leads to a skipped test in `Catch::Totals::delta`), but the terminology feels a bit weird.
- [ ] Add documentation
Resolves #395.
Codecov Report
Merging #2360 (6309b3c) into devel (52066db) will increase coverage by 0.01%. The diff coverage is 90.22%.
```
@@            Coverage Diff             @@
##            devel    #2360      +/-   ##
==========================================
+ Coverage   91.13%   91.15%    +0.01%
==========================================
  Files         187      187
  Lines        7634     7691       +57
==========================================
+ Hits         6957     7010       +53
- Misses        677      681        +4
```
This is going to take a while to review; I don't think I will really get to this before next weekend.
I think that skipped tests should not count as failures, nor should they count as successes. Skipping a test case from within the `TEST_CASE` should behave similarly to not running the test at all.

This has some consequences, such as: if a test run only invokes skipped tests, then it should fail (because Catch2 fails a test run that runs no test cases), but as long as there is at least one test case that finished, the return code should be 0.

However, it might be useful to add a warning (`-w`) option that would cause the test run to fail if any tests were skipped, essentially as an "audit" option.
From this also follows what should happen when a section/generator input is skipped: it should look like that section/generator input does not exist, at least as much as possible. If the test case looks like this:
```cpp
TEST_CASE() {
    CHECK(1 == 2);
    SKIP();
    REQUIRE(1 == 2);
}
```
then that test case should be counted as failed.
Further examples:
```cpp
TEST_CASE() {
    SECTION("A") { std::cout << "A\n"; }
    SECTION("B") { SKIP(); }
    SECTION("C") { std::cout << "C\n"; }
}
```
Should print out "A\n" and "C\n", and similarly
```cpp
TEST_CASE() {
    SECTION("A") { std::cout << "A"; }
    SECTION("B") {
        SECTION("B1") { std::cout << "B1"; }
        SECTION("B2") { SKIP(); }
    }
    std::cout << "\n";
}
```
should print out "A\n" and "B1\n", and nothing more. (Generally, exits from the last section in a test case are tricky, due to how sections are discovered and tracked.)
As to the other parts, "skipped assertions" should definitely not be reported as assertions. However, their test cases should be reported as skipped. I don't believe we need a flag to show them, as they should be shown by default in most reporters. I am willing to change my mind on this though.
Also I have no idea what to do about the existing `skipTest` uses -- I am not sure anything actually uses it for anything useful (I am not entirely sold on `--abort` being useful either).
If the section tracking proves troublesome, I am willing to say that test skips can only happen at the top of a `TEST_CASE`, but I would rather not, or at least allow
```cpp
TEST_CASE() {
    auto i = GENERATE( ... );
    if (i % 2 == 1) { SKIP(); }
}
```
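Fleshed out into a self-contained test (the name and values are made up for illustration, and this assumes the `SKIP` macro from this PR), that sketch could look like:

```cpp
#include <catch2/catch_test_macros.hpp>
#include <catch2/generators/catch_generators.hpp>

// Even generator values run the assertion; odd values are skipped at
// runtime, so only the inputs 2 and 4 should actually be checked.
TEST_CASE("only even inputs are checked", "[skipping]") {
    auto i = GENERATE(1, 2, 3, 4);
    if (i % 2 == 1) {
        SKIP("odd input");
    }
    CHECK(i % 2 == 0);
}
```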
Thank you, I will need some time to investigate how to best address all your points.
> Also I have no idea what to do about the existing `skipTest` uses -- I am not sure anything actually uses it for anything useful (I am not entirely sold on `--abort` being useful either).
Would you in principle be open to repurposing that hook for the `SKIP` macro in a breaking change?
> Would you in principle be open to repurposing that hook for the `SKIP` macro in a breaking change?
Yes, but it would also likely need to be renamed and further changed, as the `SKIP` granularity is at section level, not test case level.
Sorry for the delay. After looking back into this now, it seems like not too many things need changing after all.
I've rebased onto current `devel` and removed the printing of "skipped assertions" from the console reporter.
Skipping Behavior
I think the initial version of this PR already mostly did what you expected in your examples. I've added a few more test cases to better cover these scenarios.
> and similarly
>
> ```cpp
> TEST_CASE() {
>     SECTION("A") { std::cout << "A"; }
>     SECTION("B") {
>         SECTION("B1") { std::cout << "B1"; }
>         SECTION("B2") { SKIP(); }
>     }
>     std::cout << "\n";
> }
> ```
>
> should print out "A\n" and "B1\n", and nothing more. (Generally, exits from the last section in a test case are tricky, due to how sections are discovered and tracked.)
Not sure I'm following. If there were a `FAIL` instead of the `SKIP` in the last section, it would print `A\n`, `B1\n`, `\n`, right? Because as you say, exiting the last section is tricky. Why would the behavior for `SKIP` have to be different, then? Without having looked into it in detail, it seems like treating this differently would require a lot of work for arguably little gain, IMO.
> If the section tracking proves troublesome,
Again I'm not sure I understand why I would need to do section tracking manually; by treating `SKIP` similarly to `REQUIRE` or `FAIL` (i.e., `ResultDisposition::Normal`), I don't need any custom section tracking and basically get the behavior you show in your examples for free.
Exit Code
I've adjusted things to return a non-zero exit code (4) when tests were run but all of them ended up being skipped. Do we want something analogous to `--allow-running-no-tests` here? E.g., `--allow-skipping-all-tests`.

I've also added a new test case for this behavior; however, I'm not sure whether "ExtraTests" is the right location for it. Also, I have no idea what the whole `X#-<name>.cpp` naming scheme is about. I basically just did whatever the `EmptySpecWithNoTestsFails` test was doing.
Hooks
> Also I have no idea what to do about the existing `skipTest` uses -- I am not sure anything actually uses it for anything useful (I am not entirely sold on `--abort` being useful either).
Turns out I don't actually have a use case for a hook when `SKIP`ing a test, so I'd rather wait for one to come up before I unnecessarily pollute the API. At the same time we can't keep the existing `skipTest` hook, as it would be extremely confusing. So my current thinking would be to just remove it. Alternatively we could rename it to `abortTest` or something.
I'd be happy to see this feature in Catch2 at some point in time. I have a similar use case, namely skipping some test cases at runtime if certain resources (in my case some attached sensors) are not available.
I am finally back around for long enough to go through the larger PRs.
I rebased the PR and will pick it up during this week. Sorry for the long wait.
Thanks for looking into this again, let me know if you need anything!
@psalz I read through it again today, I think the changes are fine, but the other reporters need better support for skipping.
I tried a few, and the output is not very useful, e.g.
```
$ ./tests/SelfTest "tests can be skipped dynamically at runtime" -r tap
# rng-seed: 3838586559
# tests can be skipped dynamically at runtime
** internal error ** 1 -
1..1
```

`internal error` as the skip message is no good.
```
$ ./tests/SelfTest "tests can be skipped dynamically at runtime" -r compact
Filters: tests can be skipped dynamically at runtime
RNG seed: 2657692748
Passed all 0 test cases with 0 assertions.
```

The "0 passed", all-green output is confusing, especially when combined with the return code being 4 (because all selected tests were skipped).
```
$ ./tests/SelfTest "tests can be skipped dynamically at runtime" -r xml
<?xml version="1.0" encoding="UTF-8"?>
<Catch2TestRun name="SelfTest" rng-seed="371487568" catch2-version="3.1.0" filters="tests can be skipped dynamically at runtime">
  <TestCase name="tests can be skipped dynamically at runtime" tags="[skipping]" filename="/mnt/c/ubuntu/Catch2-psalz/tests/SelfTest/UsageTests/Skip.tests.cpp" line="13">
    <OverallResult success="true"/>
  </TestCase>
  <OverallResults successes="0" failures="0" expectedFailures="0"/>
  <OverallResultsCases successes="0" failures="0" expectedFailures="0"/>
</Catch2TestRun>
```

The XML reporter should provide all available information, because it implements a Catch2-specific output format meant for machine parsing of the output.
```
$ ./tests/SelfTest "tests can be skipped dynamically at runtime" -r Automake
:test-result: XFAIL tests can be skipped dynamically at runtime
```
I am not sure if this is the best we can do, see below.
In total, there are 8 built-in reporters. 3 of them are Catch2-specific:

- [x] console - I think this one is fine as-is, except for better colour choices. I will work on that later, because I also need to change how some of the colours are defined.
- [ ] compact - We can definitely do better here. The compact reporter output has another related issue (#878) that could be tackled at the same time (but you don't have to).
- [ ] XML - The overall XML structure cannot change, but we can extend it trivially. The goal of this reporter is to make the test results machine-parseable, so they need to mention skipping as well.

And 5 implement specific formats we do not control. ~~Those might be an issue, because the output format might not support the concept of skipping tests.~~ Nope, they all support skipped tests.

- [ ] Automake - `SKIP` is one of the acceptable test case outcomes
- [ ] JUnit - some googling suggests that a test case can have `skipped` as the result, but I could not find a proper authoritative source
- [ ] SonarQube - a test case can have `skipped` as the result
- [ ] TAP - the `# skip` directive works for this (see the example after this list)
- [ ] TeamCity - the `testIgnored` message seems to match up with the intent
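For reference, a TAP skip per the spec is an `ok` test line followed by a `SKIP` directive, roughly like this (illustrative line, not actual Catch2 output):

```
ok 1 - tests can be skipped dynamically at runtime # SKIP no GPUs are available
```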
I also went to the original post (this is a very high quality PR btw) and:

> Consider adding a CLI option for whether skipped tests should be reported or not (similar to `-s`)
I don't think this is useful. While skips are not counted as errors, I think we should report them by default, as they are an indicator of a potential issue. E.g. if the dynamic check for whether the test should be skipped is wrong, the user might end up with a test that never actually runs.
OTOH, there might be some use for an option that turns skipped tests into failures, instead of passes, for similar reasons.
I started looking at the reporter formats.
TeamCity - I think the Ignored tests message matches up with the intent of `SKIP`, see https://www.jetbrains.com/help/teamcity/service-messages.html#Nested+Test+Reporting, Ctrl+F to "Ignored tests".
Went through all the 3rd party format reporters, and I am pleasantly surprised that all of them support the concept of reporting skipped tests. See my previous post with the list of reporters for notes.
Re the flag to turn `SKIP`s into failures: I thought about it, and I think a new warning flag is the simplest (and cleanest) solution.
Rebased onto #2554. Apologies for being a bit slow on this right now, I will try to find some time to look into the other reporters.
Don't worry, you will have to be waaaay slower before the PR has spent more time waiting on you than it has waiting on me.
Do you have an idea about when you are going to have time to finish this? I am planning next release and wondering whether I should wait for this or not.
Tough to say, I have a pretty busy month ahead unfortunately. Maybe I can find some time next weekend. I briefly looked into it yesterday and ran into a couple of issues where I need clarification:
- As mentioned above, the Automake reporter currently uses the `SKIP` result to report test cases that were not run after reaching the `--abortx` limit. Other reporters don't use it, but there are some comments talking about how this should be implemented. I propose to instead, in a breaking change, get rid of the `IEventListener::skipTest` hook. If a test run aborts prematurely, any remaining tests are simply not included in the report.
- The JUnit, SonarQube and TeamCity reporters currently all use their "skipped" result type to report `!mayfail`s. Since there is presumably no better option to express this, we'll have to use it for both `!mayfail` and explicit skips going forward.
- While it doesn't seem to be too difficult to implement skipped tests in most of these reporters, how do I know whether the produced files are still valid? Is there some kind of infrastructure in place to validate that the format is actually correct? Or do I have to set up SonarQube and TeamCity installs etc.? (I'd rather not.) I find it pretty ironic, by the way, that none of these platforms with their XML results seem to make use of the only redeeming quality of this awful format, namely that it allows providing a schema for validation.
No problem, I'll pull in things I want to have in the next release and plan for this to be in a later one.
- Sure, `skipTest` can be marked deprecated, and eventually removed. The removal will take a while though, as I don't plan to have another breaking release soon.
- Yes. Honestly, `[!mayfail]` is a weird attribute, because the test code runs, but the result is a failure only in the case of a hard crash. This then isn't semantically supported by many formats.
- My approach to testing these is that if I can't find a validator with a reasonable amount of effort, I just compare the output with the spec (if I can even find it, JUnit is terrible about this), and if it looks right, merge it and wait to see if someone complains. 🤷
Okay, back with some time to work on this! I've rebased onto current `devel` and added support for skipped tests into all reporters. Here are some notes on each:
- **Compact** The compact reporter now also prints a line for each call to `SKIP`.

- **XML** I've added a new boolean attribute `skipped` to each section's `<OverallResults>` as well as a `skips` count to each test case's `<OverallResult>`. Note that a section that fails e.g. a `CHECK` and then `SKIP`s is counted as both failed and skipped. The top-level overall results now also report a `skips` count. Other than that, I've introduced a new `<Skip>` node, analogous to `Failure` etc.

- **Automake** Automake has support for `SKIP` results, so adding that was simple.

- **JUnit** I found two different schemas for the JUnit format online. According to this one, there's only one `<skipped>`/`<failure>`/`<error>` allowed per `<testcase>`. However, since the current reporter produces multiple of them, I assume it instead follows this schema, which allows any number of `<failure>` and `<error>`. Unfortunately, that one only allows at most one `<skipped>`, which is a problem if a test case is both skipped and marked as `!mayfail`. I've decided to not handle this explicitly, instead producing two `<skipped>` nodes; let's see if someone complains. The same is true for test cases that skip multiple times for different generator values, see for example the test case `"dynamic skipping works with generators"`. The test case `"failing in some unskipped sections causes entire test case to fail"` behaves differently for the JUnit reporter, since individual sections are reported as separate test cases, so in this case we'll have one skipped and one failing "test". Btw, there seems to be an inconsistency here, because tests with generators still cause the "tests" count on `<testsuite>` to be incremented by one for each value, but only a single `<testcase>` is written.

- **SonarQube** As mentioned before, SonarQube doesn't provide a proper schema for their format -- only an example -- and from what I can tell, the existing reporter is not conformant to that. No idea if there can be multiple `<skipped>` nodes per `<testCase>`, but analogous to JUnit, that's what we have now.

- **TAP** TAP simply prints a list of all assertions, so for `SKIP`s I'm printing an `ok` result followed by the `# SKIP` directive.

- **TeamCity** I'm using the `testIgnored` message, as suggested. I'm not sure, however, what it means when a test has both a `testFailed` and a `testIgnored` message, i.e., whether the test will be considered failed or ignored in that case. I'm currently outputting both when a test case has e.g. a failing `CHECK` followed by a `SKIP` (for example in `"failed assertions before SKIP cause test case to fail"`).
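As an aside, downstream code can observe these skips through the regular listener interface as well; here's a minimal sketch (the listener name is made up) that logs each explicit skip by checking for the new `ExplicitSkip` result type from this PR:

```cpp
#include <catch2/reporters/catch_reporter_event_listener.hpp>
#include <catch2/reporters/catch_reporter_registrars.hpp>
#include <iostream>

// Hypothetical listener that prints the message of every explicit skip.
class SkipLogger : public Catch::EventListenerBase {
public:
    using Catch::EventListenerBase::EventListenerBase;

    void assertionEnded(Catch::AssertionStats const& stats) override {
        // SKIP() is still reported as an "assertion" internally, so we
        // can recognize it by its result type here.
        if (stats.assertionResult.getResultType() ==
            Catch::ResultWas::ExplicitSkip) {
            std::cout << "skipped: "
                      << stats.assertionResult.getMessage() << '\n';
        }
    }
};

CATCH_REGISTER_LISTENER(SkipLogger)
```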
Nice, just yesterday I was thinking that when I find some time, I should finish this PR.
Some preliminary notes
- For TeamCity, I don't see any of the `testIgnored` messages in the approval tests output. I have no idea what happens when a test both fails and is skipped either, but I think this can be solved later.
- JUnit: There is already an open issue about multiple fails being reported if there are multiple failing `CHECK`s in a test case. My answer for that was "I might fix it, but use `REQUIRE`s in the meantime if you need it", and I think this is the correct answer here as well.
- ~~XML reporter: the skipped attribute does not get written as a bool, due to `ScopedElement` handling this badly. I'll fix it in the main branch and you can pick/rebase the commit here.~~ Nevermind on this one, I misread the approval tests.
> For TeamCity, I don't see any of the `testIgnored` messages in the approval tests output. I have no idea what happens when a test both fails and is skipped either, but I think this can be solved later.
Right, it seems that's due to the approval test script being a bit too overzealous in stripping out file paths, though. You'll notice that the existing TeamCity baselines already don't include any `testIgnored` lines, or `testFailed`, for that matter.
Hmmm, you are right. Should be fixed on devel now.
I looked at the SonarQube reporter. Do you think the current reporter does not obey the example outside the extended result tag?
It was originally contributed by someone working with SonarQube, so I assume that works 🤷
Otherwise I think this is done after a rebase for the new approvals. (nevermind, docs need to be done first)
I've rebased onto current `devel` and added some documentation. I've also deprecated `IEventListener::skipTest` by adding a comment to the function, as well as adding it to `docs/deprecations.md`. Should it also be deprecated in code, e.g. through a `@deprecated` comment, or even a `[[deprecated]]` attribute?
I've also reformatted `Skip.tests.cpp` to be in accordance with the new `clang-format` settings. I did not update the other code changes (e.g. to reporters), as the corresponding files all still seem to better match the old style.
> I looked at the SonarQube reporter. Do you think the current reporter does not obey the example outside the extended result tag?
No, I think then it's fine; I was referring to the extended result tag.
I think this is done. I wrote some follow-ups into my TODO list, e.g.:
> Should it also be deprecated in code, e.g. through a `@deprecated` comment, or even a `[[deprecated]]` attribute?
Yes, but it needs to be configurable, so the deprecation can be disabled via preprocessor, letting the users deal with this when they can. I'd prefer that to be done later, especially since there is more stuff in Catch2 that will get this treatment.
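A rough sketch of how such a configurable deprecation could look (the opt-out macro name is hypothetical, for illustration only):

```cpp
// Hypothetical opt-out: defining CATCH_CONFIG_NO_DEPRECATIONS would silence
// the attribute until users can migrate. Both macro names are made up here.
#if defined(CATCH_CONFIG_NO_DEPRECATIONS)
#    define CATCH_DEPRECATED(msg)
#else
#    define CATCH_DEPRECATED(msg) [[deprecated(msg)]]
#endif

namespace Catch {
    struct TestCaseInfo; // forward declaration for this sketch

    class IEventListener {
    public:
        virtual ~IEventListener() = default;
        // ...
        CATCH_DEPRECATED("skipTest is unused and will eventually be removed")
        virtual void skipTest(TestCaseInfo const& testInfo) = 0;
    };
}
```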
Do you want to clean up the commits, or are you fine with me squashing the PR?
Fine with me, squash away! :)