rackunit
rackunit copied to clipboard
test failures not detected when using nested test cases
It seems that "check" failures inside a nested test suite is not considered a "test" failure: when running the code below, note the (check = 0 1)
, the failure is reported to the console, but the test passes:
#lang racket
(require rackunit rackunit/text-ui)
(define ts1
(test-suite
"ts1 test-suite"
(test-case "outer test case"
(test-case "inner test case"
(check = 1 0)))))
(module+ test
(run-tests ts1))
When running the code with `raco test sample.rkt", I get:
raco test: (submod "sample.rkt" test)
--------------------
ts1 test-suite > inner test case
FAILURE
name: check
location: simple-test.rkt:9:7
params: '(#<procedure:=> 1 0)
--------------------
1 success(es) 0 failure(s) 0 error(s) 1 test(s) run
0
1 test passed
So the check is failing, but the result shows that the test passed and the "raco test" process returns 0 (success).
What should the semantics be here? I agree that this is not great, but it's not clear what we want to happen.
In the example above a check failed, yet the summary line indicates that there were zero failures and the exit code from the raco test program is 0, indicating success. What I would like to happen instead is:
- the summary line should be "0 success(es) 1 failure(s) 0 error(s) 1 test(s) run" and "0 tests passed" -- when there is lots of output from the test, I tend to just look at the summary line to check if all tests passed
- the return code from the raco test should be non-zero, so, a continuous integration pipeline will fail if tests fail.
Alternatively, nested test cases should trigger a compilation or runtime error, as it is an easy mistake to write nested test cases, and their behavior is surprising, at least.
(Reposted from #136 to keep discussion in one place)
This does seem related to "nested" test cases but there's more to it. Some observations using the example in the description [in #136 ]:
- If we replace
test-equal?
with the equivalentcheck-equal?
form, it reports the fail count correctly.
(define tests
(test-suite
"Top"
(test-case "Test case"
(check-equal? "A"
"B" "A is equal? to B"))))
=> 0 success(es) 1 failure(s) 0 error(s) 1 test(s) run
- If we remove the
test-case
altogether, and replace the test with a simplecheck-equal?
form, it once again reports the count incorrectly:
(define tests
(test-suite
"Top"
(check-equal? "A"
"B" "A is equal? to B")))
=> 1 success(es) 0 failure(s) 0 error(s) 1 test(s) run
Personally, I didn't even know that test-equal?
existed, which is a shortcut for (test-case ... (check-equal? ... ))
, and have been using the second form (i.e. check-*
within test-suite
) in my tests extensively, in addition to using test-case
.
Note that even though the check doesn't count as failed in the summary, it does still count as one test. Having n
checks results in n test(s) run
. I think we'd agree that it should either count as a test and also count for success/failure or it should count for neither, but not count for one and not the other. So some possible ways in which it should work are:
-
check-*
forms that aren't contained in atest-case
-- whether they are enclosed in atest-suite
or not -- should be counted individually, i.e. as test cases, for success or failure.test-case
is an optional wrapper to add additional context. This would avoid the need to rewrite existing top level checks to wrap them intest-case*
forms in order to have them count. The other possibility is to disallow checks that aren't enclosed in some form oftest-case
form. From my vantage point, I'd prefer the former, especially since it preserves backwards compatibility. -
Re: nested forms, I actually really like the flexibility of being able to nest forms with abandon. But it seems that rackunit is somewhere between two versions of itself here. On the one hand, if we allow arbitrary nesting, why not just have a single test form like
test-case
(in addition to thecheck-
forms) which can be nested to arbitrary depth, and the counting can just be all the terminals, i.e. checks. In this solution,test-case
means "the list of tests", whose members can either be test-cases or checks (providing the base case). On the other hand, if we have bothtest-suite
andtest-case
to capture "the list of tests" and "a specific test", respectively, we could say that sincetest-suite
s can already be nested, that nesting atest-case
isn't necessary -- we could just replace any sequence of (test-case ... (test-case ...)) with (test-suite ... (test-case ...)), with only the terminal being atest-case
and all the enclosing ones beingtest-suite
s. In this case, rackunit could signal a compile error if a test-case-within-a-test-case is encountered, as the author suggested in #98 , or it could just treat the whole thing as one test, even if arbitrarily nested.
On point 2, I think I'd lean towards coalescing test-case
and test-suite
to be just one form, maybe aliasing one to the other, so that they can be arbitrarily nested. It might also be easier to define higher-level operations on tests this way, e.g. composing two tests to create a union test, for whatever that may be worth, or extracting the component of tests that failed as a fresh test programmatically. Not sure if this could work, but it might enable a way to programmatically iterate a test-fix-rerun loop. But since the source identifiers are statically bound, I'm not sure if this would be useful unless we could dynamically load those identifiers to re-run an evolving set of tests. Now this is just crazy talk 😄
For what is worth, I wrote my own test runner package, al2-test-runner, which does not have the problem I reported here, so I am no longer affected by this issue. My package also does not have a problem with the issue reported in #136 either.
If I understand correctly, the rackunit/text-ui
cannot be modified, so it might not be able to fix this issue...
In case you want to use al2-test-runner
, all you need to do is replace rackunit/text-ui
require with al2-test-runner
, rest if the tests and test cases don't need to change.
Realistically we can't change test-suite
to be an alias of test-case
because of backwards compatibility.
Honestly my take is that test suites were a mistake and shouldn't have existed. My preference would be to move all of the test suite related functionality into a separate section of the docs with "Legacy API" in the heading.
Deprecate test-suite
.
Why is test-suite
a mistake? Why should it be deprecated?
FWIW, it's the only facility that allows "setup" and "teardown", which are really useful. If it's to be deprecated, there should be something else to replace it that implements these features.
Would something like this be possible?
- check-* evalutes to a test result, as it currently does
- test-case comprises either test-cases or checks
- test-case evaluates to a value representing a test (it does not run the test)
- test-case supports test-suite's #:before and #:after
- run-tests accepts a test value defined using test-case
The problem that I reported in this issue is a problem with run-tests
, and not a problem with test-suite
or test-case
. I know this because I implemented my own version of run-tests
which runs the same test suites and does not have this problem. Also rackunit
's GUI test runner, will run the same test suite and correctly report the failure:
#lang racket
(require rackunit rackunit/gui)
(define ts1
(test-suite
"ts1 test-suite"
(test-case "outer test case"
(test-case "inner test case"
(check = 1 0)))))
(module+ test
(test/gui ts1))
There may be other problems with rackunit, which I don't fully understand, and perhaps there a are good reasons for removing test-suite
, and perhaps removing test-suite
will also inadvertently fix this problem.
However, I think that the suggestions made by @countvajhula and @jackfirth will not directly address the issue I reported and perhaps proposals to remove test-suite
should be moved to a separate issue...