rackunit test failures not detected when using nested test cases

It seems that "check" failures inside a nested test suite is not considered a "test" failure: when running the code below, note the (check = 0 1), the failure is reported to the console, but the test passes:

#lang racket
(require rackunit rackunit/text-ui)
(define ts1
  (test-suite
   "ts1 test-suite"
   (test-case "outer test case"
     (test-case "inner test case"
       (check = 1 0)))))
(module+ test
  (run-tests ts1))

When running the code with `raco test sample.rkt", I get:

raco test: (submod "sample.rkt" test)
--------------------
ts1 test-suite > inner test case
FAILURE
name:       check
location:   simple-test.rkt:9:7
params:     '(#<procedure:=> 1 0)
--------------------
1 success(es) 0 failure(s) 0 error(s) 1 test(s) run
0
1 test passed

So the check is failing, but the result shows that the test passed and the "raco test" process returns 0 (success).

Jul 03 '18 02:07 alex-hhh

What should the semantics be here? I agree that this is not great, but it's not clear what we want to happen.

Sep 16 '19 15:09 samth

In the example above a check failed, yet the summary line indicates that there were zero failures and the exit code from the raco test program is 0, indicating success. What I would like to happen instead is:

the summary line should be "0 success(es) 1 failure(s) 0 error(s) 1 test(s) run" and "0 tests passed" -- when there is lots of output from the test, I tend to just look at the summary line to check if all tests passed
the return code from the raco test should be non-zero, so, a continuous integration pipeline will fail if tests fail.

Alternatively, nested test cases should trigger a compilation or runtime error, as it is an easy mistake to write nested test cases, and their behavior is surprising, at least.

Sep 16 '19 22:09 alex-hhh

(Reposted from #136 to keep discussion in one place)

This does seem related to "nested" test cases but there's more to it. Some observations using the example in the description [in #136 ]:

If we replace test-equal? with the equivalent check-equal? form, it reports the fail count correctly.

(define tests
  (test-suite
   "Top"
   (test-case "Test case"
     (check-equal? "A"
                   "B" "A is equal? to B"))))

=> 0 success(es) 1 failure(s) 0 error(s) 1 test(s) run

If we remove the test-case altogether, and replace the test with a simple check-equal? form, it once again reports the count incorrectly:

(define tests
  (test-suite
   "Top"
   (check-equal? "A"
                 "B" "A is equal? to B")))

=> 1 success(es) 0 failure(s) 0 error(s) 1 test(s) run

Personally, I didn't even know that test-equal? existed, which is a shortcut for (test-case ... (check-equal? ... )), and have been using the second form (i.e. check-* within test-suite) in my tests extensively, in addition to using test-case.

Note that even though the check doesn't count as failed in the summary, it does still count as one test. Having n checks results in n test(s) run. I think we'd agree that it should either count as a test and also count for success/failure or it should count for neither, but not count for one and not the other. So some possible ways in which it should work are:

check-* forms that aren't contained in a test-case -- whether they are enclosed in a test-suite or not -- should be counted individually, i.e. as test cases, for success or failure. test-case is an optional wrapper to add additional context. This would avoid the need to rewrite existing top level checks to wrap them in test-case* forms in order to have them count. The other possibility is to disallow checks that aren't enclosed in some form of test-case form. From my vantage point, I'd prefer the former, especially since it preserves backwards compatibility.
Re: nested forms, I actually really like the flexibility of being able to nest forms with abandon. But it seems that rackunit is somewhere between two versions of itself here. On the one hand, if we allow arbitrary nesting, why not just have a single test form like test-case (in addition to the check- forms) which can be nested to arbitrary depth, and the counting can just be all the terminals, i.e. checks. In this solution, test-case means "the list of tests", whose members can either be test-cases or checks (providing the base case). On the other hand, if we have both test-suite and test-case to capture "the list of tests" and "a specific test", respectively, we could say that since test-suites can already be nested, that nesting a test-case isn't necessary -- we could just replace any sequence of (test-case ... (test-case ...)) with (test-suite ... (test-case ...)), with only the terminal being a test-case and all the enclosing ones being test-suites. In this case, rackunit could signal a compile error if a test-case-within-a-test-case is encountered, as the author suggested in #98 , or it could just treat the whole thing as one test, even if arbitrarily nested.

On point 2, I think I'd lean towards coalescing test-case and test-suite to be just one form, maybe aliasing one to the other, so that they can be arbitrarily nested. It might also be easier to define higher-level operations on tests this way, e.g. composing two tests to create a union test, for whatever that may be worth, or extracting the component of tests that failed as a fresh test programmatically. Not sure if this could work, but it might enable a way to programmatically iterate a test-fix-rerun loop. But since the source identifiers are statically bound, I'm not sure if this would be useful unless we could dynamically load those identifiers to re-run an evolving set of tests. Now this is just crazy talk 😄

Aug 23 '21 02:08 countvajhula

For what is worth, I wrote my own test runner package, al2-test-runner, which does not have the problem I reported here, so I am no longer affected by this issue. My package also does not have a problem with the issue reported in #136 either.

If I understand correctly, the rackunit/text-ui cannot be modified, so it might not be able to fix this issue...

In case you want to use al2-test-runner, all you need to do is replace rackunit/text-ui require with al2-test-runner, rest if the tests and test cases don't need to change.

Aug 23 '21 03:08 alex-hhh

Realistically we can't change test-suite to be an alias of test-case because of backwards compatibility.

Honestly my take is that test suites were a mistake and shouldn't have existed. My preference would be to move all of the test suite related functionality into a separate section of the docs with "Legacy API" in the heading.

Sep 03 '21 10:09 jackfirth

Deprecate test-suite.

Sep 03 '21 10:09 mfelleisen

Why is test-suite a mistake? Why should it be deprecated?

FWIW, it's the only facility that allows "setup" and "teardown", which are really useful. If it's to be deprecated, there should be something else to replace it that implements these features.

Sep 03 '21 10:09 sorawee

Would something like this be possible?

check-* evalutes to a test result, as it currently does
test-case comprises either test-cases or checks
test-case evaluates to a value representing a test (it does not run the test)
test-case supports test-suite's #:before and #:after
run-tests accepts a test value defined using test-case

Sep 03 '21 21:09 countvajhula

The problem that I reported in this issue is a problem with run-tests, and not a problem with test-suite or test-case. I know this because I implemented my own version of run-tests which runs the same test suites and does not have this problem. Also rackunit's GUI test runner, will run the same test suite and correctly report the failure:

#lang racket
(require rackunit rackunit/gui)
(define ts1
  (test-suite
   "ts1 test-suite"
   (test-case "outer test case"
     (test-case "inner test case"
       (check = 1 0)))))
(module+ test
  (test/gui ts1))

There may be other problems with rackunit, which I don't fully understand, and perhaps there a are good reasons for removing test-suite, and perhaps removing test-suite will also inadvertently fix this problem.

However, I think that the suggestions made by @countvajhula and @jackfirth will not directly address the issue I reported and perhaps proposals to remove test-suite should be moved to a separate issue...

Sep 03 '21 22:09 alex-hhh

rackunit rackunit copied to clipboard

test failures not detected when using nested test cases

rackunit
rackunit copied to clipboard