Improving the output of the check reporter
Motivation
One frustration I have with the check reporter is the lack of output, which make it hard to debug issues that are external to testthat. Common examples are messages from sanitizers, valgrind or crashes.
For example a recent run on the R-hub m1-san machine gave me this output:
[...]
> library(testthat)
> library(otelsdk)
>
> test_check("otelsdk")
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX15.4.sdk/usr/include/c++/v1/__chrono/duration.h:92:79: runtime error: signed integer overflow: 9221621549533571493 * 1000 cannot be represented in type '_Ct' (aka 'long long')
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX15.4.sdk/usr/include/c++/v1/__chrono/duration.h:92:79
[ FAIL 0 | WARN 0 | SKIP 30 | PASS 172 ]
[...]
(https://github.com/r-lib/otelsdk/actions/runs/15792938235/job/44521285942#step:6:252)
Goals
- Make it easier to match messages from test cases to tests.
- Keep a summary of the test results at the end, so
R CMD checkprints it. - Informative messages for errors and test failures, as with all reporters.
I admit, that I keep forgetting why the output of the check reporter is so terse, so another important goal is potentially missing here.
Suggestion
I have an implementation for this output in a new reporter in the testthatlabs package. The output from this reporter, including the failure above looks like this:
โฏ otelsdk test suite โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โบ checks 1 ยป is.na ..... ... [89ms]
โบ checks 13 ยป is_string ..... . [13ms]
โบ checks 23 ยป is_named ..... ... [17ms]
SKIP โบ 'test-checks.R:37:3' ยป as_timestamp [On CRAN]
SKIP โบ 'test-checks.R:63:3' ยป as_span [On CRAN]
SKIP โบ 'test-checks.R:80:3' ยป as_span_context [On CRAN]
โบ checks 95 ยป as_span_parent .... [20ms]
...
SKIP โบ 'test-logger-provider-http.R:2:3' ยป logger_provider_http [On CRAN]
SKIP โบ 'test-logger-provider-http.R:27:3' ยป logger_provider_http_options [On CRAN]
โบ logger-provider-stdout 1 ยป logger_provider_stdstream .. [7ms]
โบ logger-provider-stdout 8 ยป log to file ../Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX15.4.sdk/usr/include/c++/v1/__chrono/duration.h:92:79: runtime error: signed integer overflow: 9221617979017900176 * 1000 cannot be represented in type '_Ct' (aka 'long long')
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX15.4.sdk/usr/include/c++/v1/__chrono/duration.h:92:79
. [44ms]
SKIP โบ 'test-logger-provider-stdout.R:39:3' ยป log to file [On CRAN]
SKIP โบ 'test-logger-provider-stdout.R:63:3' ยป logger_provider_stdstream_options [On CRAN]
...
โบ utils 189 ยป plural ... [7ms]
โบ utils 195 ยป find_instrumentation_scope . [4ms]
โบ utils 204 ยป empty_atomic_as_null ..... [9ms]
PASS x370 FAIL x0 WARN x0 SKIP x119 [4.3s]
(https://github.com/r-lib/otelsdk/actions/runs/16677280319/job/47206943447#step:6:446)
Summary of the changes:
- Report every test block.
- Create a dot for every expectation, in groups of five, for easier counting.
- Make special events (skips, warning, test failures) easy to discover from the output. You only need to scan the beginning of the lines to find them.
Screenshot of a test suite with special events, running locally:
I like this idea!
I think the key thing in terms of R CMD check output is that the last 13 lines are special because that's what R CMD check will display. But I think I massively over-indexed on that when creating the reporter. (Also don't forget that it will be run in parallel in many cases, but only with two cpus)
Yeah, I haven't really figured out yet how to have a nice output in parallel. Let me figure that out, and then I can submit a PR. (It does work in parallel, all reporters do, but the output will be buffered until a file is done, which is not great to debug crashes.)
Btw. IDK if it is better to update the check reporter, or add a new reporter.
We already have so many reporter who's purpose has been long forgotten that it's easy enough to add one more.
Ideally it would replace the check reporter at some point, I think. But definitely makes sense to have a new reporter for it first, and then we'll see later.
I'll work out how to interleave the output from parallel workers. Of course it is not very important to implement this in the imminent release, it is not urgent at all.
Parallel testthat is actually great for crashes, because the crash is in the subprocess and we can report it correctly in the main process. It is also OK to buffer the test results and output from the workers, nothing will be lost or reported at the wrong place or time.