testthat icon indicating copy to clipboard operation
testthat copied to clipboard

Improving the output of the check reporter

Open gaborcsardi opened this issue 4 months ago โ€ข 4 comments

Motivation

One frustration I have with the check reporter is the lack of output, which make it hard to debug issues that are external to testthat. Common examples are messages from sanitizers, valgrind or crashes.

For example a recent run on the R-hub m1-san machine gave me this output:

[...]
  > library(testthat)
  > library(otelsdk)
  > 
  > test_check("otelsdk")
  /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX15.4.sdk/usr/include/c++/v1/__chrono/duration.h:92:79: runtime error: signed integer overflow: 9221621549533571493 * 1000 cannot be represented in type '_Ct' (aka 'long long')
  SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX15.4.sdk/usr/include/c++/v1/__chrono/duration.h:92:79 
  [ FAIL 0 | WARN 0 | SKIP 30 | PASS 172 ]
[...]

(https://github.com/r-lib/otelsdk/actions/runs/15792938235/job/44521285942#step:6:252)

Goals

  • Make it easier to match messages from test cases to tests.
  • Keep a summary of the test results at the end, so R CMD check prints it.
  • Informative messages for errors and test failures, as with all reporters.

I admit, that I keep forgetting why the output of the check reporter is so terse, so another important goal is potentially missing here.

Suggestion

I have an implementation for this output in a new reporter in the testthatlabs package. The output from this reporter, including the failure above looks like this:

  โฏ otelsdk test suite โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
  
       โ€บ checks 1 ยป is.na ..... ... [89ms]
       โ€บ checks 13 ยป is_string ..... . [13ms]
       โ€บ checks 23 ยป is_named ..... ... [17ms]
  
  SKIP โ€บ 'test-checks.R:37:3' ยป as_timestamp [On CRAN]
  SKIP โ€บ 'test-checks.R:63:3' ยป as_span [On CRAN]
  SKIP โ€บ 'test-checks.R:80:3' ยป as_span_context [On CRAN]
  
       โ€บ checks 95 ยป as_span_parent .... [20ms]

...

  SKIP โ€บ 'test-logger-provider-http.R:2:3' ยป logger_provider_http [On CRAN]
  SKIP โ€บ 'test-logger-provider-http.R:27:3' ยป logger_provider_http_options [On CRAN]
  
  
       โ€บ logger-provider-stdout 1 ยป logger_provider_stdstream .. [7ms]
       โ€บ logger-provider-stdout 8 ยป log to file ../Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX15.4.sdk/usr/include/c++/v1/__chrono/duration.h:92:79: runtime error: signed integer overflow: 9221617979017900176 * 1000 cannot be represented in type '_Ct' (aka 'long long')
  SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX15.4.sdk/usr/include/c++/v1/__chrono/duration.h:92:79 
  . [44ms]
  
  SKIP โ€บ 'test-logger-provider-stdout.R:39:3' ยป log to file [On CRAN]
  SKIP โ€บ 'test-logger-provider-stdout.R:63:3' ยป logger_provider_stdstream_options [On CRAN]

...

       โ€บ utils 189 ยป plural ... [7ms]
       โ€บ utils 195 ยป find_instrumentation_scope . [4ms]
       โ€บ utils 204 ยป empty_atomic_as_null ..... [9ms]
  
  PASS x370  FAIL x0  WARN x0  SKIP x119  [4.3s]                                  

(https://github.com/r-lib/otelsdk/actions/runs/16677280319/job/47206943447#step:6:446)

Summary of the changes:

  • Report every test block.
  • Create a dot for every expectation, in groups of five, for easier counting.
  • Make special events (skips, warning, test failures) easy to discover from the output. You only need to scan the beginning of the lines to find them.

Screenshot of a test suite with special events, running locally:

Image

gaborcsardi avatar Aug 07 '25 07:08 gaborcsardi

I like this idea!

I think the key thing in terms of R CMD check output is that the last 13 lines are special because that's what R CMD check will display. But I think I massively over-indexed on that when creating the reporter. (Also don't forget that it will be run in parallel in many cases, but only with two cpus)

hadley avatar Aug 07 '25 12:08 hadley

Yeah, I haven't really figured out yet how to have a nice output in parallel. Let me figure that out, and then I can submit a PR. (It does work in parallel, all reporters do, but the output will be buffered until a file is done, which is not great to debug crashes.)

Btw. IDK if it is better to update the check reporter, or add a new reporter.

gaborcsardi avatar Aug 07 '25 13:08 gaborcsardi

We already have so many reporter who's purpose has been long forgotten that it's easy enough to add one more.

hadley avatar Aug 07 '25 14:08 hadley

Ideally it would replace the check reporter at some point, I think. But definitely makes sense to have a new reporter for it first, and then we'll see later.

I'll work out how to interleave the output from parallel workers. Of course it is not very important to implement this in the imminent release, it is not urgent at all.

Parallel testthat is actually great for crashes, because the crash is in the subprocess and we can report it correctly in the main process. It is also OK to buffer the test results and output from the workers, nothing will be lost or reported at the wrong place or time.

gaborcsardi avatar Aug 13 '25 13:08 gaborcsardi