multicoretests Add information to logs with a custom runner

Add a bash script to use as runner for our test suite

This script adds information to the logs:

explicit names of the test run when they start
some explicit messages for crashes (SIGSEGV, SIGBUS, ...)
and, maybe most importantly, anchors in CI logs, so that the main webpage contains the most important information and direct links to precise points of interest in the log

It is written as a bash script but making sure it can be used as a runner also on Windows CI (notably by commenting all ends of lines, without which it fails with errors about '\r's that don't exist...)

Example of result of the changes: the webpage for this run shows directly that we got a SEGV in Ephemeron on trunk (windows). Or this run with SIGABRT in buffers on trunk on macos.

Nov 21 '22 21:11 shym

I'm a bit hesitant about this one. As someone who has scanned CI logs for the past two weeks I certainly follow you on the benefits!

On the negative side,

this adds back in a custom runner now that we finally got rid of the clunky check_error_count
it is in bash - which is untyped, has lots of quirks (witness the newline issue), and has caused problems for people before (I think there was even an incident at JaneStreet)

I'm wondering if this is something

that could/should be handled in dune or in QCheck_base_runner instead
or something we could handle with a custom runner?

I've been thinking of the latter to avoid printing counterexamples twice (with and without return values)...

Nov 25 '22 10:11 jmid

I do agree it’d be nicer to go without bash. I had tried first to see if it would be possible to do in dune, but I couldn’t figure it out (see the fact that the error code shown in windows for a segfault is completely different from what it reports in linux). The custom runner would be the nicest solution, especially if we also gain on the actual output.

Nov 28 '22 17:11 shym

Now, to plead the case for that PR:

we are the end users for the test suite and can make choices that would otherwise be clearly dangerously fragile, so I thought that going with bash would not be a problem if it works in the environments where we run the suite,
as we see in our logs, segfaults are logged differently between Windows and Unix (negative exit value instead of actual signal) and, according to a few small tests, this problem actually comes with using Unix.system or Sys.command; so writing a custom (OCaml) runner will entail merging both behaviours, what bash is already doing for us.

Nov 29 '22 16:11 shym

For the record, I thought using the echo dune stanza could at least provide the feature of having the path of the test being run before it starts but my attempt failed: https://github.com/shym/multicoretests/actions/runs/3591271602/jobs/6045600378#step:10:41

Dec 01 '22 11:12 shym

Overall, I can see benefit to the CI log anchors and would like to salvage them!

I'm wondering whether a good solution would be to extend QCheck to output the right format when running under a suitable environment variable, e.g., QCHECK_GA_CI as that could benefit more QCheck CI users and avoid introducing a specialized runner script. :thinking:

What do you think?

How different an output should be printed to achieve that, e.g., for the below? (I see some "::-header" printing in the PR code)

random seed: 337549857
generated error fail pass / total     time test name
[✓] 1000    0    0 1000 / 1000     1.2s STM Bytes test sequential
[✓]   11    0    1   10 / 1000    57.4s STM Bytes test parallel

--- Info -----------------------------------------------------------------------

Negative test STM Bytes test parallel failed as expected (21 shrink steps):

                             |           
                             |           
                  .---------------------.
                  |                     |           
                To_seq          (Fill (7, 5, 'K'))  


+++ Messages ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Messages for test STM Bytes test parallel:

  Results incompatible with linearized model

                                                                                               |                                            
                                                                                               |                                            
                                                 .------------------------------------------------------------------------------------------.
                                                 |                                                                                          |                                            
     To_seq : ['a'; 'a'; 'a'; 'a'; 'a'; 'a'; 'a'; 'a'; 'K'; 'K'; 'K'; 'K'; 'a'; 'a'; 'a'; 'a']                                (Fill (7, 5, 'K')) : Ok (())                               

================================================================================
success (ran 2 tests)

Jan 31 '23 11:01 jmid

The documentation explains here the syntax to report an error and create the anchor; tldr: echo "::error title=Failure in STM Bytes test parallel::STM Bytes test parallel failed on its 11th run after 57.4 seconds with seed 337549857". That would nice indeed. The bash runner provides one thing particularly useful for multicoretests that would be harder to bring into QCheck: it can report when a test crashes (on a signal). That would require wrapping all our tests in some sort of fork_prop (and ensuring we manage to report it as the proper signal even on Windows, something that dune fails at, by the way).

Jan 31 '23 13:01 shym

Ah, this old brain is finally starting to understand :sweat_smile: That's why you need a "parent process" to catch crashes and print the anchor (rather than just crash and burn).

For the weird Windows signalling, I think we should report it along with test case if possible to the dune developers.

Jan 31 '23 14:01 jmid

BTW, thanks for the documentation link - much appreciated! :pray: Potentially QCheck could ::group ... ::endgroup a list of tests in a test suite in such a mode.

Jan 31 '23 14:01 jmid

multicoretests multicoretests copied to clipboard

Add information to logs with a custom runner

multicoretests
multicoretests copied to clipboard