
How can I generate a beautiful report from `dune runtest`?

Open kindaro opened this issue 5 years ago • 9 comments

I have a bunch of custom tests. In the future I may also have cram tests. I have written a dune file that looks like this:

(test
 (name empty_expressions)
 (libraries utest lib)
)

This file should run a single custom test. I have other tests in other directories. When I type dune runtest, I may see something like this:

% dune runtest --force
Entering directory '/srv/src/coq'
empty_expressions alias test-suite/unit-tests/coqpp/runtest (exit 1)
(cd _build/default/test-suite/unit-tests/coqpp && ./empty_expressions.exe)
Done: 789/793 (jobs: 1)

This means that one test failed. But it is not quite readable. Compare with the output of another test suite that I have previously written in a PureScript project:

→ Suite: Invertibility
  → Suite: (Options { allNullaryToStringTag: true, sumEncoding: (TaggedObject { contentsFieldName: "contents", tagFieldName: "tag" }), tagSingleConstructors: false })
    ☠ Failed: SingleNullary as {} because expected (Right SingleNullary), got (Left "When decoding a SingleNullary: Expected an empty array!")
    ☠ Failed: (SingleUnary 1) as {} because expected (Right (SingleUnary 1)), got (Left "When decoding a SingleUnary: Value is not a Number")
    ☠ Failed: (SingleBinary 1 2) as {} because expected (Right (SingleBinary 1 2)), got (Left "When decoding a SingleBinary: Value is not a Number")
    ✓ Passed: (RecordUnary { recordUnaryField1: 1 }) as {"recordUnaryField1":1}
    ✓ Passed: (RecordBinary { recordBinaryField1: 1, recordBinaryField2: 2 }) as {"recordBinaryField2":2,"recordBinaryField1":1}

How can I have a similarly gorgeous report with a dune-based test suite? Note that dune knows the directory structure and the name of the custom test, so it has the information needed to draw the hierarchy and to refer to each test by name. It may also read the output of the program and display it as the reason for the failure.

I could make my custom tests write something beautiful to the console, but that would not extend to cram tests, would not be aware of the directory structure, would not enforce uniformity, and would not mix nicely with other dune output.

kindaro avatar Nov 11 '20 15:11 kindaro

This was discussed in the dune meeting and we acknowledge that this is a problem. Unlike in PureScript, there's no unified test library for OCaml, so we need a more flexible solution.

We think that the most appropriate solution is a custom output format for binaries that dune can understand and present to the user. We don't have the time to implement such a thing, but Arseniy & Cameron agreed to think about a possible design. Then someone else can tackle the implementation if they're interested in this feature.

The proposed design should address the following points:

  • The output should be versioned.
  • Dune should have a way to distinguish between binaries outputting structured output vs. normal output.
  • Binaries that write structured output should still be usable outside of dune.
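
As a rough illustration of the first two points, a structured-output protocol could begin with a versioned header line that dune checks before deciding how to treat the rest of the output. The header text `dune-test-output v<N>`, the type `output_kind`, and the function `classify` are all assumptions made for this sketch; no such format exists in dune today.

```ocaml
(* Hypothetical convention: a test that speaks the structured protocol
   prints "dune-test-output v<N>" as its first line of stdout. *)
type output_kind =
  | Structured of int  (* protocol version announced by the binary *)
  | Plain              (* ordinary output, shown to the user unchanged *)

let magic = "dune-test-output v"

(* Decide how to treat a test's output from its first line alone. *)
let classify (first_line : string) : output_kind =
  let n = String.length magic in
  let len = String.length first_line in
  if len > n && String.sub first_line 0 n = magic then
    match int_of_string_opt (String.sub first_line n (len - n)) with
    | Some v when v > 0 -> Structured v
    | _ -> Plain
  else Plain
```

Anything that does not start with the header falls back to `Plain`, so existing test binaries would keep working unmodified.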

rgrinberg avatar Nov 11 '20 17:11 rgrinberg

Your points address the question of interaction between dune and a single check. How about the bigger picture?

  • The first step might be to put some green checks and red crosses on the screen. For extra points:
    • Hierarchically indented.
    • Accompanied with the names of the checks as defined in dune files.
  • As I see it, the system should overarch both custom and cram tests.
  • What if tests are mixed with other tasks in a given run, and those tasks also want to write something on the screen? Output would be mangled.

I hope the design would address these as well.

kindaro avatar Nov 11 '20 19:11 kindaro

The first step might be to put some green checks and red crosses on the screen. For extra points:

I agree. However, we can only start dealing with visuals once we have some data to work with.

As I see it, the system should overarch both custom and cram tests.

We can certainly try to adapt the cram tests to write structured output.

What if tests are mixed with other tasks in a given run, and those tasks also want to write something on the screen? Output would be mangled.

It will not be mangled because dune does not let tasks write something on the screen.

I hope the design would address these as well.

Your input is appreciated here. Are you interested in helping out with the implementation as well?

rgrinberg avatar Nov 11 '20 19:11 rgrinberg

One way of getting test reports with custom formatting is by writing custom rules that run tests and generate those reports.

The downside is that you can't make dune treat those errors as errors, and therefore dune will cache the results, which is not something you want for non-deterministic failures.

Another downside is that you won't be able to see the report generated incrementally, only as a whole when everything is done.

If those downsides are not a problem, the upside is that the report can be completely custom and designed independently from dune itself.

aalekseyev avatar Nov 11 '20 19:11 aalekseyev

The first step might be to put some green checks and red crosses on the screen. For extra points:

I agree. However, we can only start dealing with visuals once we have some data to work with.

We have one bit of data from each custom check: is the exit status zero? This is enough for the view that I propose. I suppose cram tests can also be distilled to a bit in some way.
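
A minimal sketch of that "one bit" view, assuming some tool already has a list of (directory, test name, exit status) triples. The record type `check` and the function `render` are hypothetical names for illustration; dune exposes no such data today.

```ocaml
(* One entry per custom test: the directory of its dune file, the test
   name from the (test) stanza, and the exit status of the executable. *)
type check = { dir : string; name : string; exit_code : int }

(* Group checks by directory and print one indented line per test,
   marking it as passed when the exit status is zero. *)
let render (checks : check list) : string =
  let buf = Buffer.create 256 in
  let last_dir = ref None in
  List.iter
    (fun c ->
      if !last_dir <> Some c.dir then begin
        Buffer.add_string buf (Printf.sprintf "→ %s\n" c.dir);
        last_dir := Some c.dir
      end;
      let mark = if c.exit_code = 0 then "✓ Passed" else "☠ Failed" in
      Buffer.add_string buf (Printf.sprintf "  %s: %s\n" mark c.name))
    (List.sort (fun a b -> compare (a.dir, a.name) (b.dir, b.name)) checks);
  Buffer.contents buf

let () =
  print_string
    (render
       [ { dir = "test-suite/unit-tests/coqpp";
           name = "empty_expressions";
           exit_code = 1 } ])
```

For the failing test from the original report, this prints the directory on one line and `☠ Failed: empty_expressions` indented beneath it.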

Are you interested in helping out with the implementation as well?

I have already made too many promises to open source communities that I now have a hard time fulfilling, so I should refrain from making more. It also strategically depends on the success of my coöperation with the Coq team and on whether the design is simple or complicated.

kindaro avatar Nov 11 '20 19:11 kindaro

I belatedly realize: since test is merely an alias for action, dune has no way of knowing ahead of time which tasks are custom tests. The only way dune may find that out is if, upon being run, the program that is a custom test advertises itself as such by printing a «magic string». This means that my «one bit» design cannot work without the custom output format feature.


Possibly dune may set a pre-defined environment variable when running actions? Upon seeing such a variable being set, a custom test would output the magic string. Otherwise it may output whatever it pleases. That would ensure the requirement:

Binaries that write structured output should still be usable outside of dune.

— Is satisfied.

kindaro avatar Nov 13 '20 10:11 kindaro

@kindaro, indeed that's the case today; however, it is easy to have a rule (I think you mean rule above, not action) produce test targets.

ejgallego avatar Nov 13 '20 17:11 ejgallego

Possibly dune may set a pre-defined environment variable when running actions? Upon seeing such a variable being set, a custom test would output the magic string. Otherwise it may output whatever it pleases. That would ensure the requirement:

That makes sense, but note that it would be completely transparent to the user. E.g. in the test, the user would call something like:

val start_test : name:string -> unit

And this would produce the output wherever dune needs it only if INSIDE_DUNE is set. So the user doesn't need to know about any environment variables.
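
A minimal sketch of how such a library function might behave, assuming a made-up marker line. The prefix `##dune-test` and the helper `start_line` are hypothetical; only the `start_test` signature and the `INSIDE_DUNE` variable come from the comment above.

```ocaml
(* Hypothetical marker that dune would look for; not an existing protocol. *)
let magic_prefix = "##dune-test"

(* Pure helper: compute the line to emit, if any. Kept separate from the
   environment lookup so the decision is easy to test. *)
let start_line ~(inside_dune : bool) ~(name : string) : string option =
  if inside_dune then Some (Printf.sprintf "%s start %s" magic_prefix name)
  else None

(* Announce the start of a test, but only when running under dune. *)
let start_test ~(name : string) : unit =
  let inside_dune = Sys.getenv_opt "INSIDE_DUNE" <> None in
  match start_line ~inside_dune ~name with
  | Some line -> print_endline line
  | None -> ()
```

Run outside of dune, `start_test` prints nothing, so the binary's output stays readable when invoked by hand.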

rgrinberg avatar Nov 13 '20 20:11 rgrinberg

Have we made any progress on this issue in the last 5 years? If not, how difficult would it be to report at the end of the build how many cram tests have failed/passed/skipped?

Kakadu avatar May 28 '25 08:05 Kakadu