multicoretests
multicoretests copied to clipboard
Specifications can be terser through more type trickery
As I was writing an STM specification for Dynarrays, I was a bit frustrated to have to add a catch-all assert false at the end of postcond. So I went into an over-engineering rabbit hole and found that it can be avoided by making the cmd type of specifications a GADT, which allows to specify the result type for commands. The typechecker is then able to deduce the result type when pattern matching on commands.
As an example, I have rewritten src/atomic/stm_tests.ml to match the new Spec signature.
This has a number of benefits and some drawbacks.
Benefits
- When writing the
runfunction, results no longer need to be wrappend into theResconstructor. However, it is still necessary to also return an instance of'a ty_show; I have not found a simpler way to make the results printable. - When writing the
postcondfunction, it is no longer necessary to match on patterns likeRes ((Result (Int, Exn), _), v)for the typechecker to deduce the right type forv: the command name suffices. It is also no longer necessary to add a catch-all_ -> assert falseat the end. - Functions like
runare safer because the result type is checked.
Drawbacks
Defining a GADT in the specification adds its own boilerplate: most functions now need explicit type annotations with universal quantifiers; the user must also define an existential type packed_cmd and wrap the command generators in it. Such STM tests are harder to write when one is not familiar with qcheck-stm.
And of course, another big drawback is that it changes the signature of STM specifications.
For these reasons, I open this PR as a draft, mostly to share the idea.
This is excellent Olivier, thanks a bunch! :pray: I was hoping to pull off something like this initially, but couldn't quite get the type-encoding working...
As much as I love and agree with the benefits and in particular the added safety, some of the additional user-input required sticks out to me, e.g., the 'a cmd vs packed_cmd distinction. I think I see why you need this:
- a packed (existential) type for
arb_cmdso that those can unify in the generator, and - a GADT
'a cmdso that we can exhaustively match inpostcondand capture the typed result precisely inrun
At the same time, as an end-user I probably don't want to care for such a cmd distinction... :shrug:
QCheck's Test.make is an example of a hook that sweeps an existential type parameter under the rug for 95% of end-users. I would love if we could pull off something similar here.
I'm not too concerned about breaking the API at this point, if we can arrive at something nice.
There are only few users outside this repo (lockfree and ortac-qcheck-stm) which we should be able to help migrate.
This makes we wonder, whether it is possible to pull something like this off if we permit ourselves an API redesign - even a radical one.
- Initially I went with
arb_gen : cmd arbitrarybecause that allowed to capture both generation and printer in one (like QCheck). In a redesigned API, this is not set in stone. For QCheck2, theshowfunction is decoupled from the generator. - A more radically different API could couple the generator,
show-function,run, andpre/postcondto the individual command, similar to Hedgehog Haskell or Hedgehog Scala. Not sure that would be a good fit for OCaml though. Kotlin's propCheck decouples them more along the lines ofSTM... :thinking:
At the same time, as an end-user I probably don't want to care for such a
cmddistinction... 🤷 QCheck'sTest.makeis an example of a hook that sweeps an existential type parameter under the rug for 95% of end-users. I would love if we could pull off something similar here.
I would also love to sweep the existential under the rug, but this seems to me to clash with the fact that the spec writer needs to write a command generator. Since, as you say, the typechecker needs to unify all possible command types there.
Perhaps we could hide it in a function similar to Test.make, but what would the type of that function be?
In the latest commit, the existential type is hidden behind a module with a nicer interface. The new compromise is that the sure must instantiate a small functor—see the new src/atomic/stm_tests.ml—and the distinction between 'r cmd and Cmd.any still manifests itself when writing the generator.
The result is maybe syntactically closer to the existing, and hopefully the user does not have to deal with the cryptic error messages that GADTs can give.
A good news is that the same tricks (not implemented in this PR, but I have it on the side) allow to get rid of the calls to Obj.magic in Lin:
https://github.com/ocaml-multicore/multicoretests/blob/c9930453f1ab8ee38075c48182fe070dc454e05d/lib/lin.ml#L422-L427
without changing the tests that use the high-level Lin interface in any way. Tests that use Lin.Internal do have to be adapted (in much the same way as in this PR), but given the unstable nature of that interface, I feel this will be less of an issue.