multicoretests Lin: Print commands that trigger unexpected exceptions

The standard QCheck mechanism to catch exceptions that are raised in tests do not give much clues about the source (ie Lin command) of that exception So this creates a specific Unexpected_exception that will pack the command along with the original exception so that it can be explained when the tests errors out

Before:

exception Invalid_argument("Bytes.blit")

After:

exception Invalid_argument("Bytes.blit") unexpected in command Buffer.add_string t ""

Apr 06 '23 11:04 shym

This is both a nice improvement and elegantly implemented - many thanks! :smiley:

Could you add a test case to test/ to guard against someone (read: me) breaking this by mistake in the future?
I think a CHANGES entry is also called for, as this is a quality-of-life improvement for users!
Finally I'm wondering whether the same approach could apply to STM too...

Apr 12 '23 07:04 jmid

I finally got round to update that PR with your suggestions. It also takes care now to preserve the original backtrace.

Jun 05 '23 15:06 shym

Well, that’s quite a success... ~~I don’t get the same output when running the new internal tests locally and I don’t understand where this can come from.~~ I finally saw that the difference seems to be in the necessary time to run the test! 0.0s locally, 0.1s or 0.2s in CI, depending on the runner.

Jun 05 '23 16:06 shym

The Random module changed between OCaml 4 and 5, and as a result the output distribution changed. Overall that means we cannot expect the same output with the same seed across the two - or the same number of shrinking steps (printed) as a result. :shrug:

Furthermore, the distribution varies between 32 bit and 64 bit architectures. As a result there are 4 possible combinations: OCaml{4,5} * {32,64}-bit). test/mutable_set_v5.ml thus has 4 different expect-files to handle these. It is not particularly pretty... :sweat_smile:

It the present case we may just want to filter the output a bit. I recently learned of dune's (pipe-outputs ...) which can be handy for this sort of thing in combination with sed. There's an example of it here.

Jun 06 '23 08:06 jmid

sed is indeed the path I was following :smile: I might have been a bit radical in my filtering (removing all progress lines), but I expect this is good enough for our purpose.

Jun 06 '23 09:06 shym

The PR now installs a handler on every cmd interpreted in a central hot path of both Lin and STM. This can potentially affect our ability to trigger parallel or concurrency issues, so we should carefully review the CI logs. Did the first Lin version also do so? (I may be misremembering)

It did, but it was inlined at the site of the call. I don’t know how keen the compiler is to inline across modules but the function is modeled after Fun.protect. For Lin DSL, we could move that code and try to use it only when the tested function does not declare it can throw exceptions. On the other hand, the function invocation is already a quite long path in Lin DSL. It is maybe a bit more troublesome, in cases where the model is really simple (ie when next_state and postcond are really cheap).

I fixed your small suggestion, also as a way to trigger another round of testing.

Jun 07 '23 09:06 shym

Proposition of an alternate design for that functionality: maybe we could install the catch-all exception handler only at the very start of the execution of the whole sequence of commands on each domain, and using domain-local storage to store the currently running command. I’m not sure we have a similar safe storage mechanism for systhreads though :sweat_smile: According to @OlivierNicole, installing an exception-handler is only 3 machine instructions, so I’m not sure this alternative would be cheaper. But I could update this PR by replacing uses of tag_exn_with by explicit trys (to avoid the extra function calls in the hot path), calling out a function in Util only in the handler itself. That would duplicate very little code.

Jun 08 '23 10:06 shym

I just updated this PR with the latest proposal: using an inline try with at the various calling sites, sharing only the code that it is run when an exception does appear. This should have the smallest possible impact on runs while catching those exceptions.

Jun 12 '23 11:06 shym