multicoretests Dynarray tests

(This PR is based on top of #462, which it depends on.)

The new module Dynarray has a rather tricky implementation as it is not designed for parallel use, but still strives to guarantee memory safety even in the event of such incorrect use. This implies a lot of subtle invariants to maintain. My local runs of this branch have exposed a bug in the initial implementation (https://github.com/ocaml/ocaml/pull/12885#discussion_r1568976695).

Jun 07 '24 16:06 OlivierNicole

Thanks a bunch for sharing this! :pray:

Here's a CI summary as I was seeing several red lights:

32bit 5.2 and 32bit trunk failed to compile with 0xdeadbeef out of int's range
linux-arm64-5.1, linux-s390x-5.1, freebsd-amd64-5.1 failed with Error: Unbound module Dynarray as Dynarray was a 5.2 addition
the remaining 24 non-opam-install workflows failed with Unbound value Dynarray.blit as the new binding has not yet been merged to 5.3/trunk: https://github.com/ocaml/ocaml/pull/13197

Out of 36 workflows the 29 test-running ones failed with compile-time errors

Jun 19 '24 19:06 jmid

I just pushed a few changes, partly to address some of the red lights. I still expect a 32-bit 0xdeadbeef out-of-range though.

Jun 19 '24 19:06 jmid

Thanks for taking care of that!

I still expect a 32-bit 0xdeadbeef out-of-range though.

We should probably replace it with 0xcafe, or something.

Also, the Lin test was triggering segfaults, presumably due to a leftover “operation” fake which was doing some extremely unsafe things, probably to trigger exactly that, and should be removed. Sorry for leaving it in the PR, I just pushed a commit to remove it.

Jun 20 '24 08:06 OlivierNicole

Remaining problems:

The STM tests fail as they are supposed to. Instead of agree_test_par and stress_test, we should have neg_agree_test_par and stress_test.
There seems to be some exceptions raised by computations in the model, which should not happen.

I don’t have the bandwidth to look at it today, but hopefully shortly.

Jun 20 '24 11:06 OlivierNicole

When devising a system to maintain a pool of Dynarrays as my SUT, I have neglected an issue with shrinking.

Some commands cause a new Dynarray to be added to the pool and others take one or more Dynarrays as input. These Dynarray arguments are represented as list indices; when generating commands, care is taken to generate indices that are valid indices in the current list of arrays, given the previous commands.

But when shrinking, some of the commands are arbitrarily removed, which leads to some of these indices to become invalid at the moment of the command’s execution and causes unexpected exceptions.

Jul 26 '24 09:07 OlivierNicole

I've taken a stab at this. I realized the same indexing issue with shrinking as you, and have (in parallel) Util.protected all indexed Dynarray calls. Furthermore I observed this taking a long time locally, so after tracing this to too long Dynarrays being created, I've adjusted the generator accordingly.

The result now completes in a reasonable time:

$ dune exec src/dynarray/stm_tests.exe -- -v
random seed: 287305118
generated error fail pass / total     time test name
[✓] 1000    0    0 1000 / 1000     0.0s STM Dynarray test sequential agreement (int)
[✓] 1000    0    0 1000 / 1000    86.9s STM Dynarray stress test (int)
[✓] 1000    0    0 1000 / 1000     0.0s STM Dynarray test sequential agreement (float)
[✓] 1000    0    0 1000 / 1000    65.6s STM Dynarray stress test (float)
================================================================================
success (ran 4 tests)

Finally, I've rebased on main - and force pushed (sorry!), with an eye to your changes, e.g., incorporating your equal_idx function.

Dec 19 '24 15:12 jmid

I was a bit blocked by the shrinking issue and hadn’t thought of Util.protect as a solution.

Thanks for doing what I neglected to do! This looks great to me.

Dec 19 '24 16:12 OlivierNicole

CI summary: all 33 workflows green.

Dec 20 '24 06:12 jmid

CI summary for merge to main:

Bytecode 5.3 timed out due to excessive shrinking in STM Weak test parallel #498

Out of 34 workflows, 1 failed with a false alarm (timeout)

Dec 20 '24 10:12 jmid