More tests for `picos.sync`
I may have observed the `picos.sync` tests deadlocking or livelocking, at least on (32-bit) OCaml 4.14 on CI. This might indicate a bug in the `picos.sync` library, a bug in the test, a bug in (32-bit) OCaml 4.14 (I don't recall seeing the test fail to complete on other OCaml versions, but I might have simply missed that), or something completely unrelated (such as the test machine being slow for some other reason). At any rate, this needs to be investigated further and the correctness of the `picos.sync` library implementation ensured.
Observations:
- `debian-12-4.14_arm32_opam-2.1` (not completed after 24+ minutes, completed very quickly after cancel+rebuild)
- If `thread-local-storage` is (for some reason) not installed, it was possible, before #110, to build a non-working set of libraries where the mutex cancelation test and benchmarks did not terminate. This shouldn't really be the case with the observed non-completion.
- Tried running the picos_sync test repeatedly in parallel (dozen or so) with OCaml 4.14.2 on macOS with M1. Did not get any lockups within a few hours.
- `debian-12-4.14_arm64_opam-2.1` (not completed in an hour)
- `debian-12-4.14_opam-2.2` (seemed to be stuck in the cancelation test)
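
For anyone wanting to repeat the parallel-run experiment from the observations above, here is a minimal sketch of a harness that starts several copies of a test executable and counts the ones that do not finish before a deadline. It is not part of the picos repository. The name `run_batch`, the 600-second timeout, and the polling interval are assumptions made for illustration; only the "dozen or so" copies comes from the observation itself. It uses only the standard `Unix` library.

```ocaml
(* Hypothetical stress harness, not part of picos.  Build with e.g.
   ocamlfind ocamlopt -package unix -linkpkg stress.ml -o stress *)

(* Start [copies] instances of the command [argv] (program name in
   [argv.(0)]) and return how many were still running after [timeout_s]
   seconds.  Stuck instances are killed and reaped. *)
let run_batch ~argv ~copies ~timeout_s =
  let pids =
    List.init copies (fun _ ->
        Unix.create_process argv.(0) argv Unix.stdin Unix.stdout Unix.stderr)
  in
  let deadline = Unix.gettimeofday () +. timeout_s in
  (* Poll without blocking until all children exit or the deadline passes. *)
  let rec wait pending =
    if pending = [] then []
    else if Unix.gettimeofday () > deadline then pending
    else begin
      let still_running pid =
        match Unix.waitpid [ Unix.WNOHANG ] pid with
        | 0, _ -> true (* no state change yet *)
        | _ -> false (* child exited and has been reaped *)
      in
      let pending = List.filter still_running pending in
      Unix.sleepf 0.1;
      wait pending
    end
  in
  let stuck = wait pids in
  List.iter
    (fun pid ->
      (try Unix.kill pid Sys.sigkill with Unix.Unix_error _ -> ());
      ignore (Unix.waitpid [] pid))
    stuck;
  List.length stuck

(* When given a command line, run a dozen copies of it with an assumed
   10 minute timeout and exit non-zero if any copy got stuck. *)
let () =
  if Array.length Sys.argv > 1 then begin
    let argv = Array.sub Sys.argv 1 (Array.length Sys.argv - 1) in
    let copies = 12 in
    let stuck = run_batch ~argv ~copies ~timeout_s:600.0 in
    Printf.printf "%d of %d runs did not complete in time\n" stuck copies;
    if stuck > 0 then exit 1
  end
```

Invoked as `./stress <path-to-test-executable>`, it runs one batch; wrapping the invocation in a shell loop would approximate the multi-hour runs. A timeout only shows that a run did not finish in time, so it cannot by itself distinguish a deadlock from a livelock or a slow machine.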
It might be that the issue was related to the cancelation test spawning fibers, which translate to systhreads on OCaml 4. PR #230 changes the tests to not spawn fibers. Time will tell whether this eliminates the hangs on OCaml 4.
Addition: There was a test run where the cancelation test did not seem to complete on 4.14 arm64. Not spawning lots of systhreads seems to have made the failures less common.
@edwintorok mentioned the `pthread_cond_wait` bug, which might be the cause of the issues.