multicoretests

[ocaml5-issue] MSVC timeout and crashes in domain_spawntree - with Atomic

Open jmid opened this issue 1 year ago • 2 comments

While running CI for #458, the MSVC trunk bytecode workflow timed out (deadlocked?) after 6h in domain_spawntree - with Atomic, having run only a few tests:

https://github.com/ocaml-multicore/multicoretests/actions/runs/9115661686/job/25062501460?pr=458#logs

Skipping src/io/lin_internal_tests.exe from the test suite
[...]
Skipping src/neg_tests/lin_internal_tests_effect.exe from the test suite


random seed: 333893351
generated error fail pass / total     time test name

[ ]    0    0    0    0 /  100     0.0s Domain.spawn/join - tak work
[ ]    0    0    0    0 /  100     0.0s Domain.spawn/join - tak work (generating)
[ ]   17    0    0   17 /  100    60.6s Domain.spawn/join - tak work
[ ]   37    0    0   37 /  100   122.7s Domain.spawn/join - tak work
[ ]   57    0    0   57 /  100   188.3s Domain.spawn/join - tak work
[ ]   73    0    0   73 /  100   251.9s Domain.spawn/join - tak work
[ ]   88    0    0   88 /  100   312.9s Domain.spawn/join - tak work
[✓]  100    0    0  100 /  100   347.0s Domain.spawn/join - tak work

[ ]    0    0    0    0 /  500     0.0s Domain.spawn/join - atomic
[ ]   95    0    0   95 /  500    26.1s Domain.spawn/join - atomic
[ ]  302    0    0  302 /  500    86.2s Domain.spawn/join - atomic
[✓]  500    0    0  500 /  500   139.1s Domain.spawn/join - atomic
================================================================================
success (ran 2 tests)

random seed: 476318298
generated error fail pass / total     time test name

[ ]    0    0    0    0 /  100     0.0s domain_spawntree - with Atomic
[ ]    0    0    0    0 /  100     0.0s domain_spawntree - with Atomic (generating)
Error: The operation was canceled.

jmid, May 17 '24 09:05

Saw this again - but now in MSVC native mode - and causing a crash: https://github.com/ocaml-multicore/multicoretests/actions/runs/9126100215/job/25093649961?pr=458

random seed: 251075488
generated error fail pass / total     time test name

[ ]    0    0    0    0 /  100     0.0s domain_spawntree - with Atomic
File "src/domain/dune", line 14, characters 7-23:
14 |  (name domain_spawntree)
            ^^^^^^^^^^^^^^^^
(cd _build/default/src/domain && ./domain_spawntree.exe --verbose)
Command exited with code -1073741819.
[ ]    0    0    0    0 /  100     0.0s domain_spawntree - with Atomic (generating)

jmid, May 17 '24 10:05

Note to self: Exit code -1073741819 corresponds to c0000005

Printf.sprintf "%lx" (-1073741819l);;
- : string = "c0000005"

which indicates STATUS_ACCESS_VIOLATION, i.e., the Windows equivalent of a segfault: https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-erref/596a1078-e883-4972-9bbc-49e60bebca55
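
For future triage, the same conversion can be wrapped in a small helper. This is only a sketch: `ntstatus_hex` and `describe` are hypothetical names, not part of multicoretests, and the lookup table is deliberately tiny. Only the `c0000005` entry comes from this issue; `c00000fd` (STATUS_STACK_OVERFLOW) is added purely as an illustration.

```ocaml
(* Hypothetical helper: reinterpret a signed exit code, as printed by dune,
   as the hex form of a 32-bit NTSTATUS value. *)
let ntstatus_hex (code : int) : string =
  (* Int32.of_int keeps the low 32 bits; %lx prints them as unsigned hex. *)
  Printf.sprintf "%lx" (Int32.of_int code)

(* Illustrative lookup only: the first entry is the one seen in this issue,
   the second is an extra example and not taken from the CI logs. *)
let describe (code : int) : string =
  match ntstatus_hex code with
  | "c0000005" -> "STATUS_ACCESS_VIOLATION"
  | "c00000fd" -> "STATUS_STACK_OVERFLOW"
  | hex -> "unknown NTSTATUS 0x" ^ hex

let () = print_endline (describe (-1073741819))
```

Any code not in the table falls through to the "unknown" case, so the hex value can still be looked up by hand in the Microsoft reference linked above.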

jmid, May 17 '24 11:05