dynamorio
ASSERT tool.drcachesim.scattergather test: tracer.cpp:394: towrite <= ipc_pipe.get_atomic_write_size() && towrite > 0
tool.drcachesim.scattergather failed:
https://github.com/DynamoRIO/dynamorio/runs/5077131960?check_suite_focus=true
ASSERT FAILURE: /home/runner/work/dynamorio/dynamorio/clients/drcachesim/tracer/tracer.cpp:394: towrite <= ipc_pipe.get_atomic_write_size() && towrite > 0 ()
I don't remember ever seeing that assert before on any drcachesim test.
This was the 32-bit x86 test.
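For readers unfamiliar with the condition in that assert, here is a minimal sketch of the invariant it checks: each pipe write must be non-empty and no larger than the pipe's atomic write size. This is not the tracer.cpp code. The helper names `is_valid_atomic_write` and `split_into_atomic_chunks` are hypothetical, and the 4096-byte figure in the usage note is an assumed example value (POSIX guarantees atomicity only for pipe writes of at most `PIPE_BUF` bytes).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical predicate mirroring the asserted condition:
//   towrite <= ipc_pipe.get_atomic_write_size() && towrite > 0
// Here atomic_size stands in for the value the pipe reports.
bool is_valid_atomic_write(size_t towrite, size_t atomic_size)
{
    return towrite <= atomic_size && towrite > 0;
}

// Hypothetical helper (not DynamoRIO code): split `total` bytes into chunk
// sizes that each satisfy the predicate above. Returns an empty list when
// there is nothing to write or when atomic_size is 0.
std::vector<size_t> split_into_atomic_chunks(size_t total, size_t atomic_size)
{
    std::vector<size_t> chunks;
    if (atomic_size == 0)
        return chunks;
    while (total > 0) {
        // Take a full atomic-sized chunk, or whatever remains if smaller.
        size_t towrite = total < atomic_size ? total : atomic_size;
        chunks.push_back(towrite);
        total -= towrite;
    }
    return chunks;
}
```

With an assumed atomic size of 4096, `split_into_atomic_chunks(10000, 4096)` yields three chunks of 4096, 4096, and 1808 bytes, each of which passes `is_valid_atomic_write`. The assert in this issue firing would mean some write size fell outside that range, either 0 or larger than the atomic size.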
Happened again: https://github.com/DynamoRIO/dynamorio/runs/5131569686
I ran the 32-bit test 1000 times on my machine, and couldn't reproduce the failure:
$ ctest -VV -R 'tool.drcachesim.scattergather' --repeat-until-fail 1000
...
1/1 Test #284: code_api|tool.drcachesim.scattergather ... Passed 1.51 sec
The following tests passed:
code_api|tool.drcachesim.scattergather
100% tests passed, 0 tests failed out of 1
Total Test time (real) = 1547.39 sec
I can try running it on the GitHub Actions (GA) runner using tmate. In the past, we've had failures that reproduce only on GA.
I ran this 1000 times on a GA runner and encountered no failure. Maybe it reproduces only on some runners? If so, I'll need to try this multiple times.
$ ctest -VV -R 'tool.drcachesim.scattergather' --repeat-until-fail 1000
...
1/1 Test #286: code_api|tool.drcachesim.scattergather ... Passed 2.06 sec
The following tests passed:
code_api|tool.drcachesim.scattergather
100% tests passed, 0 tests failed out of 1
Total Test time (real) = 2024.27 sec
My PR #5515 failed to pass test 286 multiple times on the x86_32 action. https://github.com/DynamoRIO/dynamorio/runs/7048743130?check_suite_focus=true
Looks like the following tests are failing deterministically on my machine (and Derek's) now:
221 - code_api|client.drx-scattergather (Failed)
248 - code_api|sample.memval_simple_scattergather (Failed)
307 - code_api|tool.drcachesim.scattergather (Failed)
Would make it easier to debug! Will take it up sometime soon.
Xref a different failure in the same test: #5747
The failure I found on my machine was due to a different error, which I fix in #5764. The original assert failure in this issue still shows up on our Ubuntu22 CI, though.
This test can fail with this pipe assert 3 times in a row, so it fails even with the retry-3x feature from #2204, as seen for #5873.