
`Io.net.test.test.listen on a unix socket, send bytes, receive bytes` randomly times out in Windows CI

Open alexrp opened this issue 1 month ago • 1 comments

test
+- test-modules
   +- test-std
      +- run test std-native-znver5-Debug 2887 pass, 55 skip, 1 timeout (2943 total)
error: 'Io.net.test.test.listen on a unix socket, send bytes, receive bytes' timed out after 30m4.367ms
failed command: "C:\\Users\\CI\\.cache\\act\\d1f615ab3aa58be1\\hostexecutor\\zig-local-cache\\o\\ac92321770b9d6f66b8510419ec4e4a4\\test.exe" "--cache-dir=C:\\Users\\CI\\.cache\\act\\d1f615ab3aa58be1\\hostexecutor\\zig-local-cache" --seed=0xe7997a2c --listen=-
Build Summary: 5147/5375 steps succeeded (224 skipped, 1 failed); 59646/60953 tests passed (1306 skipped, 1 timed out)
test transitive failure
+- test-modules transitive failure
   +- test-std transitive failure
      +- run test std-native-znver5-Debug 2887 pass, 55 skip, 1 timeout (2943 total)

The fact that it's always this test in particular makes me suspect that it's not just some Windows scheduler nonsense.

alexrp, Nov 19 '25 20:11

I've been running this test repeatedly until it hangs, and I can confirm that the hang shows up somewhere between 50 and 500 runs.
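For anyone who wants to reproduce this, a stress loop along the following lines is one way to do it. This is a hypothetical Python sketch, not the script used for the numbers above: `run_until_hang`, `max_runs`, and `timeout_s` are made-up names, and the per-run timeout is an arbitrary choice that should simply be far longer than a healthy run.

```python
import subprocess


def run_until_hang(cmd, max_runs=500, timeout_s=60.0):
    """Run `cmd` repeatedly until one run hangs or fails.

    Returns (run_number, reason) for the first bad run, where run_number is
    1-based and reason is "timeout" or "exit <code>". Returns None if all
    `max_runs` runs finish cleanly.
    """
    for run in range(1, max_runs + 1):
        try:
            # subprocess.run kills the child if the timeout expires,
            # so a hung run does not leave a stray process behind.
            result = subprocess.run(
                cmd,
                timeout=timeout_s,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except subprocess.TimeoutExpired:
            return run, "timeout"
        if result.returncode != 0:
            return run, "exit %d" % result.returncode
    return None
```

Here `cmd` would be the test binary plus whatever arguments select this one test. A return value such as `(137, "timeout")` would mean the 137th run was the first one to hang.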

It seems to be related to the socket being closed on the client side: if I delay the close, I haven't been able to make the test hang, even with just a one-millisecond delay between the write and the deferred call that closes the socket.

The Windows call that hangs is WSAGetOverlappedResult in https://github.com/ziglang/zig/blob/master/lib/std/Io/Threaded.zig#L3984-L3990. It is called in blocking mode. Normally that wouldn't be a big problem, but it seems to run into an issue with the AF_UNIX socket implementation in Windows. If we call it in non-blocking mode instead, it keeps returning IO_INCOMPLETE.

The IO_pending calls work most of the time, even with an AF_UNIX socket.

I tried to fix it by polling the non-blocking version, but that doesn't work without a timeout: it still hangs. So I'm a bit stuck on how to take this further.
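To make the "polling with a timeout" idea concrete, here is a minimal, language-neutral sketch in Python. It is not the Zig code from `Threaded.zig` and not a proposed fix. `poll_until` is a hypothetical helper, and `check` is a hypothetical stand-in for one non-blocking WSAGetOverlappedResult call (`fWait` set to FALSE) that reports whether the operation has completed. The deadline only bounds the wait; it does not explain why the operation never completes.

```python
import time


def poll_until(check, timeout_s, interval_s=0.001):
    """Call `check()` until it reports completion or the deadline passes.

    `check` must return a (done, value) pair. It stands in for a single
    non-blocking completion query; this helper never blocks inside it.
    """
    # Use a monotonic clock so wall-clock adjustments cannot extend the wait.
    deadline = time.monotonic() + timeout_s
    while True:
        done, value = check()
        if done:
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError("operation still incomplete after %.3fs" % timeout_s)
        # Sleep briefly between polls instead of spinning on the CPU.
        time.sleep(interval_s)
```

With a loop shaped like this, a request that stays incomplete forever turns into an error the caller can handle, instead of hanging the test for the full 30 minutes.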

gero3, Nov 22 '25 20:11