Linux futex (v1 and v2) API fixes, tests and Ziggification
linux: futex v1 API cleanup
- Use Ziggish `packed struct` for flags arguments. Old: `linux.FUTEX.WAIT` vs new: `.{ .cmd = .WAIT, .private = false }`.
- Rename `futex_wait` and `futex_wake`, which didn't actually specify wait/wake, as `futex_3arg` and `futex_4arg` (as it's the number of parameters that is different; the actual `op` is the second parameter).
- Provide the full six-arg flavor of the syscall (for some of the advanced ops), and add packed structs for the flag-ish parameters.
- Use a `packed union` to support the 4th parameter, which is sometimes a `timespec` pointer and sometimes a 32-bit value.
- Add tests that make sure the structure layout is correct and that the basic argument passing is working (no actual futexes are contended).
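As a rough illustration of the packed-struct encoding from the first bullet, the sketch below shows how a flags struct can map onto the raw 32-bit futex op word. `Cmd` and `FutexOp` are hypothetical names chosen for this example, not the definitions in `std.os.linux`; the bit positions assume the Linux futex(2) layout (bits 0-6 command, bit 7 `FUTEX_PRIVATE_FLAG`, bit 8 `FUTEX_CLOCK_REALTIME`).

```zig
const std = @import("std");

// Hypothetical command enum; only two of the futex v1 commands are listed.
const Cmd = enum(u7) {
    WAIT = 0,
    WAKE = 1,
    _,
};

// Hypothetical flags struct. Packed struct fields are laid out starting at
// the least significant bit, so `cmd` occupies bits 0-6.
const FutexOp = packed struct(u32) {
    cmd: Cmd,
    private: bool, // bit 7: FUTEX_PRIVATE_FLAG (128)
    realtime: bool = false, // bit 8: FUTEX_CLOCK_REALTIME (256)
    _reserved: u23 = 0,
};

test "FutexOp matches the raw op word" {
    const op: FutexOp = .{ .cmd = .WAKE, .private = true };
    // FUTEX_WAKE (1) | FUTEX_PRIVATE_FLAG (128)
    try std.testing.expectEqual(@as(u32, 1 | 128), @as(u32, @bitCast(op)));
}
```

A layout test of this shape is the kind of check the last bullet describes: it exercises the encoding without contending on any real futex.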
linux: futex v2 API cleanup
- `futex2_waitv` always takes a 64-bit timespec. Perhaps the `kernel_timespec` should be renamed `timespec64` now? It's used in io_uring, too.
- Add Ziggish `packed struct` encoding for futex v2 flag parameters.
- Add very basic "tests" for the futex v2 syscalls (these found the 64-bit timespec bug).
- Update the stale or broken comments. (I could also just delete these; they're not really documenting Zig-specific behavior.)
Given that the futex2 APIs are not used by Zig's library (they're a bit too new), and the fact that these are very specialized syscalls, and they currently provide no strong benefits over the existing v1 API, it might be prudent to just delete them entirely. If you're fancy enough to build stuff on the futex API, you're more than capable of writing your own syscall wrappers ...
Hmm, not clear what's going on with aarch64-linux-release here. Maybe OOM killer victim?
This change seems to be causing tests to fail in zig v0.15.1 for me?
test
+- test-modules
+- test-std
+- run test std-native-znver3-ReleaseSmall-libc 2915/2944 passed, 3 failed, 26 skipped
error: 'os.linux.test.test.futex2_wait' failed: expected .AGAIN, found .PERM
error: 'os.linux.test.test.futex2_wake' failed: expected 0, found 18446744073709551615
error: 'os.linux.test.test.futex2_requeue' failed: expected 0, found 18446744073709551615
error: while executing test 'zig.system.darwin.macos.test.detect', the following test command failed:
./.zig-cache/o/78908d7a3430ac8c56ce5b8b8b9ab4d6/test --cache-dir=./.zig-cache --seed=0x981d8403 --listen=-
test
+- test-modules
+- test-std
+- run test std-native-znver3-ReleaseSmall-single 2885/2942 passed, 3 failed, 54 skipped
error: 'os.linux.test.test.futex2_wait' failed: expected .AGAIN, found .PERM
error: 'os.linux.test.test.futex2_wake' failed: expected 0, found 18446744073709551615
error: 'os.linux.test.test.futex2_requeue' failed: expected 0, found 18446744073709551615
error: while executing test 'zig.system.darwin.macos.test.detect', the following test command failed:
./.zig-cache/o/22f9d114fc81d6681d809ceacf4d4868/test --cache-dir=./.zig-cache --seed=0x981d8403 --listen=-
test
+- test-modules
+- test-std
+- run test std-native-znver3-ReleaseSmall 2916/2944 passed, 3 failed, 25 skipped
error: 'os.linux.test.test.futex2_wait' failed: expected .AGAIN, found .PERM
error: 'os.linux.test.test.futex2_wake' failed: expected 0, found 18446744073709551615
error: 'os.linux.test.test.futex2_requeue' failed: expected 0, found 18446744073709551615
error: while executing test 'zig.system.darwin.macos.test.detect', the following test command failed:
./.zig-cache/o/8fa39f020ebf81a5b51a8cfb5ce00fcf/test --cache-dir=./.zig-cache --seed=0x981d8403 --listen=-
test
+- test-modules
+- test-std
+- run test std-native-znver3-Debug-libc 2916/2944 passed, 3 failed, 25 skipped
error: 'os.linux.test.test.futex2_wait' failed: expected .AGAIN, found .PERM
/build/zig/src/zig-0.15.1/lib/std/testing.zig:110:17: 0x396ed1a in expectEqualInner__anon_1287536 (std.zig)
return error.TestExpectedEqual;
^
/build/zig/src/zig-0.15.1/lib/std/os/linux/test.zig:371:5: 0x397078e in test.futex2_wait (std.zig)
try expectEqual(.AGAIN, linux.E.init(rc));
^
error: 'os.linux.test.test.futex2_wake' failed: expected 0, found 18446744073709551615
/build/zig/src/zig-0.15.1/lib/std/testing.zig:110:17: 0x1378c99 in expectEqualInner__anon_42895 (std.zig)
return error.TestExpectedEqual;
^
/build/zig/src/zig-0.15.1/lib/std/os/linux/test.zig:402:5: 0x39725d0 in test.futex2_wake (std.zig)
try expectEqual(0, rc);
^
error: 'os.linux.test.test.futex2_requeue' failed: expected 0, found 18446744073709551615
/build/zig/src/zig-0.15.1/lib/std/testing.zig:110:17: 0x1378c99 in expectEqualInner__anon_42895 (std.zig)
return error.TestExpectedEqual;
^
/build/zig/src/zig-0.15.1/lib/std/os/linux/test.zig:427:5: 0x397292e in test.futex2_requeue (std.zig)
try expectEqual(0, rc);
^
error: while executing test 'zig.system.darwin.macos.test.detect', the following test command failed:
./.zig-cache/o/3aed4d36a684aa3a18c4bf51abf424cc/test --cache-dir=./.zig-cache --seed=0x981d8403 --listen=-
test
uname -r?
> uname -r?
6.16.0-arch2-1
Huh, the EPERM failure (where EAGAIN is expected) is suspicious. The documentation says that EPERM should only happen with PI (priority inheriting) futexes, and the test is not intentionally testing those. Are you running in an environment or something that might be restricting syscalls somehow?
On the other hand, the futex2 syscalls aren't supported on many systems, so they haven't been getting all that much testing coverage.
I believe the specific test line that is failing is:
rc = linux.futex2_wait(&lock.raw, 2, mask, flags, null, .MONOTONIC);
Given that the lock is initialized to 1, this futex operation should return immediately (as it's expecting the lock to be 2), so even if there are any flags set on the lock, I'd be a bit surprised if any of them are checked. So my suspicion is that the EPERM is coming from some other layer.
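For comparison, here is a minimal sketch of the same "expected value does not match" check using the v1 `futex` syscall directly rather than `futex2_wait`. It assumes x86_64 Linux (where `EAGAIN` is 11) and an unrestricted environment; the raw `syscall4` call is used so the sketch does not depend on the wrapper names touched by this PR.

```zig
const std = @import("std");
const builtin = @import("builtin");
const linux = std.os.linux;

test "FUTEX_WAIT with a mismatched expected value returns EAGAIN" {
    if (builtin.os.tag != .linux) return error.SkipZigTest;

    var word: u32 = 1; // the "lock", initialized to 1
    const FUTEX_WAIT_PRIVATE: usize = 0 | 128; // FUTEX_WAIT | FUTEX_PRIVATE_FLAG

    // Ask to wait only if the word is 2. It is 1, so the kernel should
    // return immediately without sleeping (timeout pointer is null).
    const rc = linux.syscall4(.futex, @intFromPtr(&word), FUTEX_WAIT_PRIVATE, 2, 0);

    // Raw syscalls return -errno on failure; EAGAIN is 11 on x86_64.
    const signed: isize = @bitCast(rc);
    try std.testing.expectEqual(@as(isize, -11), signed);
}
```

If this v1 check passes while the futex2 tests fail with EPERM on the same machine, that would point toward something filtering the newer syscall numbers rather than a problem in the futex word itself.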
I'm running kernel v6.16.4, so I don't think it's a recent kernel futex2 change of any sort.
Probably unrelated, but why is the Zig test failure message calling out `zig.system.darwin.macos.test.detect`?
Oh, and I should learn to recognize "18446744073709551615". That is "0xffffffffffffffff", which is -1 when read as a signed value. Raw syscalls return -errno, so that is errno 1, which is ... EPERM.
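The decoding above can be sketched as follows. `errnoFromRaw` is a hypothetical helper written for this example (the failing test itself uses `linux.E.init(rc)`); it assumes the Linux convention that raw return values in the range -4095..-1, viewed as signed, encode -errno.

```zig
const std = @import("std");

// Hypothetical helper: returns the errno encoded in a raw syscall return
// value, or 0 if the value does not encode an error.
fn errnoFromRaw(rc: usize) u16 {
    const signed: isize = @bitCast(rc);
    if (signed < 0 and signed > -4096) return @intCast(-signed);
    return 0;
}

test "all-ones return value decodes to errno 1 (EPERM)" {
    // On a 64-bit target maxInt(usize) is 18446744073709551615.
    const rc: usize = std.math.maxInt(usize);
    try std.testing.expectEqual(@as(u16, 1), errnoFromRaw(rc));
}

test "-11 decodes to errno 11 (EAGAIN on x86_64)" {
    const rc: usize = @bitCast(@as(isize, -11));
    try std.testing.expectEqual(@as(u16, 11), errnoFromRaw(rc));
}
```

So all three failures in the log report the same underlying error: the `futex2_wake` and `futex2_requeue` tests just print it as a raw integer instead of a decoded errno.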
I'm curious to know what's going on here, but one option is to just remove all the futex2 support and tests from Zig. As I noted in the PR, anyone sophisticated enough to build software on the futex2 API is more than capable of invoking a couple of syscalls directly ...
> I'm curious to know what's going on here, but one option is to just remove all the `futex2` support and tests from Zig.
Worst case we can disable the tests, or make them skip under whatever conditions are going on here. Removing the support entirely seems excessive.
But in any case, we need to understand why this is happening; this is the first I've ever heard of it. My suspicion would be an overzealous seccomp filter or something along those lines.