Linux system calls may return unreachable or unexpected error codes
Zig Version
0.9.0
Steps to Reproduce
For this to work you must have a Linux kernel with CONFIG_BPF_KPROBE_OVERRIDE=y. Apparently this is the default on Arch Linux, but for NixOS I had to add:
boot.kernelPatches = [ {
name = "kprobe-bpf-override-config";
patch = null;
extraConfig = ''
FUNCTION_ERROR_INJECTION y
BPF_KPROBE_OVERRIDE y
'';
} ];
and build a kernel. Once that is done you can run:
git clone --recursive https://github.com/trailofbits/ebpfault
cd ebpfault/libraries/ebpf-common/src
git fetch https://github.com/trailofbits/ebpf-common.git
git cherry-pick b20da60eec9c5bd051c5570950e72a34b3244c30 f75a900af068a2751c85c1332dade3167114400f
cd ../../..
nix-shell -E 'with import <nixpkgs> { }; llvmPackages_8.stdenv.mkDerivation { name = "ebpfault"; buildInputs = [ cmake llvmPackages_8.llvm zig ]; cmakeFlags = ["-DEBPFAULT_ENABLE_INSTALL:BOOL=true" "-DEBPF_COMMON_ENABLE_TESTS:BOOL=true" "-DEBPF_COMMON_ENABLE_SANITIZERS:BOOL=false" "-DCMAKE_BUILD_TYPE:STRING=RelWithDebInfo"]; }'
Then in the shell:
cmakeConfigurePhase
buildPhase
cat > efault.json <<EOF
{"fault_injectors": [
{
"syscall_name": "chdir",
"error_list": [
{
"exit_code": "-EFAULT",
"probability": 100
}
]
}
]
}
EOF
cat > einval.json <<EOF
{"fault_injectors": [
{
"syscall_name": "chdir",
"error_list": [
{
"exit_code": "-EINVAL",
"probability": 100
}
]
}
]
}
EOF
cat > chdir.zig <<EOF
const std = @import("std");
pub fn main() !void {
try std.os.chdir(".");
}
EOF
zig build-exe chdir.zig
sudo ./ebpfault --config efault.json --exec ./chdir
sudo ./ebpfault --config einval.json --exec ./chdir
This is only the easiest way to get rare kernel errors to show up. As Matthew Wilcox writes, there are 70+ filesystems and also hooks such as Linux Security Modules. These can all return their own error codes under the correct conditions. Simply do git grep -ho '\-E[A-Z]*' fs or git grep -ho '\-E[A-Z]*' security in a kernel git repo to see the wide variety of error codes - nearly all error codes are in use. Furthermore a new release of the kernel may return new error codes. Figuring out the actual set of error codes each system call may return is a "Sisyphean" task that is best left unattempted.
Expected Behavior
I expect Zig to allow writing robust programs, so crashing with a panic is definitely not what I expected. I should be able to see the exact error code such as EINVAL or EFAULT when it comes up in practice, in the stack trace - the message for Unexpected doesn't count because it only shows up in debug mode. I should be able to handle the error code in a way appropriate to the application, without modifying Zig's standard library.
Actual Behavior
On running sudo ./ebpfault --config efault.json --exec ./chdir:
timestamp: 4844456200127 syscall: chdir process_id: 11026 thread_id: 11026 injected_error: -EFAULT
r15 0000000000000000 r14 0000000000000000 r13 0000000000000000
r12 ffffa821812e7f58 rbp ffffa821812e7f48 rbx 0000000000000000
r11 0000000000000000 r10 0000000000000000 r9 0000000000000000
r8 0000000000000000 rax ffffffffb62d3040 rcx 0000000000000000
rdx ffffffffffffffff rsi 0000000000000050 rdi ffffa821812e7f58
orig_rax 0000000000000000 rip ffffffffb62d3041 cs 0000000000000010
eflags 0000000000000206 rsp ffffa821812e7f38 ss 0000000000000018
thread 11026 panic: reached unreachable code
/nix/store/xnvnk2dpcig6xcx55xd8bvwdznsd0h9h-zig-0.9.0/lib/zig/std/os.zig:2625:19: 0x2349ee in std.os.chdirZ (chdir)
.FAULT => unreachable,
^
/nix/store/xnvnk2dpcig6xcx55xd8bvwdznsd0h9h-zig-0.9.0/lib/zig/std/os.zig:2610:22: 0x234916 in std.os.chdir (chdir)
return chdirZ(&dir_path_c);
^
ebpfault/build/chdir.zig:4:21: 0x22d19a in main (chdir)
try std.os.chdir(".");
^
/nix/store/xnvnk2dpcig6xcx55xd8bvwdznsd0h9h-zig-0.9.0/lib/zig/std/start.zig:553:37: 0x225e4a in std.start.callMain (chdir)
const result = root.main() catch |err| {
^
/nix/store/xnvnk2dpcig6xcx55xd8bvwdznsd0h9h-zig-0.9.0/lib/zig/std/start.zig:495:12: 0x2071fe in std.start.callMainWithArgs (chdir)
return @call(.{ .modifier = .always_inline }, callMain, .{});
^
/nix/store/xnvnk2dpcig6xcx55xd8bvwdznsd0h9h-zig-0.9.0/lib/zig/std/start.zig:409:17: 0x206296 in std.start.posixCallMainAndExit (chdir)
std.os.exit(@call(.{ .modifier = .always_inline }, callMainWithArgs, .{ argc, argv, envp }));
^
/nix/store/xnvnk2dpcig6xcx55xd8bvwdznsd0h9h-zig-0.9.0/lib/zig/std/start.zig:322:5: 0x2060a2 in std.start._start (chdir)
@call(.{ .modifier = .never_inline }, posixCallMainAndExit, .{});
^
On running sudo ./ebpfault --config einval.json --exec ./chdir:
timestamp: 4918261533081 syscall: chdir process_id: 11058 thread_id: 11058 injected_error: -EINVAL
r15 0000000000000000 r14 0000000000000000 r13 0000000000000000
r12 ffffa8218127ff58 rbp ffffa8218127ff48 rbx 0000000000000000
r11 0000000000000000 r10 0000000000000000 r9 0000000000000000
r8 0000000000000000 rax ffffffffb62d3040 rcx 0000000000000000
rdx ffffffffffffffff rsi 0000000000000050 rdi ffffa8218127ff58
orig_rax 0000000000000000 rip ffffffffb62d3041 cs 0000000000000010
eflags 0000000000000206 rsp ffffa8218127ff38 ss 0000000000000018
unexpected errno: 22
/nix/store/xnvnk2dpcig6xcx55xd8bvwdznsd0h9h-zig-0.9.0/lib/zig/std/debug.zig:458:19: 0x208048 in std.debug.writeCurrentStackTrace (chdir)
while (it.next()) |return_address| {
^
/nix/store/xnvnk2dpcig6xcx55xd8bvwdznsd0h9h-zig-0.9.0/lib/zig/std/debug.zig:115:31: 0x206aca in std.debug.dumpCurrentStackTrace (chdir)
writeCurrentStackTrace(stderr, debug_info, detectTTYConfig(), start_addr) catch |err| {
^
/nix/store/xnvnk2dpcig6xcx55xd8bvwdznsd0h9h-zig-0.9.0/lib/zig/std/os.zig:4985:40: 0x209682 in std.os.unexpectedErrno (chdir)
std.debug.dumpCurrentStackTrace(null);
^
/nix/store/xnvnk2dpcig6xcx55xd8bvwdznsd0h9h-zig-0.9.0/lib/zig/std/os.zig:2632:45: 0x23499b in std.os.chdirZ (chdir)
else => |err| return unexpectedErrno(err),
^
/nix/store/xnvnk2dpcig6xcx55xd8bvwdznsd0h9h-zig-0.9.0/lib/zig/std/os.zig:2610:22: 0x234916 in std.os.chdir (chdir)
return chdirZ(&dir_path_c);
^
ebpfault/build/chdir.zig:4:21: 0x22d19a in main (chdir)
try std.os.chdir(".");
^
/nix/store/xnvnk2dpcig6xcx55xd8bvwdznsd0h9h-zig-0.9.0/lib/zig/std/start.zig:553:37: 0x225e4a in std.start.callMain (chdir)
const result = root.main() catch |err| {
^
/nix/store/xnvnk2dpcig6xcx55xd8bvwdznsd0h9h-zig-0.9.0/lib/zig/std/start.zig:495:12: 0x2071fe in std.start.callMainWithArgs (chdir)
return @call(.{ .modifier = .always_inline }, callMain, .{});
^
/nix/store/xnvnk2dpcig6xcx55xd8bvwdznsd0h9h-zig-0.9.0/lib/zig/std/start.zig:409:17: 0x206296 in std.start.posixCallMainAndExit (chdir)
std.os.exit(@call(.{ .modifier = .always_inline }, callMainWithArgs, .{ argc, argv, envp }));
^
/nix/store/xnvnk2dpcig6xcx55xd8bvwdznsd0h9h-zig-0.9.0/lib/zig/std/start.zig:322:5: 0x2060a2 in std.start._start (chdir)
@call(.{ .modifier = .never_inline }, posixCallMainAndExit, .{});
^
error: Unexpected
/nix/store/xnvnk2dpcig6xcx55xd8bvwdznsd0h9h-zig-0.9.0/lib/zig/std/os.zig:4987:5: 0x209691 in std.os.unexpectedErrno (chdir)
return error.Unexpected;
^
/nix/store/xnvnk2dpcig6xcx55xd8bvwdznsd0h9h-zig-0.9.0/lib/zig/std/os.zig:2632:23: 0x2349a8 in std.os.chdirZ (chdir)
else => |err| return unexpectedErrno(err),
^
/nix/store/xnvnk2dpcig6xcx55xd8bvwdznsd0h9h-zig-0.9.0/lib/zig/std/os.zig:2610:9: 0x23492c in std.os.chdir (chdir)
return chdirZ(&dir_path_c);
^
ebpfault/build/chdir.zig:4:5: 0x22d1b9 in main (chdir)
try std.os.chdir(".");
^
Can you attach the used Zig code as well? https://github.com/trailofbits/ebpfault does not have Zig code and chdir is probably based on some Zig code.
Not sure what you mean. I cooked up chdir.zig based on Hello World. Using Hello World directly and hooking the write call didn't work well because the error message writes were hooked too and the panics didn't output anything. I think there was an infinite loop, panic during a panic during a panic etc. It's easier to see what's going on using a different syscall.
I think there was an infinite loop, panic during a panic during a panic etc
Can you run the program with strace and show the output behind <details> ``` strace output ``` </details> ?
strace should be able to capture at least the initial signal and might give a hint.
It's a bit tricky to strace simultaneously with ebpfault - you have to start the process and then attach to it from both of them. But it's possible. Here is the panic-during-panic behavior with strace:
$ cat hello.zig
const std = @import("std");
pub fn main() !void {
const stdin = std.io.getStdIn();
var line_buf: [20]u8 = undefined;
_ = try stdin.read(&line_buf);
const stdout = std.io.getStdOut().writer();
try stdout.print("Hello, {s}!\n", .{"world"});
}
$ zig build-exe hello.zig
$ ./hello &
[1] 83419
[1]+ Stopped ./hello
$ sudo ./ebpfault --config write.json -p 83419 &
[2] 83427
Generating fault injectors...
> write
Error list:
- 100% => -EINVAL
$ strace -p 83419 &
[3] 83443
strace: Process 83419 attached
--- stopped by SIGTTIN ---
$ fg 1
./hello
--- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=82295, si_uid=1000} ---
read(0, 1
"1\n", 20) = 2
write(1, "Hello, ", 7timestamp: 288916424653781 syscall: write process_id: 83419 thread_id: 83419 injected_error: -EINVAL
r15 0000000000000000 r14 0000000000000000 r13 0000000000000000
r12 ffffbf2fc294bf58 rbp ffffbf2fc294bf48 rbx 0000000000000000
r11 0000000000000000 r10 0000000000000000 r9 0000000000000000
r8 0000000000000001 rax ffffffffa98d8b70 rcx 0000000000000000
rdx ffffffffffffffff rsi ffffffffaa695bd9 rdi ffffbf2fc294bf58
orig_rax 0000000000000000 rip ffffffffa98d8b71 cs 0000000000000010
eflags 0000000000000202 rsp ffffbf2fc294bf38 ss 0000000000000018
) = -1 EINVAL (Invalid argument)
rt_sigaction(SIGSEGV, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x205900}, NULL, 8) = 0
rt_sigaction(SIGILL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x205900}, NULL, 8) = 0
rt_sigaction(SIGBUS, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x205900}, NULL, 8) = 0
gettid() = 83419
write(2, "thread ", 7) = -1 EINVAL (Invalid argument)
timestamp: 288916425190934 syscall: write process_id: 83419 thread_id: 83419 injected_error: -EINVAL
r15 0000000000000000 r14 0000000000000000 r13 0000000000000000
rt_sigaction(SIGSEGV, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x205900}, r12 ffffbf2fc294bf58 rbp ffffbf2fc294bf48 rbx 0000000000000000
r11 0000000000000003 r10 0000000000000000 r9 0000000000000000
r8 0000000000000001 rax ffffffffa98d8b70 rcx 0000000000000000
rdx ffffffffffffffff rsi ffffffffaa695bd9 rdi ffffbf2fc294bf58
orig_rax 0000000000000000 rip ffffffffa98d8b71 cs 0000000000000010
eflags 0000000000000202 rsp ffffbf2fc294bf38 ss 0000000000000018
NULL, 8) = 0
rt_sigaction(SIGILL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x205900}, NULL, 8) = 0
rt_sigaction(SIGBUS, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x205900}, NULL, 8) = 0
write(2, "Panicked during a panic. Abortin"..., 35) = -1 EINVAL (Invalid argument)
timestamp: 288916425740521 syscall: write process_id: 83419 thread_id: 83419 injected_error: -EINVAL
r15 0000000000000000 r14 0000000000000000 r13 0000000000000000
r12 ffffbf2fc294bf58 rbp ffffbf2fc294bf48 rbx 0000000000000000
rt_sigaction(SIGSEGV, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x205900}, r11 0000000000000000 r10 0000000000000000 r9 0000000000000000
r8 0000000000000001 rax ffffffffa98d8b70 rcx 0000000000000000
rdx ffffffffffffffff rsi ffffffffaa695bd9 rdi ffffbf2fc294bf58
orig_rax 0000000000000000 rip ffffffffa98d8b71 cs 0000000000000010
eflags 0000000000000202 rsp ffffbf2fc294bf38 ss 0000000000000018
NULL, 8) = 0
rt_sigaction(SIGILL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x205900}, NULL, 8) = 0
rt_sigaction(SIGBUS, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x205900}, NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, ~[HUP INT RT_32], [], 8) = 0
gettid() = 83419
tkill(83419, SIGABRT) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
--- SIGABRT {si_signo=SIGABRT, si_code=SI_TKILL, si_pid=83419, si_uid=1000} ---
+++ killed by SIGABRT (core dumped) +++
Aborted (core dumped)
[3]+ Done strace -p 83419
All processes have been terminated
Exiting...
[2]+ Done sudo ./ebpfault --config write.json -p 83419
So here with strace the behavior is consistent with the chdir one. But when you try to run hello directly under ebpfault with sudo ./ebpfault --config write.json --exec ./hello (without strace) it just hangs. Not sure why. This is the infinite loop I referred to (although it doesn't give 100% CPU usage or anything like that).
#11178 is an example of how this happens "in the wild" as opposed to with fault injection.
it just hangs. Not sure why.
I had a similar experience with stage1, when it was unable to come up with a stack trace and panicked in between. Though this should show up in strace log somewhere with a reoccurring pattern, if the log works as expected.
I dont understand how the tool handles file stream redirects though (especially the standard streams stdin,stdout and stderr).
https://github.com/ziglang/zig/issues/11178 is an example of how this happens "in the wild" as opposed to with fault injection.
Maybe call it use case? Ideally this would be fixed upstream by the Kernel having a proper machine readable interface.
Ideally this would be fixed upstream by the Kernel having a proper machine readable interface.
The interface is in fact really simple: any syscall can return any error code. This can be proven by modifying my steps-to-reproduce to the syscall and error code of your choice. Even if you ignore this case, Matthew Wilcox's message confirms that the kernel reserves the right to use whatever error codes they want, and they will never maintain such an interface. The only way to produce a more specific interface without playing error code whack-a-mole would be an automated analysis of the kernel code. But such an analysis would still give 50+ error codes for each syscall, so it is arguable if it would help.
Most calls look something like this:
switch (system.getErrno(res)) {
.SUCCESS => return,
.FAULT => unreachable,
.INVAL => unreachable,
.ACCES => return error.AccessDenied,
.NOMEM => return error.SystemResources,
.NOTDIR => return error.FileNotFound,
.PERM => return error.AccessDenied,
else => |err| return unexpectedErrno(err),
}
The only two codes that are handled with unreachable are:
- Invalid pointer passed to the kernel
- Syscall API used incorrectly
Unexpected error codes are handled by returning error.Unexpected, so that applications can handle even unexpected errors. Note that even in release builds, error.Unexpected is returned for unexpected error codes. The message in debug builds is only a hint that we should add a new switch case item to the standard library. And yes I am aware that for some syscalls, EINVAL is used to indicate a problem executing the syscall rather than incorrect API usage - we evaluate this on a case-by-case basis.
You could patch the Linux kernel to do anything, such as returning EINVAL for exit. However this would understandably cause undefined behavior in applications.
So, yes, the Linux kernel is assumed by the standard library to have certain kinds of behavior. If you want to reach into a lower level of abstraction, you can use std.os.linux directly and handle the usize that the kernel returns.
I expect Zig to allow writing robust programs, so crashing with a panic is definitely not what I expected. I should be able to see the exact error code such as EINVAL or EFAULT when it comes up in practice, in the stack trace - the message for Unexpected doesn't count because it only shows up in debug mode. I should be able to handle the error code in a way appropriate to the application, without modifying Zig's standard library.
Generally, zig does allow writing robust programs, and robust programs should expect the kernel to never return EFAULT from chdir. Crashing in this case could be argued to be more robust, in fact, since proceeding with an unpredictable kernel may have unintended consequences. Or presumably, we have discovered that our pointer is dangling, which is dangerous and panicking is the right move here.
If you have issues with how a specific syscall is wrapped, then you are welcome to file a bug report for that function and we can discuss it.
Unexpected error codes are handled by returning error.Unexpected, so that applications can handle even unexpected errors.
With this method there is no way to get the original error code. Meanwhile C's errno and _doserrno provide a cross-platform way to get the full error code.
You could patch the Linux kernel to do anything, such as returning EINVAL for exit. However this would understandably cause undefined behavior in applications.
One practical case this would happen is where a new kernel is released but a stable branch of Zig is in use. The current std.os design does not allow handling a new error code in the application, so the standard library would have to be patched to add the error code. This ad-hoc patching is fragile and tedious.
I'd like to be able to handle unexpected error codes specifically in the application. The behavior isn't undefined if you're expecting it to happen. But you can't handle the errors with std.os, because the error code is not accessible.
If you want to reach into a lower level of abstraction, you can use std.os.linux directly and handle the usize that the kernel returns.
It seems that given the current design using std.os.linux is the only choice. But using std.os.linux you lose all the advantages of std.os (slices, cross-platform, retrying on EINTR, etc.), so I don't think it's really an option. Rather than a lower level of abstraction, I would prefer a high-level abstraction that provides decent error codes.
If you have issues with how a specific syscall is wrapped, then you are welcome to file a bug report for that function and we can discuss it.
The reason I filed this issue was to get ahead of the game and avoid the next dozen issues about error codes. Also because enumerating the codes returned is a "Sisyphean task". If you really want some cases Zig is missing, how about EBADF, EDQUOT, ENXIO, EOPNOTSUPP, EROFS, ETXTBSY, EWOULDBLOCK in openZ() and EINTR in fsync().
One practical case this would happen is where a new kernel is released but a stable branch of Zig is in use. The current std.os design does not allow handling a new error code in the application, so the standard library would have to be patched to add the error code. This ad-hoc patching is fragile and tedious.
Fair point. I will re-open this as a proposal.
Boiling it down, I think all you need is something like this:
threadlocal var lastErrNo: E = SUCCESS;
pub fn errno(r: usize) E {
const err = system.getErrno(r);
lastErrNo = err; // store errno during conversion to Error
return err;
}
Then you can do:
os.openZ("/dev/null", os.O.RDWR, 0) catch |err| switch (err) {
error.Unexpected => switch(std.os.lastErrNo) {
.EBADF => ...
}
}
lastErrNo could even be OS-specific so you would have to write std.os.linux.lastErrNo.
It doesn't cover unreachable, but as you say I think the kernel can be assumed to have some amount of sanity.
Zig std lib will not be repeating the crime against humanity that is thread-local errno.
Well, with #2647, the full error code can be added to the Unexpected error:
pub const UnexpectedError = error{
/// The Operating System returned an error code not handled by Zig's standard library.
Unexpected: usize,
};
...
os.openZ("/dev/null", os.O.RDWR, 0) catch |err| switch (err) {
error.Unexpected => |errCode| switch(errCode) {
.EBADF => ...
}
}
You said that you preferred passing the data as an out parameter. I guess that would work too:
var errCode: usize = 0;
os.openZ("/dev/null", os.O.RDWR, 0, &errCode) catch |err| switch (err) {
error.Unexpected => switch(errCode) {
.EBADF => ...
}
}
But it would mean every function in std.os would have to take an errCode parameter.
Generally, zig does allow writing robust programs, and robust programs should expect the kernel to never return EFAULT from chdir. Crashing in this case could be argued to be more robust, in fact, since proceeding with an unpredictable kernel may have unintended consequences. Or presumably, we have discovered that our pointer is dangling, which is dangerous and panicking is the right move here.
I wanted a note on this issue that, even in the case that the only thing to do is crash, unreachable is not a reliable way to do that. It only panics under certain build conditions, so having it as an errno prong will just corrupt the program if it's hit during Fast/Small builds.
Ideally, std.posix doesn't induce crashes. It seems to me that an errno result bad enough to crash is about to get the process, if not the kernel, killed; no need to rush things. But if a crash is the most correct thing, it should be induced with @panic(), which will do so reliably, rather than a => unreachable, which will not.
There are a couple of issues open that track related things (e.g. https://github.com/ziglang/zig/issues/6925 and https://github.com/ziglang/zig/issues/6389).
I'm not sure whether to open a new issue for specific examples, or add a comment to one of these. A blocking accept hits unreachable if the listening socket is closed from a signal handler. In this case, accept(2) returns EBADF, which Zig handles as unreachable.
This code reproduces the issue in 0.14.0-dev.1694+3b465ebec.
const std = @import("std");
const posix = std.posix;
var listener: ?posix.fd_t = null;
pub fn main() !void {
listener = try posix.socket(posix.AF.INET, posix.SOCK.STREAM, posix.IPPROTO.TCP);
try posix.setsockopt(listener.?, posix.SOL.SOCKET, posix.SO.REUSEADDR, &std.mem.toBytes(@as(c_int, 1)));
try posix.listen(listener.?, 128);
std.posix.sigaction(std.posix.SIG.INT, &.{
.handler = .{ .handler = shutdown },
.mask = std.posix.empty_sigset,
.flags = 0,
}, null);
_ = try posix.accept(listener.?, null, null, 0);
}
fn shutdown(_: c_int) callconv(.C) void {
if (listener) |l| {
posix.close(l);
}
}