Odin
Odin copied to clipboard
Provide error bool for FreeBSD syscall return value
I'm working on porting core:net to FreeBSD, and I ran into this predicament where syscall was returning only positive numbers, which isn't too helpful for detecting errors. I know very little about how LLVM works; feedback's appreciated. I used the rcx register to do the negation since it's already clobbered.
You don't need to save/restore RFLAGS, because the assembly snippet already specifies that it's clobbered, and "Odin's syscall intrinsic will always return -errno on failure" makes preserving CF non-important. Otherwise this looks fine, apologies for being lazy when I dealt with this in the past.
Hi @Feoramund, I have been working on a NetBSD port of Odin. I'm currently missing core:net just like the other BSD's.
Let me know if you would like to get some help testing or collaborating on that. I imagine the code would look and work very similar on NetBSD.
You don't need to save/restore RFLAGS, because the assembly snippet already specifies that it's clobbered, and "Odin's syscall intrinsic will always return -errno on failure" makes preserving CF non-important. Otherwise this looks fine, apologies for being lazy when I dealt with this in the past.
I found a way to negate the return value without needing to save/reload CF. The not and inc instructions don't touch CF, so I used those to construct the negative value.
Hi @Feoramund, I have been working on a NetBSD port of Odin. I'm currently missing
core:netjust like the other BSD's. Let me know if you would like to get some help testing or collaborating on that. I imagine the code would look and work very similar on NetBSD.
I have an echo server working off of syscalls right now, so hopefully won't be much longer before I can plug it all into an OS-specific file for core:net and put up a PR for testing. Any extra help poking around would be greatly appreciated.
I've hit a snag. While implementing fcntl, I found this:
F_GETOWN Get the process ID or process group currently receiving
SIGIO and SIGURG signals; process groups are returned
as negative values (arg is ignored).
[...] FCNTL(2)
In addition, if fd refers to a descriptor open on a terminal device (as
opposed to a descriptor open on a socket), a cmd of F_SETOWN can fail for
the same reasons as in tcsetpgrp(3), and a cmd of F_GETOWN for the
reasons as stated in tcgetpgrp(3).
[...] TCGETPGRP(3)
ERRORS
If an error occurs, tcgetpgrp() returns -1 and the global variable errno
is set to indicate the error, as follows:
[EBADF] The fd argument is not a valid file descriptor.
[ENOTTY] The calling process does not have a controlling
terminal or the underlying terminal device represented
by fd is not the controlling terminal.
Based on this, it's possible for fcntl to return a negative non-error value. I had concerns about this being possible, but I wasn't sure if or where it would show up.
In this one instance, if that value never overlaps with any of the possible error values, then it's not a big concern, however I think it'd be better to solve this more cleanly. I may need to re-think this. I'm wondering if a separate FreeBSD-specific implementation of syscall with a different signature would be for the best. I.e. syscall :: proc(id: uintptr, args: ..uintptr) -> (uintptr, bool) ---
I'll have to figure out how to do that.
EDIT: While reading through the source code of the compiler, I had an idea. Perhaps it might be more useful to implement an intrinsic that returns the state of the flags. I was able to write one pretty quickly.
I figured out how to modify syscall to have an alternative signature and return a boolean for the error status on FreeBSD, both for amd64 and arm64 which are considered the FreeBSD tier 1 platforms.
I figured out how to modify
syscallto have an alternative signature and return a boolean for the error status on FreeBSD
Does this make sense to adopt on other platforms as well? I always assumed it never overlapped with error values. Should this not be an issue on all Unix systems?
I figured out how to modify
syscallto have an alternative signature and return a boolean for the error status on FreeBSDDoes this make sense to adopt on other platforms as well? I always assumed it never overlapped with error values. Should this not be an issue on all Unix systems?
There is no way to distinguish certain cases of overlap on Linux (depending on architecture), so the workarounds are "Linux-isms at the syscall interface layer" (F_GETOWN_EX). Some architectures do also pass a boolean back (sparc, ppc64), but nothing that Odin currently supports has this extension.
Per the documentation:
A limitation of the Linux system call conventions on some architectures (notably i386) means that if a (negative) process group ID to be re‐ turned by F_GETOWN falls in the range -1 to -4095, then the return value is wrongly interpreted by glibc as an error in the system call; that is, the return value of fcntl() will be -1, and errno will contain the (pos‐ itive) process group ID. The Linux-specific F_GETOWN_EX operation avoids this problem. Since glibc 2.11, glibc makes the kernel F_GETOWN problem invisible by implementing F_GETOWN using F_GETOWN_EX.
One full day of research and experimentation later, I think I finally got this sorted out.
The solution is now to have a separate syscall_bsd intrinsic, which I have tested does work on NetBSD as well as FreeBSD. (Not tested on OpenBSD.)
I wiped this branch clean with a force-pushed set of commits from a new branch, since the solution had gone through so many different iterations, I didn't think the extra commits were necessary to have in the history.
To note some important changes:
- @Yawning I have removed the
+{-style clobber mechanism that was being used for FreeBSD. I couldn't find the+symbol neither in the documentation nor in LLVM header files. I have switched the syscall to always clobberR8-R10, with my understanding being that there's no way to tell if LLVM will cache a value in one of those registers there, whether or not we have used them as arguments for the syscall being irrelevant. - @andreas-jonsson I have updated your native NetBSD futex implementation to use the new
syscall_bsd. I want to note that I saw you were casting the result to anintand checking against-1. Did this actually work? I was a bit confused, because when I tried it with the new syscall - which doesn't do any sort of casting or signing - returned a regular positive error number into auintptr. I would have expected a very large number if NetBSD returns negative numbers on error. If you could take a look at the updated code, I would appreciate it. RDXnow has a clobber constraint. There's a note in the FreeBSD AMD64 assembly that says it's "return value 2" and I witnessed it being clobbered withsysctlon NetBSD.- ARM64 now has a
memoryandccclobber constraint. I found thememoryconstraint was necessary to keep LLVM from optimizing away addresses that were used in a call tosysctl.x1is also clobbered now, which I witnessed withsysctlon FreeBSD. (Untested on either OpenBSD or NetBSD ARM64.) I'm uncertain ifmemoryconstraints are necessary for other non-Linux ARM64 platforms, such as Darwin. What I've read suggests that Linux does not clobber as many registers as the BSDs.
All in all, the optimization-clobbering situation was the most pernicious issue with getting this right. Initially, everything was fine, until I did a speed build. I'm still uncertain if bugs will arise in the future due to a register not being clobbered, so I left a verbose note about my process in the comments. Documentation is difficult to find on this subject. If anyone has more expertise in this area, may we all benefit.
We'll need more tests that use native calls under an optimized build to be able to keep an eye on this sort of thing. It's a tough thing to catch if you don't know to look for it, which gives me some concern about people possibly encountering this issue in the future and not knowing the source. I've documented as much as I could on the matter, short of my exact debugger commands.
I used the following program to test the implementation:
package systest
import "base:intrinsics"
import "core:fmt"
import "core:c"
import "core:sys/info"
import "core:sync"
main :: proc() {
// AMD64 FreeBSD 14.0
// ARM64 FreeBSD 14.0
// AMD64 NetBSD 10.0
SYSCALL_FORK :: 2
SYSCALL_READ :: 3
SYSCALL_SYSCTL :: 202
CTL_KERN :: 1
KERN_HOSTNAME :: 10
KERN_VERSION :: 4
size_t :: u64 // confirmed unsigned on FreeBSD, unsure on NetBSD
fmt.println("Trying to get KERN.HOSTNAME via SYSCTL syscall...")
hostname_mib := [?]c.int { CTL_KERN, KERN_HOSTNAME }
ctlbuf: [256]byte = ---
length := cast(size_t)len(ctlbuf)
// intrinsics.syscall_bsd(SYSCALL_READ, 0xdeadbeef, 0xdeadbeef, 0xdeadbeef, 0xdeadbeef, 0xdeadbeef, 0xdeadbeef)
result, ok := intrinsics.syscall_bsd(
SYSCALL_SYSCTL,
cast(uintptr)&hostname_mib[0],
cast(uintptr)len(hostname_mib),
cast(uintptr)&ctlbuf[0], // oldp
cast(uintptr)&length, // oldlenp
0, // newp
0, // newlen
)
hostname := transmute(string)ctlbuf[:min(len(ctlbuf), cast(int)length)]
fmt.printfln("result, ok: %v, %v", result, ok)
fmt.println("hostname:", hostname)
fmt.printfln("\n%#v", info.os_version) // initialized native on FreeBSD, libc on NetBSD
// NetBSD __futex test
when ODIN_OS == .NetBSD {
SYS___futex : uintptr : 166
FUTEX_PRIVATE_FLAG :: 128
FUTEX_WAIT_PRIVATE :: 0 | FUTEX_PRIVATE_FLAG
futex := new(int)
expected := 1
error, ok2 := intrinsics.syscall_bsd(SYS___futex, uintptr(futex), FUTEX_WAIT_PRIVATE, uintptr(expected), 0, 0, 0)
fmt.println(error, ok2)
}
}
I have removed the +{-style clobber mechanism that was being used for FreeBSD. I couldn't find the + symbol neither in the documentation nor in LLVM header files. I have switched the syscall to always clobber R8-R10, with my understanding being that there's no way to tell if LLVM will cache a value in one of those registers there, whether or not we have used them as arguments for the syscall being irrelevant.
Sounds good to me. The FreeBSD stuff was entirely dry-coded, because I happened to look into system call semantics because I ran into interesting crashes the moment I started doing optimized Linux builds a looooooong time ago.
All in all, the optimization-clobbering situation was the most pernicious issue with getting this right. Initially, everything was fine, until I did a speed build.
And now we've come full-circle.
- @andreas-jonsson I have updated your native NetBSD futex implementation to use the new
syscall_bsd. I want to note that I saw you were casting the result to anintand checking against-1. Did this actually work? I was a bit confused, because when I tried it with the new syscall - which doesn't do any sort of casting or signing - returned a regular positive error number into auintptr. I would have expected a very large number if NetBSD returns negative numbers on error. If you could take a look at the updated code, I would appreciate it.
I have to check. I think I just copied the Linux code. Since it was suppose to be API compatible.
It probably is API compatible but not necessarily ABI compatible, if I had to take a guess based on everything I've seen so far.