zig
zig copied to clipboard
`std.posix` defines non-posix symbols
I think there are symbols defined in std.posix that are not actually defined (as such) by POSIX. I used the standards documents available at [1] for this.
The definition that started this was
https://github.com/ziglang/zig/blob/854e86c5676de82bc46b5c13a0c9c807596e438d/lib/std/posix.zig#L134
Should this be renamed to std.posix.in_port_t? I couldn't find any mention of port_t in the POSIX standard. The closest is in_port_t which should be defined in the arpa/inet.h and netinet/in.h headers. port_t itself seems Solaris specific, but not POSIX.
In a similar vein there are https://github.com/ziglang/zig/blob/854e86c5676de82bc46b5c13a0c9c807596e438d/lib/std/posix.zig#L135-L136
Both of which again seem to be exist somehow in Solaris, however they do not appear to be defined in POSIX.
There might be others too.
I dunno if this was done on purpose or something, but I guess these (and other similar ones) should be renamed/removed from std.posix? The straight-forward solution is a breaking change though, obviously.
[1] https://pubs.opengroup.org/onlinepubs/9699919799/download/index.html
Oh and since this is mentioned in the contributor guidelines to be relevant: I came across this while working on a Zig project https://github.com/arminfriedl/unclog. When using port_t type the build fails on Linux with glibc dynamically linked [1]. Which ofc could very well be because glibc isn't posix compliant here, but in this case I believe it might not be glibc's fault.
I'm also happy to try to contribute with code. I'd just need some help/pointers if and how to proceed.
[1] I figure the same happens with musl and everything other than Solaris too though
would you feel better if it was named std.unix instead?
I personally would much prefer std.unix, this has come up many times on the IRC channel since the move from std.os to std.posix.
Writing std.posix.epoll for example just feels fundamentally wrong to me :D
would you feel better if it was named std.unix instead?
Not really. (on a side note, I don't think this is about just feelings 😉).
For the port_* parts specifically, I think it would be best to keep it in solaris.zig only. It seems to be part of the Event Completion Framework. It is neither POSIX nor Unix imho and given the adoption of Solaris I figure it can't be considered unix standard "in practice".
Added a PR with a suggestion. This only addresses the port/Event Completion Framework part of Solaris. I guess there are other similar issues (e.g. same arguments and approach probably applies to kqueue).
port_t is used by solaris, illumos, and macos. So it's exposed in the unixy/posixy std lib API layer, where it can be shared by those three operating systems.
I don't really understand what this issue is trying to accomplish. What is the problem statement?
Edit: I think I see a path to resolution. Please see the pattern inside std.c that is used for extern functions. If these APIs are shared with other operating systems, they can use the pattern of switching on native_os. Otherwise they can use the pattern near the bottom of the file to import directly from solaris.zig, and indeed have the definitions live inside that file to indicate where they are from.
I don't really understand what this issue is trying to accomplish. What is the problem statement?
Originally, it was just a symbol in posix.zig which didn't seem to be part of POSIX. That seemed odd/wrong to me. Elevated by the fact that port_t is there but in_port_t (which is actually POSIX) isn't. So I figured something is wrong - asked on Discord, they also seemed to agree, hence the issue :)
Then it came to the unix layer discussion and maybe I'm just misunderstanding the purpose of these layers. So my understanding of these common layers (posix-y, unix-y,..) was that by using any declarations/definitions there I would be able to compile against any reasonably posix-y/unix-y OS without changes or special-casing in my code. I.e. as long as I stick to only these layers I don't need to bother with OS-specifics too much - same code compiles and runs everywhere basically the same.
So the problem then is: For port_t this seems not the case. It seems very specific to Solaris (and solaris derivates, you're right):
- Solaris/illumos - granted, the same, both are solaris-y
- MacOS [don't own one] - I think
port_thas quite different semantics there - Linux/BSDs - I don't think a
port_texists there (yet, given how generic that name is I could see that at some point they'll have a MacOS fate - exists but with different semantics)
Maybe I just saw these layers as an intersection of what is common across all the posix-y/unix-y OS, whereas they are more meant as a combination of what that exists in any of the posix-y/unix-y OS - shared (semantically or lexically) to varying degrees.
I think I see a path to resolution
Yes if I understand this correctly that seems to be a good way to me, making things more explicit :+1:
It sounds like you understand what the std lib is going for now. Admittedly the naming of std.posix is not great. I think it was also not great as std.os. You can see why we are considering std.unix now.
Naming aside, I do think the abstraction layer makes sense in a conceptual sense. It's well defined what belongs in there and what does not.
I know this is borderline-bikeshed, but perhaps std.platform? There are 90 occurrences of .windows in the latest posix.zig, WASI and Haiku aren't unix-y either.
std.unix answers one set of questions raised by calling it POSIX, but then raises another set of its own. It's a cross-platform-ish library for platform-specific functionality, std.platform is a decent gloss on that.
WASI and Haiku aren't unix-y either.
WASI seems to be partially Unix-y, though obviously nowhere near full POSIX. Haiku seems to be POSIX-compatible, as far as I can find?
If we want to keep std.posix at all, then IMO the right move is:
- Rename to
std.unixas Andrew suggested. - Remove all traces of Windows from it. Windows is not Unix-y and every historical attempt to pretend otherwise has ultimately failed in some way. Let's not add another entry to that list.
- WASI seems to have some Unix-y functionality. Keep the actual Unix-y stuff, but no emulation for the missing parts. (This extends to any other Unix-ish platform.)
What would be the point of regressing the functionality of the standard library that way?
I get that it accomplishes punishing users who want to target Windows, does it accomplish anything else?
Status quo already punishes users who want to target Windows. See #6600. std.posix is not a good Windows wrapper.
The right way to do this is to have well-designed abstractions in the standard library, probably with the ability to drop down to platform-specific details when necessary (as in e.g. std.fs). Making users interact with Windows through a POSIX layer is just not good. Like I said, history is littered with failed attempts at this, and not a single successful one.
Oh, thanks for linking #6600.
After the discussion here and how I came to understand things now, at least from my perspective, we can close this issue in favor of #6600. It seems to tackle the underlying issue at a more proper and more general level.
Please feel free to re-open if you think otherwise. Just trying to clean up my own mud ;)