network
network copied to clipboard
Incorrect handling of `address` argument in recvfrom()
Context:
recvfrom() stores the source address of a received package in the sockaddr structure pointed to by the address pointer and the length of the address is stored in the memory pointed to by the address_len argument.
Problem description:
However, not all protocol provide source addresses, e.g. AF_UNIX does not.
In this case, the contents of the address parameter is "unspecified" [1] and should not be inspected. I have not found it in the specification, but in practice I have observed that at least on Linux address_len is set to 0
recvBufFrom does not check address_len at all and instead tries to parse the contents of address directly [2]. This leads to an error because it interprets the zeroed memory as AF_UNSPEC which is not supported. The error is ignored and getPeerName is called, which also fails.
The solution is to peek ptr_len and check if it is 0, and if so to return an "unset" address. I'm not sure how best to represent such an address, perhaps SockAddrUnix [] would do the trick?
Compare also how Rust handles the situation [3]
I have included a minimal code example that reproduced the problem [4]
- [1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/recvfrom.html
- [2] https://github.com/haskell/network/blob/26e9d3cd57587e3f102d7bf722c18edc3e7172ae/Network/Socket/Buffer.hsc#L114-L126
- [3] https://doc.rust-lang.org/src/std/os/unix/net/addr.rs.html#109-113
- [4] https://gist.github.com/Philonous/7fd9d6620e631325f6bd2fffaba0cc2e
@Philonous As you think, SockAddrUnix "" would be the best workaround.
Would you kindly send a PR?
@Philonous Gentle ping.
I'm working on it. This isn't as obvious as I first thought.
- Since 3.0
Socketdoesn't container the address family any more, so I can't check it to return the correct address type whenaddrlenis 0. - I'm thinking that just returning
SockAddrUnix ""wheneveraddrlenis 0 regardless of the socket type might work, it's at least not worse than the situation before.
However, there's another problem: https://github.com/haskell/network/blob/09f30fa0ecc3332f28d006a72173cf6d422edffb/Network/Socket/Types.hsc#L1192 is incorrect:
- POSIX doesn't specify that
sun_pathis null-terminated (though at least on Linux and *BSD apparently it is) - On Linux, the terminating null byte can sometimes be truncated [1] even for correctly sized buffers (
sizeof(sockaddr_un))- This doesn't affect the existing code because it allocates and zeroes
sizeof(sockaddr_storage) = 128 bytes[2][3], so there's always a null byte aftersun_path. ButSocketAddress' is publicly exported and can be implemented for a new type, e.g. one that passes a buffer of sizesizeof(sockaddr_un) bytes, which should work.
- This doesn't affect the existing code because it allocates and zeroes
- Abstract paths can contain null characters, so the only way to handle them properly is to know the length of
sun_path- This is handled correctly in other places[4], so it's surprising that
peekSocketAddresscompletely fails
- This is handled correctly in other places[4], so it's surprising that
As far as I can see, the interface peekSocketAddress :: Ptr sa -> IO sa can't work, because it needs addrlen to correctly decode sockaddr_un. It might also be a good idea to handle the case where addlen = 0 there.
- [1] https://man7.org/linux/man-pages/man7/unix.7.html see section BUGS:
However, there is one case where confusing behavior can result: if 108 non-null bytes are supplied when a socket is bound, then the addition of the null terminator takes the length of the pathname beyond sizeof(sun_path). Consequently, when retrieving the socket address (for example, via accept(2)), if the input addrlen argument for the retrieving call is specified as sizeof(struct sockaddr_un), then the returned address structure won't have a null terminator in sun_path.
- [2] https://github.com/haskell/network/blob/09f30fa0ecc3332f28d006a72173cf6d422edffb/Network/Socket/Types.hsc#L1021-L1023
- [3] https://github.com/haskell/network/blob/09f30fa0ecc3332f28d006a72173cf6d422edffb/Network/Socket/Types.hsc#L1010
- [4] https://github.com/haskell/network/blob/09f30fa0ecc3332f28d006a72173cf6d422edffb/Network/Socket/Types.hsc#L1114-L1117
Since 3.0 Socket doesn't container the address family any more, so I can't check it to return the correct address type when addrlen is 0.
Why don't you use getSocketName to get its SocketAddress type?
As far as I can see, the interface peekSocketAddress :: Ptr sa -> IO sa can't work, because it needs addrlen to correctly decode sockaddr_un. It might also be a good idea to handle the case where addlen = 0 there.
Can't we use sizeOfSocketAddress or sizeOfSockAddr to get addrlen?
Why don't you use getSocketName to get its SocketAddress type?
- AFAIK Windows doesn't allow
getsocketname()on unbound sockets. [1] Windows doesn't haveAF_UNIXeither, so the code could be conditionally included. - POSIX explicitly mentions that
getsocketname()on unbound sockets returns unspecified data [2]. Although on Linux it seems to at least return the correct address family.
Can't we use sizeOfSocketAddress or sizeOfSockAddr to get addrlen?
We already have addrlen, recvfrom() etc. are passing it back to us. The problem is that we need it inside peekSocketAddress to correctly decode sockaddr_un.
sizeOfSocketAddress needs an already decoded address, so we can't use it inside peekSocketAddress.
- [1] https://learn.microsoft.com/en-us/windows/win32/api/winsock/nf-winsock-getsockname
WSAEINVAL | The socket has not been bound to an address with bind, or ADDR_ANY is specified in bind but connection has not yet occurred.
- [2] https://pubs.opengroup.org/onlinepubs/9699919799/
If the socket has not been bound to a local name, the value stored in the object pointed to by address is unspecified.
Let's introduce a breaking change:
peekSocketAddress :: Ptr sa -> Int -> IO sa
And let's release it with a new major version.
I've been working on this, but I've been stuck on
https://github.com/haskell/network/blob/26e9d3cd57587e3f102d7bf722c18edc3e7172ae/Network/Socket/Buffer.hsc#L125
I think the idea here is that for connected sockets, recvfrom() and recvmsg() don't return the remote address, so the getpeername() is there for convenience.
This assumes that recvfrom() will only not return an address when the socket is connected, which is false for AF_UNIX (this is what this ticket is about in the first place). In general, this would be false whenever we receive a packet from an unnamed socket. So I've been trying to figure out if this can happen besides with AF_UNIX. It could in theory happen with socketpair(), but none of the platforms I checked (Linux, FreeBSD, OpenBSD, NetBSD, Solaris) support socketpair() with anything but AF_UNIX, windows doesn't have socketpair().
Next I tried to figure out if getsockname() is guaranteed to return sensible data even for unbound sockets at least on *nix, since POSIX specifically states that it is "unspecified". On Windows it seems to be forbidden anyway, so it would have to be #ifdef-ed out. Linux just returns an empty address, which is fine, FreeBSD returns the correct address family and a bunch of garbage, but it looked like it should at least be safe to read, but I'm not entirely certain. I did not check on the other platforms for lack of time and easy access
I think this is brittle. A slightly better idea (in my opinion) might be to store the address family in the Socket value like before 3.0.
But I think the morally correct approach would be to change recvFrom's etc type signature from
recvFrom :: SocketAddress sa => Socket -> Int -> IO (ByteString, sa)
to
recvFrom :: SocketAddress sa => Socket -> Int -> IO (ByteString, Maybe sa)
and remove the call to getpeername().
It would remove the possibility of failure and more closely resemble the the underlying API. Unfortunately, it would be a breaking change and presumably affect a lot of existing code.
I would suggest to keep recvFrom as is and provide a new safeRecvFrom which satisfies you.
We started a project for version 3.2. Breaking changes are welcome now.