nix
nix copied to clipboard
`recvfrom`, `recvmsg` and `recvmmsg` might return uninitialized addresses when using connection-mode sockets
When using connection-mode sockets like TCP, the addresses returned by recvfrom, recvmsg and recvmmsg are not initialized by the syscall.
This is how the receive operations create an IPV4-Socketaddr, with a comment which operation is unsound here:
impl SockaddrLike for SockaddrIn {
unsafe fn from_raw(
addr: *const libc::sockaddr,
len: Option<libc::socklen_t>,
) -> Option<Self>
where
Self: Sized,
{
if let Some(l) = len {
if l != mem::size_of::<libc::sockaddr_in>() as libc::socklen_t {
return None;
}
}
// Unsound if `addr` has not been initialized by the syscall, e.g. when using TCP.
//
// Currently, this is a numbers game: if we are lucky enough that the value at the
// memory region of `sa_family` is not `AF_INET`, this check will notice the unitialized
// state of `addr` and return `None`. Otherwise, this will return junk as an initialized address.
if (*addr).sa_family as i32 != libc::AF_INET {
return None;
}
Some(Self(ptr::read_unaligned(addr as *const _)))
}
}
The following test will catch this issue (but it is already based on the new API I'm working on):
#[test]
fn test_connection_mode_recv_address() {
use std::io::{IoSlice, IoSliceMut};
use nix::sys::socket::*;
let mut header = RecvMsgHeader::<Option<SockaddrIn>>::new();
const MSG: &[u8] = "Hello, world!".as_bytes();
let mut buf: [u8; 13] = [0u8; MSG.len()];
let addr = "127.0.0.1:6069".parse::<SockaddrIn>().unwrap();
// Prefill the address buffer with valid data.
{
let send_sock = socket(
AddressFamily::INET,
SockType::Datagram,
SockFlag::empty(),
Some(SockProtocol::Udp),
)
.unwrap();
let recv_sock = socket(
AddressFamily::INET,
SockType::Datagram,
SockFlag::empty(),
Some(SockProtocol::Udp),
)
.unwrap();
bind(recv_sock.as_raw_fd(), &addr).unwrap();
sendmsg(
send_sock.as_raw_fd(),
Some(&addr),
&[IoSlice::new(MSG)],
CmsgEmpty::write(),
MsgFlags::empty(),
)
.unwrap();
let res = recvmsg(
recv_sock.as_raw_fd(),
&mut header,
&mut [IoSliceMut::new(&mut buf)],
CmsgEmpty::read(),
MsgFlags::empty(),
)
.unwrap();
assert!(res.address().is_some());
}
// Now we'll use the address same buffer with an connection-mode socket
{
let send_sock = socket(
AddressFamily::INET,
SockType::Stream,
SockFlag::empty(),
Some(SockProtocol::Tcp),
)
.unwrap();
let recv_sock = socket(
AddressFamily::INET,
SockType::Stream,
SockFlag::empty(),
Some(SockProtocol::Tcp),
)
.unwrap();
bind(recv_sock.as_raw_fd(), &addr).unwrap();
listen(&recv_sock, 1).unwrap();
let res = std::thread::scope(|s| {
s.spawn(|| {
connect(send_sock.as_raw_fd(), &addr).unwrap();
send(send_sock.as_raw_fd(), MSG, MsgFlags::empty()).unwrap();
});
let fd = accept(recv_sock.as_raw_fd()).unwrap();
recvmsg(
fd,
&mut header,
&mut [IoSliceMut::new(&mut buf)],
CmsgEmpty::read(),
MsgFlags::empty(),
)
.unwrap()
});
// This assertion will fail in the current implementation (untested)
assert!(res.address().is_none());
}
}
This issue is related to #2054
Unless this can violate some other invariant of SockaddrIn, it isn't unsound. I assume you're referring to the unsoundness that results from reading uninitialized memory? In this case, though the kernel may not have written to the buffer, the compiler cannot possibly know that. So it must treat the memory as initialized. But we should probably add a check for msg_namelen after the syscall returns, however.
I'm by no means an expert on this topic, but I think this meets the exact definition of "Undefined Behaviour" as the outcome of the family check is, well, undefined.
Take a look what socket2 is doing, they zero out the entire storage before using it for this exact reason: https://github.com/rust-lang/socket2/blob/master/src/sockaddr.rs#L129
My suggestion is to introduce a new trait:
pub unsafe trait SockaddrFromRaw {
// the libc type
type Storage;
unsafe fn from_raw(
addr: *const Self::Storage,
len: libc::socklen_t,
) -> Self
where
Self: Sized;
fn init_storage(buf: &mut MaybeUninit<Self::Storage>);
}
unsafe impl SockaddrFromRaw for Option<SockaddrIn> {
type Storage = libc::sockaddr_in;
unsafe fn from_raw(
addr: *const Self::Storage,
_: libc::socklen_t,
) -> Self {
// SAFETY: `sa_family` has been initialized by `Self::init_storage` or by the syscall.
unsafe {
if addr_of!((*addr).sin_family).read() as libc::c_int != libc::AF_INET {
return None;
}
}
// Now we can assume that `addr` has been fully initialized.
Some(SockaddrIn(ptr::read(addr)))
}
fn init_storage(buf: &mut MaybeUninit<Self::Storage>) {
// The family of `Self` is `AF_INET`, so setting the family to `AF_UNSPEC` is sufficient.
let ptr = buf.as_mut_ptr() as *mut libc::sockaddr;
unsafe { addr_of_mut!((*ptr).sa_family).write(libc::AF_UNSPEC as _) }
}
}
We shouldn't need to zero it first. As far as the compiler is concerned, that FFI call always initializes the return argument. We should just check the sin_family field, the sin_len field (on OSes that have one), and the msg_namelen field of the msghdr.
I'm still not sure if I agree with you, especially if we consider more complex types like UnixAddr, that require
- a valid path in
sun_path - an NUL-byte at the end of the path.
For example, UnixAddr::path and UnixAddr::kind rely on these invariants of the struct for safety and violating these can lead to UB.
Whether we want to check the lengths is an interesting question, too. I am honestly currently in favor of not checking it at all and blindly trusting the syscalls, because
- as we already found out, the length of the netmask returned on apple systems is not the size of the struct
- the length can even exceed the size of the struct, see here for an example
- AFAIK the len-field on BSD-like systems is not part of the POSIX standard, meaning that the programmer must have the possibility to ignore it without running into UB (not 100% sure though)
- the function is unsafe, we can trust the caller to pass sensible values
Instead, I would add a # Safety section for the len method for the SockaddrLike trait saying that the returned value must be used carefully in unsafe code.
I'm still not sure if I agree with you, especially if we consider more complex types like
UnixAddr, that require* a valid path in `sun_path` * an NUL-byte at the end of the path.For example,
UnixAddr::pathandUnixAddr::kindrely on these invariants of the struct for safety and violating these can lead to UB.
Sure, different implementation of SockaddrLike might need to check different invariants.
AFAIK the len-field on BSD-like systems is not part of the POSIX standard, meaning that the programmer must have the possibility to ignore it without running into UB (not 100% sure though)
It's not part of a standard, but it's guaranteed to be present on all of the BSDs. There's no possibility that it will simply vanish at runtime.
Disregard my last comment about the lengths, these syscalls can return a length larger than our allocated storage, so a boundary check for the length is mandatory here. Sorry for the confusion.
I was already thinking about the cases where the address storage is managed by the syscall, like getifaddrs, but this is out of scope of this issue