nix `poll` / `PollFd` can no longer take `-1` as descriptors

I'm a heavy user of poll and a standard idiom is to create n entries, but not to use all of them at all times --- poll allows fd to have a value of -1 to say "ignore this entry". This means that one can simplify a lot of code surrounding poll by assuming a constant number of entries at all times, even if some of them aren't doing anything.

The change to use PollFd in 0.27 means that, as far as I can tell, this is no longer possible in nix? Here's an example of this being used in real code so it's genuinely a useful feature.

Because PollFd takes a BorrowedFd I don't think there's an obvious way to change PollFd itself but perhaps poll could usefully take &mut [Option<PollFd>]? That would feel like a fairly idiomatic Rust way of saying "there are n entries, but some of them are unused": it would allow PollFd to maintain its current semantics, but also make poll code that wants to keep a constant number of entries possible.

I appreciate this is a breaking change, but if it's something that might be welcomed, I'd be happy to raise a PR for it!

Dec 08 '23 08:12 ltratt

Is this feature available on all the platforms? I see that FreeBSD and OpenBSD have this, Linux and POSIX manual do not mention it at all.

Because PollFd takes a BorrowedFd I don't think there's an obvious way to change PollFd itself

We can make the PollFd::new() take a Option<BorrowedFd>, but this is not I/O-safe as it implements AsFd trait and should be a valid file descriptor, so I prefer the second approach:

poll could usefully take &mut [Option<PollFd>]?

I appreciate this is a breaking change,

No worries about braking changes, we will bump our minor version number in the next release

Dec 09 '23 04:12 SteveLauC

Linux does support this, but the man page you linked uses the term "negative". I didn't remember this detail, but Linux is more forgiving than the BSDs in this regard:

The field fd contains a file descriptor for an open file. If this field is negative, then the corresponding events field is ignored and the revents field returns zero. (This provides an easy way of ignoring a file descriptor for a single poll() call: simply set the fd field to its bitwise complement.)

POSIX doesn't mention this either way. So I don't know if all platforms support it but at least most of the major extant platforms definitely support it (I couldn't immediately tell if OS X does or doesn't with a quick search, and I don't have an OS X box to test on).

Dec 09 '23 08:12 ltratt

Linux does support this, but the man page you linked uses the term "negative".

Oh, I get it, same as the timeout argument

POSIX doesn't mention this either way. So I don't know if all platforms support it but at least most of the major extant platforms definitely support it (I couldn't immediately tell if OS X does or doesn't with a quick search, and I don't have an OS X box to test on).

If we are not sure about what platforms support this feature, then it is hard to change the interface.

NetBSD supports this:

If the value passed in fd is negative, the entry is ignored and revents is set to 0. (POLLNVAL is not set.)

DragonFlyBSD supports this:

fd File descriptor to poll. If fd is equal to -1 then revents is cleared (set to zero), and that pollfd is not checked.

macOS manual does not mention this

Dec 09 '23 08:12 SteveLauC

Via Ted Unangst, Darwin also skips negative fds in poll and, interestingly, has this comment in the code:

/* per spec, ignore fd values below zero */

Dec 09 '23 10:12 ltratt

poll could usefully take &mut [Option<PollFd>]?

Sounds good to me too

Dec 09 '23 16:12 asomers

If we change poll() and ppoll() to take &mut [Option<PollFd>], then we have to clone the slice and replace the None with a PollFd that has the fd set to -1:

If Rust can ensure that Some(PollFd) will have the same memory representation as PollFd, then the type of clone can be changed to Vec<Option<PollFd>>

I think null pointer optimization can happen here so a tag is not needed but I am not sure about the memory layout...

pub fn poll<T: Into<PollTimeout>>(fds: &mut [Option<PollFd>], timeout: T) -> Result<libc::c_int> {
    let mut clone: Vec<PollFd> = Vec::new();
    for opt_fd in fds {
        if let Some(fd) = opt_fd {
            clone.push(fd.clone());
        } else {
            clone.push(PollFd::new(
                // SAFETY:
                // No need to worry about the lifetime of this fd as it is not
                // a valid one, and `poll()` or `ppoll()` will skip it so we are
                // safe.
                unsafe { BorrowedFd::borrow_raw(-1) },
                PollFlags::empty(),
            ));
        }
    }

    let res = unsafe {
        libc::poll(
            clone.as_mut_ptr().cast(),
            clone.len() as libc::nfds_t,
            i32::from(timeout.into()),
        )
    };

    Errno::result(res)
}

I am thinking about if we can define a wrapper type here:

#[repr(transparent)]
pub struct MaybePollFd<'fd> {
    pollfd: PollFd<'fd>,
    _fd: std::marker::PhantomData<BorrowedFd<'fd>>,
}

impl<'fd> MaybePollFd<'fd> {
    const INVALID_FD: std::os::fd::RawFd = -1;
    pub fn is_valid(&self) -> bool {
        self.pollfd.pollfd.fd != Self::INVALID_FD
    }

    pub fn set_invalid(&mut self) {
        self.pollfd.pollfd.fd = Self::INVALID_FD;
    }
    
    pub fn new_invalid() -> Self {
        Self {
            pollfd: PollFd::new(unsafe { BorrowedFd::borrow_raw(Self::INVALID_FD)}, PollFlags::empty()),
            _fd: std::marker::PhantomData,
        } 
    }
}

impl<'fd> From<PollFd<'fd>> for MaybePollFd<'fd> {
    fn from(value: PollFd<'fd>) -> Self {
        Self {
            pollfd: value,
            _fd: std::marker::PhantomData,
        }
    }
}

impl<'fd> TryFrom<MaybePollFd<'fd>> for PollFd<'fd> {
    type Error = Errno;
    fn try_from(value: MaybePollFd<'fd>) -> std::result::Result<Self, Self::Error> {
        if value.pollfd.pollfd.fd == -1 {
            // EBADF or EINVAL?
            Err(Errno::EBADF)
        } else {
            Ok(value.pollfd)
        }
    }
}

then poll() and epoll() will take &mut [MaybePollFd], and in @ltratt's use case, you can do:

fn update_pollfds(&mut self) {
    for (i, jobslot) in self.running.iter().enumerate() {
        let (stderr_fd, stdout_fd) = if let Some(job) = jobslot {
            // Since we've asked for stderr/stdout to be captured, the unwrap()s should be
            // safe, though the Rust docs are slightly vague on this.
            let stderr_fd = if job.stderr_hup {
                MaybePollFd::new_invalid()
            } else {
                MaybePollFd::from(PollFd::new(job.child.stderr.as_ref().unwrap().as_fd(), PollFlags::POLLIN))
            };
            let stdout_fd = if job.stdout_hup {
                MaybePollFd::new_invalid()
            } else {
                MaybePollFd::from(PollFd::new(job.child.stdout.as_ref().unwrap().as_fd(), PollFlags::POLLIN))
            };
            (stderr_fd, stdout_fd)
        } else {
            (MaybePollFd::new_invalid(), MaybePollFd::new_invalid())
        };
        self.pollfds[i * 2] = stderr_fd;
        self.pollfds[i * 2 + 1] = stdout_fd;
    }
    self.pollfds[self.maxjobs * 2] = PollFd::new(self.snare.event_read_fd, PollFlags::POLLIN);
}

Dec 18 '23 12:12 SteveLauC

@SteveLauC If one can avoid the extra Vec that would be cool. FWIW I have (but currently untested while I change the program that would call this!) locally:

diff --git src/poll.rs src/poll.rs
index 0ad9f40d..6b6d4c34 100644
--- src/poll.rs
+++ src/poll.rs
@@ -197,20 +197,43 @@ libc_bitflags! {
 /// [`PollTimeout::ZERO`] causes `poll()` to return immediately, even if no file
 /// descriptors are ready.
 pub fn poll<T: Into<PollTimeout>>(
-    fds: &mut [PollFd],
+    fds: &mut [Option<PollFd>],
     timeout: T,
 ) -> Result<libc::c_int> {
+    let mut fds_map = map_fds(fds);
     let res = unsafe {
         libc::poll(
-            fds.as_mut_ptr().cast(),
-            fds.len() as libc::nfds_t,
+            fds_map.as_mut_ptr().cast(),
+            fds_map.len() as libc::nfds_t,
             i32::from(timeout.into()),
         )
     };
+    for (i, fd) in fds.iter_mut().enumerate() {
+        if let Some(ref mut fd) = fd {
+            *fd = PollFd {
+                pollfd: fds_map[i],
+                _fd: std::marker::PhantomData,
+            };
+        }
+    }
 
     Errno::result(res)
 }
 
+/// Map a slice of `Option<PollFd>` to `Vec<libc::pollfd>` where `None` is mapped to a `fd` of -1.
+fn map_fds(fds: &[Option<PollFd>]) -> Vec<libc::pollfd> {
+    fds.iter()
+        .map(|x| match x {
+            Some(PollFd { pollfd, _fd: _ }) => *pollfd,
+            None => libc::pollfd {
+                fd: -1,
+                events: 0,
+                revents: 0,
+            },
+        })
+        .collect::<Vec<_>>()
+}
+
 feature! {
 #![feature = "signal"]
 /// `ppoll()` allows an application to safely wait until either a file
@@ -226,19 +249,28 @@ feature! {
 ///
 #[cfg(any(linux_android, freebsdlike))]
 pub fn ppoll(
-    fds: &mut [PollFd],
+    fds: &mut [Option<PollFd>],
     timeout: Option<crate::sys::time::TimeSpec>,
     sigmask: Option<crate::sys::signal::SigSet>
     ) -> Result<libc::c_int>
 {
     let timeout = timeout.as_ref().map_or(core::ptr::null(), |r| r.as_ref());
     let sigmask = sigmask.as_ref().map_or(core::ptr::null(), |r| r.as_ref());
+    let mut fds_map = map_fds(fds);
     let res = unsafe {
-        libc::ppoll(fds.as_mut_ptr().cast(),
-                    fds.len() as libc::nfds_t,
+        libc::ppoll(fds_map.as_mut_ptr().cast(),
+                    fds_map.len() as libc::nfds_t,
                     timeout,
                     sigmask)
     };
+    for (i, fd) in fds.iter_mut().enumerate() {
+        if let Some(ref mut fd) = fd {
+            *fd = PollFd {
+                pollfd: fds_map[i],
+                _fd: std::marker::PhantomData,
+            };
+        }
+    }
     Errno::result(res)
 }
 }

which isn't very different than your first version. Your second version looks much different, and I think better, though!

Dec 18 '23 12:12 ltratt

It seems that all the OSes supported by Nix have this feature except for Redox:

OS	Supported
darwin	Yes
linux	Yes
android	Probably yes, since it uses Linux
freebsd	Yes
dragonfly	Yes
netbsd	Yes
openbsd	Yes
illumos	Yes
solaris	Yes
haiku	Yes
redox	No

From the source code, poll(2) on Redox is emulated with epoll, which will return an error when encountering a fd of value -1.

pub fn poll_epoll(fds: &mut [pollfd], timeout: c_int) -> c_int {
    let event_map = [
        (POLLIN, EPOLLIN),
        (POLLPRI, EPOLLPRI),
        (POLLOUT, EPOLLOUT),
        (POLLERR, EPOLLERR),
        (POLLHUP, EPOLLHUP),
        (POLLNVAL, EPOLLNVAL),
        (POLLRDNORM, EPOLLRDNORM),
        (POLLWRNORM, EPOLLWRNORM),
        (POLLRDBAND, EPOLLRDBAND),
        (POLLWRBAND, EPOLLWRBAND),
    ];

    let ep = {
        let epfd = epoll_create1(EPOLL_CLOEXEC);
        if epfd < 0 {
            return -1;
        }
        File::new(epfd)
    };

    for i in 0..fds.len() {
        let mut pfd = &mut fds[i];

        let mut event = epoll_event {
            events: 0,
            data: epoll_data { u64: i as u64 },
            ..Default::default()
        };

        for (p, ep) in event_map.iter() {
            if pfd.events & p > 0 {
                event.events |= ep;
            }
        }

        pfd.revents = 0;

        if epoll_ctl(*ep, EPOLL_CTL_ADD, pfd.fd, &mut event) < 0 {
            return -1;
        }
    }

    let mut events: [epoll_event; 32] = unsafe { mem::zeroed() };
    let res = epoll_wait(*ep, events.as_mut_ptr(), events.len() as c_int, timeout);
    if res < 0 {
        return -1;
    }

    for event in events.iter().take(res as usize) {
        let pi = unsafe { event.data.u64 as usize };
        // TODO: Error status when fd does not match?
        if let Some(pfd) = fds.get_mut(pi) {
            for (p, ep) in event_map.iter() {
                if event.events & ep > 0 {
                    pfd.revents |= p;
                }
            }
        }
    }

    let mut count = 0;
    for pfd in fds.iter() {
        if pfd.revents > 0 {
            count += 1;
        }
    }
    count
}

I think I will file a feature request issue to Redox.

Update: Issue filed

Dec 29 '23 12:12 SteveLauC

Thank you @SteveLauC -- much appreciated!

Jan 01 '24 08:01 ltratt

Hello @SteveLauC, I wondered if there'd been any luck with this issue? I was hoping to upgrade my nix dependency, and reran into this one!

May 28 '24 14:05 ltratt

Hello @SteveLauC, I wondered if there'd been any luck with this issue? I was hoping to upgrade my nix dependency, and reran into this one!

Hi, Redox still does not support this feature, but the issue has been labeled "good first issue", which means that the maintainer of Redox would accept PR for it. I will work on it, and then change the interface as discussed in Nix.

May 28 '24 23:05 SteveLauC

Thanks @SteveLauC -- you're a star!

May 29 '24 19:05 ltratt

Sorry for the delay, PR has just been sent: feat: ignore negative fd and clear its revents in poll(2)

We can work on the Nix side now:)

Jul 20 '24 13:07 SteveLauC

Thanks @SteveLauC :)

Jul 21 '24 07:07 ltratt

nix nix copied to clipboard

`poll` / `PollFd` can no longer take `-1` as descriptors

nix
nix copied to clipboard