tokio-core Do not hang in poll if event loop is destroyed

trafficstars

~~Partial~~ fix for #12.

Edit: updated a patch.

Loop destructor marks all loop handles dead, so any further poll requests fail (they currently panic, but should probably return an error).

Also Loop destructor unparks all tasks waiting for IO on self. It has no effect on tasks bound to self event loop, but tasks running on different executors call poll right afterwards, so they immediately fail instead of hanging.

Sep 06 '16 15:09 stepancheg

Rebased against master.

Sep 08 '16 17:09 stepancheg

Thanks for the PR @stepancheg! This seems reasonable to try to get all connections and such to return an error ASAP as soon as the associated Core is gone. I think though we don't want to do this through poll returning an error, but rather through the I/O methods themselves. That is poll would simply return Ready(()) and then the next I/O operation would fail.

Sep 09 '16 20:09 alexcrichton

That is poll would simply return Ready(()) and then the next I/O operation would fail.

Are you sure this is going to work? If poll would return ready, then the next I/O would return WouldBlock (because even if core is destroyed, file descriptors are not), and I/O implementation will try to re-register itself in the poller.

Sep 09 '16 21:09 stepancheg

Ah yeah I figure it'd look like:

When doing I/O you first always call poll, and that'd return Ready
The actual I/O would return WouldBlock
Seeing WouldBlock, you'd register interest to block the current task
This operation would fail, and that'd propagate outwards.

Sep 09 '16 22:09 alexcrichton

I think though we don't want to do this through poll returning an error ... Seeing WouldBlock, you'd register interest to block the current task This operation would fail, and that'd propagate outwards.

Which operation exactly should fall? Interest is registered inside poll, e. g. PollEvented::poll_read. So if schedule_read should fail, then poll_read should fail too.

Sep 09 '16 22:09 stepancheg

Yes what I'm thinking is that the error in the poll_read implementation (where it calls schedule_read should be ignored. That should just say the object is ready and the next I/O operation will fail with the same erorr.

Sep 12 '16 05:09 alexcrichton

I don't understand. That next "I/O" operation won't fall.

Have a look at the implementation of Stream for Receiver:

impl<T> Stream for Receiver<T> {
    fn poll(&mut self) -> Poll<Option<T>, io::Error> {
        // falling down if it returns Ready after drop of the Loop
        if let Async::NotReady = self.rx.poll_read() {
            return Ok(Async::NotReady)
        }
        // try_recv would return TryRecvError::Empty
        // because it is non-blocking queue
        match self.rx.get_ref().try_recv() {
            Ok(t) => Ok(Async::Ready(Some(t))),
            // so we fall here
            Err(TryRecvError::Empty) => {
                // need_read also won't return error
                self.rx.need_read();
                // so we return Ok(NotReady), and not Err
                // so the caller will hang
                Ok(Async::NotReady)
            }
            Err(TryRecvError::Disconnected) => Ok(Async::Ready(None)),
        }
    }
}

If we are talking about TcpStream, it is the same:

impl<E: Read> Read for PollEvented<E> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        // Ready here
        if let Async::NotReady = self.poll_read() {
            return Err(mio::would_block())
        }
        // `read` returns would block
        let r = self.get_mut().read(buf);
        if is_wouldblock(&r) {
            // `need_read` does not return error
            self.need_read();
        }
        // returning WouldBlock
        return r
    }
}

and in callers of read we convert WouldBlock into NotReady by try_nb! macro. Still no I/O error.

Sep 14 '16 09:09 stepancheg

Ah so to clarify, poll_read wouldn't return the error but need_read would. Does that make sense?

Sep 14 '16 17:09 alexcrichton

Does that make sense?

Yes, it does. I don't full understand motivation behind this design decision (and it looks fragile to me), but anyway it seems to work.

I've updated the patch to return Ready when loop is dropped.

Sep 14 '16 20:09 stepancheg

Added several comments about why functions return errors, and replaced Option<Receiver> with Receiver and library-private close_receiver function.

Sep 27 '16 00:09 stepancheg

Ok this seems reasonable to me. I'm gonna hold off on merging it just yet until we get closer to the 0.2 release (due to breaking changes), however. Does that sound ok?

Sep 27 '16 19:09 alexcrichton

Rebased against master.

Jan 01 '17 15:01 stepancheg

I'd be keen to see this in a release soon, if possible. It sounds like v0.2 is waiting for futures crate to stabilize, so can this go into the next v0.1 release? Any idea when that might be please?

Aug 02 '17 12:08 jmlMetaswitch

Certainly yeah we could and this sooner! Right now it's a breaking change so we can't land it but it should be possible to add new versions of these functions and deprecate the old!

Would y'all be willing to work on a patch that does that?

Aug 02 '17 13:08 alexcrichton

@stepancheg - This is your fix. (Thank you!) Are you happy to fix it up for release?

Aug 02 '17 13:08 jmlMetaswitch

Cannot do it right now, very busy with my new job.

Aug 02 '17 13:08 stepancheg

tokio-core tokio-core copied to clipboard

Do not hang in poll if event loop is destroyed

tokio-core
tokio-core copied to clipboard