tokio
tokio copied to clipboard
io: make `EPOLLERR` awake `AsyncFd::readable`
Ref: https://github.com/tokio-rs/tokio/issues/4349
Motivation
Currently, if a user is awaiting on a AsyncFd
becoming readable and we
get a polling error (which may happen in some cases, see the referenced
issue), the awaiter will never be woken. Ideally, they would be notified
of this error in some way.
Solution
The solution here involves communicating to the Readiness
object that
there was an error, and when that happens, the Readiness
future will
return a synthetic error that boils down to "there was a polling error".
If the user wants to find more information about the underlying error,
they would have to find some other way (not clear how yet? as pointed
out in the issue, for sockets one could use SO_ERROR
).
Open questions
- How do we go about testing? My manual tests involved using the example piece of code in #4349 but that requires
sudo
which might not be great for automated testing. It's also kind of an involved setup. I will have to look into this. -
EPOLLERR
also triggersis_write_closed
in the mio event. I don't know if we should notify writers of an error or just of "write closed". - More things that I've written down here and there in the source code
I've implemented the functionality (epollerr awaking waiters) for the AsyncRead
slot as well (also for the AsyncWrite
one actually, but that one is tricky because ready.is_error()
implies ready.is_write_closed()
, which already waiters with a write interest). This is not part of the original issue but it makes sense to me to be consistent? We can do it in separate PRs if you prefer though.
My current plan is to find a satisfactory way to test these changes. It doesn't seem immediately obvious how to do it, because the only way I know to trigger a EPOLLERR
is through a setup that generally requires root permissions (the one in the issue). Once I write some tests, I will think through the TODOs I've left in the code, and clean-up the code I've already written (which isn't great yet).
Please let me know if you have any thoughts
I think I've found a way to test this automatically! By searching EPOLLERR
on https://github.com/torvalds/linux (who would have guessed this would actually work) I've found https://github.com/torvalds/linux/blob/46f4945e2b39dda4b832909434785483e028419d/fs/eventfd.c#L175-L176 Indeed, the man page for eventfd (2) mentions:
If an overflow of the counter value was detected, then select(2) indicates the file descriptor as being both readable and writable, and poll(2) returns a POLLERR event.
So this looks very promising! I will try this over the following days
So this looks very promising!
That was wrong. It's true that if the counter's value is 0xfffffffff then polling on the eventfd will return EPOLLERR, but it's very hard to write that value to the eventfd in the first place. According to the man page
As noted above, write(2) can never overflow the counter. However an overflow can occur if 2^64 eventfd "signal posts" were performed by the KAIO subsystem (theoretically possible, but practically unlikely)
I will search for other ways to trigger EPOLLER
I've addressed a few of the TODO
s that I had left in my changes, but the main problem with this PR remains: automated testing.
I have not found any way to reliably reproduce an EPOLLERR
that doesn't require root (or CAP_SYS_ADMIN
). I have checked in two example programs that can be run with sudo
that can reproduce EPOLLER
, just to show what kind of manual testing I'm doing. Even then, those programs do not cover all the changes that I've made.
Before investing more in these clunky manual tests, I would like to hear if you have any tips on how to test this. Some ideas that I have:
- Allow some tests to use
sudo
and have them run in a VM in CI that allows us to do this. If we had something like this, I would put more time in improving the tests I currently have (using fuse to trigger an epollerr) and make them more amenable to run in an automated way. - Mocking. I think this would require a lot of very intrusive changes in the
io
module just to be able to simulate amio
giving us ais_error()
event. - Something else?
Maybe I can add unit tests in the schedule_io
module. I know unit tests are generally discouraged in this project but they could be worth it here. I'll have a look later
If adding unit tests will allow you to test it, then that's fine.
(this is still very much WIP)
I think I might be able to unit test part of the changes. I've written one such test in https://github.com/tokio-rs/tokio/pull/4444/files#diff-585bbdb06d4f023035fc2520bc464acb733f7f212921936f13c9873798cbfc92R602 , and will keep trying to add more tests over the following days
Ok. Let me know when you need additional review.
I'm running into the same issue. And I think an ErrorKind::Other
is a bad way signal this, IMO a simple wake-up should be fine? (Perhaps with an is_error()
check on the AsyncFd
?). Alternatively, for some cases such as fuse, being able to wait explicitly for EPOLLERR
would also be useful.
In fact, other EPOLL*
flags can also be of particular interest. Eg. to poll freezer cgroup.events
you want to explicitly wait for EPOLLPRI
while EPOLLIN
is always true.
Maybe we can expose a bit more of the io driver machinery so we can make custom registrations?
I'm seeing the same issue. I've a readiness
call on an network-over-USB device. If the USB device is removed, the kernel driver will wait for all handles/sockets to the device to be closed which will never happen because tokio is waiting for a future that will never be woken (SO_ERROR is set to ENETDOWN). In turn this locks the entire kernel USB subsystem, no USB devices can be add or removed until the socket is closed and the kernel resources freed to the kernerl.
Since there is no easy workaround, has there been any progress on this PR?
doing some cross-linking here. In https://github.com/tokio-rs/tokio/pull/5566, Interest::PRIORITY
and AsyncFd::ready
have been added. On linux (and android, I guess) that makes it possible to wait for EPOLLPRI
(cc @Blub)
I've now opened an additional PR that reports error readiness, see #5781. That means that with AsyncFd::ready
you can check for read and/or error readiness, which I suspect provides a workaround for https://github.com/tokio-rs/tokio/issues/4349.
What it doesn't do yet is interact with the polling functions (e.g. poll_read_ready
). My understanding is that those poll functions are useful when implementing custom Async{Read, Write}
instances for system primitives?
I'm going to close this as solved by #5781. But let me know on the issue if I've missed something.