quinn
quinn copied to clipboard
feat(quinn): Pass in `&mut self` for `AsyncUdpSocket`'s `poll_recv` and re-introduce `poll_send`
This PR contains some changes to the AsyncUdpSocket trait. In short:
poll_recvnow takes&mut selfinstead of&selftry_sendis removed fromAsyncUdpSocketcreate_io_pollerand theUdpPollertrait are now replaced withcreate_senderand theUdpSendertrait- the
UdpSendertrait has atry_sendas well as apoll_sendfunction (that takesself: Pin<&mut Self>) instead of apoll_writablefunction
The final traits look like this:
pub trait AsyncUdpSocket: Send + Sync + Debug + 'static {
fn create_sender(&self) -> Pin<Box<dyn UdpSender>>;
fn poll_recv(
&mut self,
cx: &mut Context,
bufs: &mut [IoSliceMut<'_>],
meta: &mut [RecvMeta],
) -> Poll<io::Result<usize>>;
fn local_addr(&self) -> io::Result<SocketAddr>;
fn max_receive_segments(&self) -> usize;
fn may_fragment(&self) -> bool;
}
pub trait UdpSender: Send + Sync + Debug + 'static {
fn poll_send(
self: Pin<&mut Self>,
transmit: &Transmit,
cx: &mut Context,
) -> Poll<io::Result<()>>;
fn max_transmit_segments(&self) -> usize;
fn try_send(self: Pin<&mut Self>, transmit: &Transmit) -> io::Result<()>;
}
Motivation
The general idea is to encode the invariants that the runtime and socket implementations need to adhere to into rust's type system.
One of the harder things for me was to understand initially when I was working with quinn's AsyncUdpSocket was when you're allowed to overwrite the waker you're getting from poll_recv and when you can't, and similarly for the old UdpPoller.
Also, with iroh we had annoying issues with passing state from poll_writable to try_send. With a single poll_send that gives you (essentially) &mut self access, this gets much easier.
With all of these changes it's also a lot clearer when it's fine to overwrite wakers in your custom implementation. If &mut self is passed in from your poll function, then you know you can only be called from a single task that owns you at a time, thus you can safely overwrite old wakers (and even do so without any use of mutexes or fancy linked lists or anything like that).
Performance
I've tried testing the performance of this on various systems (my laptop, dedicated bare metal servers, etc.). I wasn't able to determine any differences, but at the same time, I always got very inconsistent results (both on quinn main and this PR).
I'd really appreciate any help from someone who knows how to benchmark quinn effectively. There's some more things I can try, such as fiddling with the constants mentioned in https://github.com/quinn-rs/quinn/issues/1126, but other than that I'm out of ideas.
Looking at the code, this should not have a negative performance impact. I think there's technically 1 more Arc clone for initiating connections, but other than that it should result in pretty much the same amount of work/same sequence of syscalls.