embedded-hal icon indicating copy to clipboard operation
embedded-hal copied to clipboard

io-async: Use ReadReady/WriteReady type state to avoid generating await points

Open chrysn opened this issue 5 months ago • 4 comments

A large concern moving from nb to async is that async reads lock the destination buffer across an await point: doing a let mut buf = [0; 128]; let len = await data.read(&mut buf)?; process(&buf[..len]).more().await(); turns the buf into a part of the future state machine, whereas a manually constructed state machine branch match self { AwaitRead => { let mut buf = [0; 128]; let len = data.read(&mut buf)?; DoMore(process(&buf[..len])) }} can do with a short-lived stack allocation.

The ReadReady/WriteReady traits work well in their way for the (blocking or immediate) embedded-io crate, but the state machine building process of the -async variant could use more guarantees. After ReadReady came back successfully, there will be an opportunity to read data immediately (or an error), but even if all sorts of link-time optimization were factored in into constructing the states (AIU they are not), the compiler can not see eg. through an OS's state that promises (under penalty of internal panics) that something is available immediately.

As mitigation, I suggest altering the ReadReady trait (and WriteReady analogously) such that calls can be made like this:

let ready_reader = reader.read_ready().await?;
let mut buf = [0; 128];
let len = ready_reader.read(&mut buf); // No await here!
let processed = process(&buf[..len]);
// drop(buf) happens here implicitly
more(processed).await?;

It could be implemented roughly like this:

enum ReadReadyResult<'r, R: Read> {
    Ready(&'r mut R)
    Error(R::Error)
}

impl<'r, R: Read> ReadReadyResult<'r, R> {
    // really just a convenience thing; all actual work happens in the Read
    #[inline]
    fn read(self, buf: &mut [u8]) -> Result<usize, R::Error> {
        match self {
            Ready(r) => R::read_after_ready(self, buf),
            Err(e) => Err(e),
        }
    }
}

trait Read {
    async fn read_ready(&mut self) -> ReadReadyResult<'_, Self>;

    fn read_after_ready(rrr: ReadReadyResult<'_, Self>, buf: &mut [u8]) -> Result<usize, Self::Error>;
    // maybe this could even be provided: that might require turning the reference into a pinned reference so
    // it can be polled once with an assertion that it is Ready. The async read_ready would probably manage
    // to get such a pin.
}

Note that I think it might be a good idea to just make ReadReady async-return only returns the boolean/type readiness (ie. we'd have a struct ReadIsReady(R) instead of the enum ReadReadyResult) and then waits for the actual read to flush out any error – but then, I don't know why it was implemented that way in the first place.

If the team prefers to discuss this over a PR-style implementation, I can write it out into it (but my guess is that this benefits more from a preliminary discussion over jumping right in).

chrysn avatar Aug 14 '25 09:08 chrysn

For streams that don't have an internal buffer (e.g. UART that is DMAing straight into the user's buffer) it wouldn't work. You could try making read_ready() wait until a first byte arrives, but then read_after_ready() would have to start DMA into the user's buffer which would have to be async.

For streams that do have an internal buffer, this would work, but you can already achieve this today with ReadBuf. Instead of calling read_ready() call fill_buf(). Then process the data straight from that buffer. It's more efficient even, you don't have to allocate any buffer neither on the future nor on the stack.

So I don't think this gains us anything.

Dirbaio avatar Aug 14 '25 10:08 Dirbaio

The wording of ready_ready (read under the assumption that "would not block" here means "will be Ready immediately"), so the way I'd have understood its documentation is that on unbuffered devices, this would indeed just produce bytewise data. If the intention is that a read after a read_ready would still await for some time (eg. while bytes are coming in), then indeed this issue is moot (but I'll look into how the documentation can be updated to reflect the intended behavior more accurately).

chrysn avatar Aug 14 '25 10:08 chrysn

Ah yes, that's true. "UART DMAing straight into the user's buffer" wouldn't be a compliant Read implementation then. so maybe it's okay if it doesn't work in that case

Still I feel this is a bit too complex, and you can get most of the benefits of it from ReadBuf already.

Also as I've commented before I'd prefer to "innovate" vs std::io as little as possible. The ReadReady, WriteReady traits were added as a concession for people who preferred a more nb-like API for the blocking traits (this is a legit need vs std, we don't have threads in embedded, so the innovation is sort of justified). It was then carried over to async for consistency, but I think it makes much less sense there, maybe we should even remove it.

Dirbaio avatar Aug 14 '25 12:08 Dirbaio

Not having ReadReady/WriteReady in async would make sense to me (especially in the context of being an nb workaround), and somewhat close this issue. This would nudge the ecosystem more towards the (kind'a preferable) fill_buf, possibly with very small buffers.

chrysn avatar Aug 14 '25 12:08 chrysn