embedded-hal
embedded-hal copied to clipboard
[RFC] Use core language async primitives instead external crates
Summary
This proposal proposes to change the way how async api
does implemented. It makes async api
more clearly for use and extend.
Motivation
Our current approach to implement trait primitives as depends on nb
crate. It was actually on start of this project but after version 1.36
rust already have async traits
and methodology for work with it. So in 2018 edition of rust lang we have language words as async
, await
. The marcosses exported by nb
are conflicted with key words, we need to use r#
prefix for each macros use.
This RFC attempts to change implementation to core types
based implementation. It will be more effective and more conform rust language ideas.
Detailed design
This RFC inspired of PR #13, but after apply external crate I redesign for use core functionality because it will be better than external depends.
For example we have interface for serial:
use core::task::Poll;
/// A serial interface
pub trait Serial {
/// Error type associated to this serial interface
type Error;
/// Reads a single byte
fn read(&mut self) -> Poll<Result<u8, Self::Error>>;
/// Writes a single byte
fn write(&mut self, byte: u8) -> Poll<Result<(), Self::Error>>;
}
It use core::task::Poll
as point of async
run status. It was easy for implement and it isn't depends on nb
or another external crates.
Alternatives
Don't implement this RFC.
Uresolved questions:
- More clear examples of right use
async
code. - How we need to update depend crates?
- Do we need to rewrite
nb
prefer than rewriteasync
api ofembedded-hal
?
There are semantics attached to Poll::Pending
that did not exist on nb::Error::WouldBlock
, specifically:
When a function returns
Pending
, the function must also ensure that the current task is scheduled to be awoken when progress can be made.
Having functions that return core::task::Poll
but do not take a core::task::Context
and arrange to wake the task when necessary is likely to lead to deadlocks if they are used by systems built around core::future::Future
.
@Nemo157 I understand that problem we have but in current implementation of api i couldn't change behavior of run without more breakable changes on api. I'm planning to find solution in this situation. Thanks.
Sure, I just want to make sure this isn't missed. I have done a lot of work on async and have experimented with using it on embedded devices in https://github.com/Nemo157/embrio-rs/ including playing around with alternative trait definitions, just want to offer whatever insights I might have (I am lacking in experience with using the existing embedded ecosystem much, I was only really focused on trying it out with async).
If there's a plan for breaking changes to the trait APIs it'd be best to do them all at once, which may involve adding an additional core::task::Context
parameter to all the traits, and providing extension traits similar to how futures
is designed:
use core::{pin::Pin, task::{Poll, Context}, future::Future};
/// A serial interface
pub trait Serial {
/// Error type associated to this serial interface
type Error;
/// Reads a single byte
fn poll_read(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Result<u8, Self::Error>>;
/// Writes a single byte
fn poll_write(self: Pin<&mut Self>, byte: u8, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>>;
}
pub trait SerialExt: Serial {
fn read(&mut self) -> impl Future<Output = Result<u8, Self::Error>>;
fn write(&mut self, byte: u8) -> impl Future<Output = Result<u8, Self::Error>>;
}
It looks as real solution for me. I talk about current PR with @Disasm, he says: we can create clear path for migrate from old nb api
to new async api
through default implementation of async api
.
But it may be freeze migration from old to new api.
So the main problem I ran into when trying to do something similar in #149 was that as futures in rust are driven, the returned Poll<Result<u8, Self::Error>
needs to borrow a reference to the underlying peripheral. Otherwise it would be possible to, for example, interleave writes to the same peripheral - which the current API prevents you from doing.
One can work around this by moving the peripheral itself into the future and only returning it on completion, but this makes for an incredibly obtuse API that makes simple constructs such as loops hard.
It is this precise issue with ergonomics that was one of the motivators behind adding native async support to Rust in the first place - see here.
It may be that someone smarter than myself can find solutions to these problems, but thought I'd clarify them here. Depending on the timelines for generalized associated types and generator resume arguments, it might be worth waiting for them to to bring native async support to traits and [no_std]
respectively, before making breaking changes that would only need to be followed by more breaking changes.
The returned Poll
shouldn't need to borrow the peripheral, it should be higher level APIs like fn read(&mut self) -> impl Future<Output = Result<u8, Self::Error>>;
which borrow the peripheral in the returned Future
in order to avoid interleaved reads/writes. The Poll
returning methods are really low-level building blocks which don't need to provide full compile-time guarantees, the majority of users should be accessing them through the Future
returning functions.
One of the nice things about the split between low-level traits of Poll
functions and high-level utilities returning Future
used in futures
is that it avoids any need for GATs. And there are a lot of other issues with attempting to create core traits that return futures, see https://github.com/rust-lang/futures-rs/issues/1365 for a lot of relevant discussion.
(Generator resume arguments will not affect any public API related to Future
/async
, it will only mean that async fn
becomes usable on no_std
, at least until some very far off day when generators themselves might be stabilised).
@Nemo157 Do you have an example for a Future
executor which can run on embedded? Still trying to figure out our options; it would be kind of nice if we could spawn new futures anywhere and have the rest of the main loop be a poller for all futures. That would probably require a global future queue where interrupts and other functions can add their futures to?
@Nemo157 You are completely correct, I had a brain fart and conflated Poll<Result<u8, Self::Error>>
and impl Future<Output = Result<u8, Self::Error>>
. GATs are only needed if trying to provide traits with methods that return futures.
The majority of users should be accessing them through the Future returning functions
So is the idea that driver crates would then provide methods returning futures? How would people compose and run these without the afformentioned async
support for no_std
? You could provide the block
, await
, etc... macros that the nb
crate provides for its futures, but moving away from these was one of the motivators for the RFC?
Edit: I believe you would still need the macros nb
provides to aid the writing of the driver crates, and then would additionally need to provide an executor implementation for people to run the futures and possibly further facilities to aid in composing them.
There is https://docs.rs/embedded-executor/0.3.1/embedded_executor/, I haven't used it myself, but it was developed as part of https://gitlab.com/polymer-kb/polymer.
How would people compose and run these without the afformentioned
async
support forno_std
?
I think moving to core::future::Future
only makes sense when intended to be used with async
. The current implementation issue blocking it being used in no_std
has workarounds (core-futures-tls
using ELF thread-locals, embrio-async
for a re-implemented transform, worst case an unsound copy of core-futures-tls
that abandons thread-safety). And if there is consensus and experiments in moving some of the embedded ecosystem to async
that might provide the incentive to get someone to work on fixing the compiler to support it.
@Nemo157 So I'm a little bit confused as to what you are proposing exactly, as far as I can see it these are the following options
- This RFC gets accepted and things remain similar to how they are currently but we lose the utility
block
,await
, etc... macros that are currently provided bynb
- This RFC gets extended to provide macros similar to those provided within the
nb
crate, but named in such a way as to not collide - This RFC is blocked until
async
support comes tono_std
allowing writing and interacting with methods returningcore::future::Future
. A utility macro could be provided to facilitate wrapping theembedded_hal
methods returning poll types intocore::future::Future
types that can then be used with async - The macros within
nb
are renamed to not collide with keywords - The RFC is rejected
I think moving to core::future::Future only makes sense when intended to be used with async.
And I would argue moving to core::task::Poll
only really makes sense if you're going to move to using core::future::Future
.
I got a working proof of concept of async/await on a stm32f429 mcu using https://github.com/rust-embedded/embedded-hal/issues/172#issuecomment-561082661 approach. I used:
- embrio-rs's Executor (slightly modified not to depend on rest of embrio)
- made a simple fork of
core-futures-tls
to use astatic mut
to store theContext
- used https://github.com/rust-embedded/embedded-hal/pull/171 and ported
stm32f4xx-hal
to it.
Being not too knowledgeable about RFC process, should I share?
Being not too knowledgeable about RFC process, should I share?
Yes, please!
https://github.com/baloo/pm-firmware/commit/8ccdfd23165ddf3e0d242e773fe50a3fb272b214
Some remarks: I only implemented support for UART for now, because that's what I used to debug. I used git submodules to ensure I you get the same versions of the libs I was running. I reused the example from embrio-rs found here: https://github.com/Nemo157/embrio-rs/blob/master/examples/apps/hello/src/lib.rs
I fully support here what @Nemo157 already mentioned. If APIs return Poll
they should also provide abilities to store Waker
objects and make it possible to use peripherals without busy looping. I actually think the whole Waker
concept should also work decently well on embedded. Wakers can be implemented in flexible fashions, and certainly also in ones where interrupt handlers perform the waking. They should allow for non-busy waiting on bare-metal platforms as well as RTOS platforms if the HALs are properly implemented.
Regarding borrowing a peripheral: As @Nemo157 already mentioned, the Poll
does not need to borrow it. But if the using the peripheral is modeled as an async fn
which requires &mut self
only one task can access it at the same time. Also if peripherals are implemented in a fashion where they can only store a single Waker
, then using the peripheral from multiple places concurrently would either starve a task (by overwriting it's Waker
) or lead to continuous swapping of Waker
s back and forth. I doubt that's a huge problem in practice, since multiple tasks having ownership of peripherals is an anti-pattern anyway, and you might rather want to have a coordinator task in that case.
However for things like GPIO or UART you can also solve the multiple Waker
problem in different fashion: Instead of using
pub trait Serial {
/// Reads a single byte
fn poll_read(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Result<u8, Self::Error>>;
}
where poll_read
needs to store a Waker
, we can delegate notifications to an async ManualResetEvent
(e.g. the one in futures-intrusive), which can wake up an arbitrary amount of tasks using a fixed amount of memory. One of those objects would need to get embedded into the necessary peripheral. And the event is set from inside ISRs when the peripheral is ready. The API could then look like:
pub trait Serial {
// Obtain a reference to the associated readiness event
fn read_readiness_event(&self) -> &ManualResetEvent;
/// Try to read bytes
fn try_read(&self, mut buffer: &[u8]) -> Poll<usize>;
}
In that case the peripheral could be used along:
while !done {
serial.read_readiness_event().wait().await;
let received_bytes = serial.try_read(&buffer);
}
Multiple tasks can run that coode in parallel. Whether that's desirable or not is certainly a different question. But I think that one is best answered for each peripheral independently.
Hey all, I just wanted to mention that I have been experimenting with this model here: https://github.com/dflemstr/embedded-platform/
It contains async traits for most of the things in embedded-hal
, using a very tokio-esque future model. To get a taste, here is the i2c
module: https://github.com/dflemstr/embedded-platform/blob/master/src/i2c.rs
I think we could just copy-paste from that library, I took care to make the traits very re-usable.
I also created https://github.com/dflemstr/direct-executor which is an executor that can run without alloc
, however it doesn't support spawn
and it uses interrupts/wfi
heavily. Here's how it's used in the embedded-platform
implementation of nrf52840: https://github.com/dflemstr/embedded-platform/blob/master/platforms/nrf52840/src/lib.rs#L42-L58
hey folks, this (specifically _whether to switch from nb
to core
, how to implement async only insofar as it impacts this) is the major issue blocking us on a v1.0.0 release (#177), so i am wondering a) whether we can make a decision about this prior to actually building out the async e-h traits or are we blocked on this and b) what is the path to demonstrating / coming to consensus on this? and c) would it be useful for us to facilitate another video call to talk about this?
Related work:
I've triggered an analogous discussion at https://github.com/rust-embedded-community/embedded-nal/issues/47 and would like to update this discussion based on it (some was implied in posts but it may help to summarize):
- The issues around thread locals and other std dependencies in futures themselves are gone by now.
- For applications that want to preserve the
nb
style, a simple executor can do. In writing the demo for the abovementioned issue, I ended up with an executor very similar to @dflemstr's https://github.com/dflemstr/direct-executor/ (with differences on the level of "how to name variables" and "leave the loop using return or using break"). - Many applications now using WouldBlock profit from not having to store state because they can idempotently call the same function again. This is neat but IMO also dangerous because it's easily overlooked that a later
?
return WouldBlock can discard an earlier result's state. To my knowledge there's no satisfyingly simple solution, but maybe that's not a top priority issue either. (There is a workaround, but it is manually implementing a different state machine than we usually have the compiler generate). - The issue of when and where to store the wakers is one we should think through with a focus on how embedded-hal implementations can do it. So far, implementing a UART character receive can look at a register flag and nope out. If it is Future::poll()ed, it would suddenly need to not only store the waker, but also start messing with interrupts -- something embedded-hal implementations so far can get away without. Would it, in the interest of keeping simple things simple, be permissible for something that knows itself to take only finite time (eg. a UART transmit, or an ADC read) be OK for an implementation to not store the waker at all, but to call it right away in the poll hook and thus force the user into a spin loop? (It would then be a topic of implementation quality for the crate to decide whether to set up an interrupt, just call the waker, or even to know that the delay is so short and sure to complete that the poll function rather busy-loops for 10 CPU cycles than call the waker and take a lot more cycles for the executor to come around).
- The waker is (at least with the current options of building one) constructed using dynamic dispatch. This may put us into a similar situation as we have with formatting where we know that there's only one and a lot of const-propagation could make things easier, but it's forced through a vtable and the compiler can't help. I don't expect we do anything about this now, but we should have it on screen as a possible source of regressions compared to the more minimalistic main loops nb allows us.
(I'm not touching on the topics of how to do things through traits or GAT here; that's described well in the last post.)
Before we commit on a road that works around GATs (and can not be trivially upgraded to use GATs; not sure whether that's the case with the current proposals), we may want to talk to wg-traits as there's been recent chat that they are making good progress [edit: around https://github.com/rust-lang/rust/issues/44265#issuecomment-774956180].
I've been messing around with this and GATs are pretty much fully usable for returning futures from trait methods at this point. All the ICEs seem to be worked out.
Now that we have a clear structure for the execution models (blocking
/nb
), in principle I would be open to add a nightly-only futures
module behind a futures
(or similar) feature.
I think this would not reintroduce the problem we had with unproven
since futures
would require nightly so I think users would not enable it lightheartedly.
As per this these would need to be in a "proven stable" state before we add them here.
However, since it builds only on nightly, theoretically it would already be subject to breakage at any time so we might either be a little more flexible around this to encourage adoption or keep the "can't touch this" status quo, especially considering that GAT will probably not hit stable in a long time.
Note that "being a bit more flexible" would probably imply some kind of SemVer guarantee waiving for the futures
module until it can be built on Rust stable.
To my knowledge what comes closest to this state is embassy-traits. However, from what I see, embassy is still under heavy development. cc: @Dirbaio, @thalesfragoso
https://github.com/rust-embedded/embedded-hal/pull/285/
Would it, in the interest of keeping simple things simple, be permissible for something that knows itself to take only finite time (eg. a UART transmit, or an ADC read) be OK for an implementation to not store the waker at all, but to call it right away in the poll hook and thus force the user into a spin loop?
Yes, this is allowed by the Futures trait, although I'd argue that'd be a quite low-quality implementation. Unusable for low-power applications. It'd work okay, it's even "fair" as in all the other tasks get a turn to run before the self-waking task runs again in a multi-task executor.
Embassy does hook up the interrupts to wakers, although this means it needs a whole new infrastructure for drivers to "own" interrupts that hasn't been needed in current HALs so far.
The waker is (at least with the current options of building one) constructed using dynamic dispatch. This may put us into a similar situation as we have with formatting where we know that there's only one and a lot of const-propagation could make things easier, but it's forced through a vtable and the compiler can't help.
Embassy mitigates this by having a custom WakerRegistration type to store wakers. It has 2 impls, an embassy-only one and an executor-agnostic one (enabled by a Cargo feature). The executor-agnostic one stores the waker, the embassy-only one assumes all wakers are embassy wakers and only stores the task pointer (1 word instead of 2) and does direct calls (no vtable).
For users that want to use busyloop pseudo-executors there could even be a third "nop" implementation that stores nothing and does nothing on wake.
This can be closed now that embedded-hal-async
exists.