Support for Dynamically Allocated Executor
The current statically allocated task system works well for many projects, but I’ve encountered challenges when trying to implement more complex functionality:
1. Static Task Allocation Tightly Couples Code with Embassy
Currently, all tasks must be defined within the root application using Embassy, making it impossible for libraries to create tasks. In my project, the core logic is housed in a separate crate that is hardware-agnostic and independent of any executor, allowing me to test the logic on a PC without hardware. However, due to the limitation of not being able to create tasks within the crate, I’ve had to rely on join! macros to handle over 30 futures, which has led to significant performance issues (reference).
2. Static Task Allocation Cannot Handle Generics (#2454)
From what I understand, static tasks inherently cannot support generics because the task size must be known at compile time. In my case, the crate depends on generic drivers passed in at runtime, so even if Embassy allowed task definitions within a crate, generics would still present a limitation.
3. Inefficient Memory Usage When Only One Task Runs at a Time
Consider a device with multiple operating modes, where only one mode is active at any given time. Defining separate tasks for each mode would unnecessarily reserve memory for all tasks, even though only one will run. While combining all modes into a single task and using conditionals can mitigate this (allowing the Rust compiler to optimize task size), this solution becomes unmanageable when dealing with more complex task combinations.
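To make the memory argument concrete, here is a minimal host-side sketch that uses only std. The mode functions and buffer sizes are hypothetical. It compares the bytes needed to store each mode's future separately (roughly what one static task per mode reserves) with the size of a single future that dispatches to one mode at a time.

```rust
use std::mem::size_of_val;

#[derive(Clone, Copy)]
enum Mode {
    Measure,
    Upload,
}

// Hypothetical mode: keeps a 256-byte buffer alive across an await point.
async fn measure_mode() -> usize {
    let buf = [1u8; 256];
    std::future::ready(()).await; // stand-in for waiting on a sensor
    buf.iter().map(|b| *b as usize).sum()
}

// Hypothetical mode: keeps a 512-byte buffer alive across an await point.
async fn upload_mode() -> usize {
    let buf = [1u8; 512];
    std::future::ready(()).await; // stand-in for waiting on the network
    buf.iter().map(|b| *b as usize).sum()
}

// A single task that runs exactly one mode; the two inner futures are never
// alive at the same time, so the compiler can overlap their storage.
async fn run_mode(mode: Mode) -> usize {
    match mode {
        Mode::Measure => measure_mode().await,
        Mode::Upload => upload_mode().await,
    }
}

// Bytes needed when every mode gets its own storage slot.
fn separate_bytes() -> usize {
    size_of_val(&measure_mode()) + size_of_val(&upload_mode())
}

// Bytes needed for the combined future, whichever mode it runs.
fn combined_bytes(mode: Mode) -> usize {
    size_of_val(&run_mode(mode))
}

fn main() {
    println!("separate slots: {} bytes", separate_bytes());
    for mode in [Mode::Measure, Mode::Upload] {
        println!("combined task:  {} bytes", combined_bytes(mode));
    }
}
```

The exact numbers depend on the compiler version, so none are quoted here. The point is only that the combined future is sized for the largest mode rather than for the sum of all modes.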
Downsides of Dynamic Task Allocation
While dynamic allocation introduces common issues such as heap fragmentation and unpredictable performance, the flexibility it provides could solve the problems I’m facing with static tasks.
Proposal & Implementation
- The task arena in Embassy already functions somewhat like a dynamic executor, albeit with (intentional?) limitations. Technically, adding support for fully dynamic task allocation shouldn’t be too difficult.
- I recognize that part of Embassy's design philosophy is the guarantee that "if it compiles, it works" and a dynamic executor might compromise that simplicity. I also understand that this feature might not align with Embassy's long-term goals.
- Regardless, I’m interested in exploring this feature for my project. Any guidance on where to start in the codebase or potential pitfalls would be greatly appreciated.
As far as I know, one of Embassy's design goals is to avoid dynamic memory allocation. In other words, no alloc crate.
Maybe add alloc features?
What I typically do is have the library expose a top-level generic 'task' fn, and then add an equivalent non-generic wrapper in the main application that carries the embassy task macro. In the tests I use join!() macros or whatever is appropriate (sometimes the tokio or futures-rs executors). Yes, there's some duplication, but in my experience the benefits outweigh the downsides for the embedded applications I've written so far.
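A rough sketch of that pattern, using only std so it runs on a PC. The `Sensor` trait, `BoardSensor`, and both function names are made up for illustration. In real firmware the wrapper would carry the Embassy task attribute and would not return a value; here it returns the sum so a host test can check it.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// ---- library crate: generic and executor-agnostic (hypothetical API) ----
pub trait Sensor {
    fn read(&mut self) -> u32;
}

pub async fn sample_task<S: Sensor>(mut sensor: S, samples: usize) -> u32 {
    let mut sum = 0;
    for _ in 0..samples {
        sum += sensor.read();
        std::future::ready(()).await; // stand-in for a timer or I/O wait
    }
    sum
}

// ---- application crate: concrete driver plus non-generic wrapper ----
pub struct BoardSensor(pub u32);

impl Sensor for BoardSensor {
    fn read(&mut self) -> u32 {
        self.0 += 1;
        self.0
    }
}

// Assumption: on hardware this is the function the task macro is applied to.
pub async fn board_sample_task(sensor: BoardSensor) -> u32 {
    sample_task(sensor, 3).await
}

// ---- host test support: a tiny busy-polling block_on ----
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

pub fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

fn main() {
    let total = block_on(board_sample_task(BoardSensor(0)));
    println!("sum of samples: {total}");
}
```

The duplication is limited to the thin wrapper; all logic stays in the generic function, which any executor can drive.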
I think the best way forward here is that you provide some proof of concept of this if you're interested to work on it. I imagine that some parts of embassy-executor can be re-used. Whether it's better to use alloc features or a separate embassy-alloc-executor crate for this, depends a bit on how it ends up.
@lulf thanks, I'll give a proof of concept a try
Related to the above.
I'm developing for the ESP32-S3 and am so short on internal memory that I can no longer extend my application, yet I have plenty of PSRAM available through alloc. The Embassy arena now takes close to 64K of internal RAM, which other libraries need because, for technical reasons, they cannot use PSRAM.
Is there a way to get the arena to reside, even as a single large chunk as it expects to, on an alloc memory block?
It also looks like the allocations are done one by one when the nightly feature is enabled, although I couldn't fully follow the code, so I might be wrong. Is it possible to hook into those allocations and have them use alloc?
I've completed a proof of concept in my personal fork.
The current PoC gives the executor the ability to spawn permanently running tasks on the heap. The reasons for limiting it to permanently running tasks are:
- It's actually very hard to deallocate an Embassy task. The executor expects all tasks to exist forever and takes advantage of that in its inner workings.
- Permanently running tasks already fix all 3 points I mentioned in the initial issue.
- By only spawning tasks at the beginning of the program (which I assume is the most common use case for a permanently running task), the effect of heap fragmentation and unpredictable performance can be minimized.
Example Code
https://github.com/PegasisForever/embassy-alloc-executor-example/blob/dc3d4e92bbb30b4d3b45ebac9f2f5fb4c9eb6f02/src/main.rs#L23-L49
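For readers who don't want to open the fork, the general idea of "heap-allocated tasks that are never freed" can be sketched with std alone. This is not the code from the PoC and not Embassy's API; `Executor`, `blink`, `Led`, and `CountingLed` are hypothetical names. Each future is boxed and leaked, which yields a `'static` pinned reference that the executor can keep forever.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// A task that lives on the heap for the rest of the program.
type StaticTask = Pin<&'static mut dyn Future<Output = ()>>;

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Future that returns Pending once, then Ready.
struct YieldNow(bool);
impl Future for YieldNow {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

struct Executor {
    tasks: Vec<StaticTask>,
}

impl Executor {
    fn new() -> Self {
        Self { tasks: Vec::new() }
    }

    // Generic spawn: the size of F is only needed here, at the call site.
    fn spawn<F: Future<Output = ()> + 'static>(&mut self, fut: F) {
        // Leak the box: the memory is intentionally never freed.
        let leaked: &'static mut F = Box::leak(Box::new(fut));
        let task: StaticTask = Pin::static_mut(leaked);
        self.tasks.push(task);
    }

    // Poll every task once; tasks here never complete, so re-polling is fine.
    fn poll_all(&mut self) {
        let waker = Waker::from(Arc::new(NoopWake));
        let mut cx = Context::from_waker(&waker);
        for task in self.tasks.iter_mut() {
            let _ = task.as_mut().poll(&mut cx);
        }
    }
}

// Library-style generic task over a hypothetical driver trait.
trait Led {
    fn toggle(&mut self);
}

async fn blink<L: Led>(mut led: L) {
    loop {
        led.toggle();
        YieldNow(false).await;
    }
}

struct CountingLed(Rc<Cell<u32>>);
impl Led for CountingLed {
    fn toggle(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

fn main() {
    let toggles = Rc::new(Cell::new(0));
    let mut executor = Executor::new();
    executor.spawn(blink(CountingLed(toggles.clone())));
    for _ in 0..3 {
        executor.poll_all();
    }
    println!("toggles: {}", toggles.get());
}
```

Because nothing is ever deallocated, spawning everything at startup keeps fragmentation out of the picture, which matches the reasoning in the list above.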
I was referred to
- https://github.com/card-io-ecg/card-io-fw/blob/main/embassy-alloc-taskpool/src/lib.rs
- https://github.com/card-io-ecg/card-io-fw/blob/main/macros/src/lib.rs
Which together provide some kind of replacement for embassy task and allocation of memory for tasks.
Haven't looked into it yet since I was able to free memory in other ways, but it seemed interesting in case anyone wants to explore.
I was about to link these as I didn't really have to touch those bits for more than a year at this point. There is no need to modify embassy-executor, and it's possible to mix statically and dynamically allocated tasks. Also I believe it's not necessary to require that tasks never return - they aren't freed, but the storage may be reused.
heh - you were the one to direct me there, what a small world (more like small github) 😄 I didn't notice you also wrote it.
This solution may not modify embassy-executor, but I think it duplicates quite a bit of code from it, so if those areas of embassy-executor change, it will stop working, no? It would be better if Embassy allowed hooking into these parts so that users only have to supply the allocator.
I'm using embassy-executor with nightly feature, so I don't need to specify the arena size.
Doesn't it allocate the memory for tasks dynamically in that case?
If it does, where does it get the memory from? From alloc?
If it doesn't, what mechanism does it use to reserve memory?
It's explained here: https://docs.embassy.dev/embassy-executor/git/cortex-m/index.html#task-arena
With nightly disabled, it allocates one single big static for the arena and allocates each task in it the first time that task is spawned. It doesn't use alloc; it uses its own internal arena implementation.
With nightly enabled, it allocates one static for each task, of the exact size the task needs.
Both are statics in the end; neither uses alloc.
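To illustrate what "allocates tasks in a static arena without alloc" can look like, here is a simplified bump-arena sketch. It is not Embassy's implementation: the `Arena` type and its methods are hypothetical, it hands out offsets instead of pointers, and it aligns offsets rather than addresses.

```rust
use std::alloc::Layout;

// Fixed-size buffer plus a bump offset; space is handed out front to back
// and never returned.
struct Arena<const N: usize> {
    buf: [u8; N],
    used: usize,
}

impl<const N: usize> Arena<N> {
    const fn new() -> Self {
        Self { buf: [0; N], used: 0 }
    }

    // Reserve space for `layout`; returns the start offset, or None if the
    // arena is too small (comparable to exhausting the task arena).
    fn alloc(&mut self, layout: Layout) -> Option<usize> {
        let align = layout.align(); // always a power of two
        let start = self.used.checked_add(align - 1)? & !(align - 1);
        let end = start.checked_add(layout.size())?;
        if end > N {
            return None;
        }
        self.used = end;
        Some(start)
    }

    // Access the bytes of a previously reserved slot.
    fn slot_mut(&mut self, start: usize, len: usize) -> &mut [u8] {
        &mut self.buf[start..start + len]
    }
}

fn main() {
    let mut arena: Arena<64> = Arena::new();
    let a = arena.alloc(Layout::new::<[u8; 10]>());
    let b = arena.alloc(Layout::new::<u32>());
    let c = arena.alloc(Layout::new::<[u8; 100]>());
    println!("a = {a:?}, b = {b:?}, c = {c:?}, used = {}", arena.used);
    if let Some(start) = a {
        arena.slot_mut(start, 10)[0] = 0xAA;
    }
}
```

With nightly enabled, the arena is not needed because each task gets its own static of exactly the right size, as described above.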
My understanding is,
The official design of Embassy seems to favor static allocation: there is a single global stack, and each async task has a future that holds its state and local variables. These futures behave like a per-task stack, but because an async fn cannot call itself recursively without boxing, the compiler can always determine the maximum size of the future at compile time.
As for generics, they do not affect the compiler’s ability to calculate the maximum future size, so they aren’t a major issue in this context.
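A small std-only sketch of this point; `process` is a hypothetical function. Every instantiation of a generic async fn has a size the compiler knows, but the size differs per type argument. As far as I understand, the difficulty with generic static tasks is that a `static` item cannot depend on a generic parameter, not that the size is unknown.

```rust
use std::mem::size_of_val;

// Generic async fn: the future stores `data` across the await point.
async fn process<T: AsRef<[u8]>>(data: T) -> usize {
    std::future::ready(()).await; // stand-in for real I/O
    data.as_ref().len()
}

// Size of the future instantiated with an 8-byte payload.
fn small_bytes() -> usize {
    size_of_val(&process([0u8; 8]))
}

// Size of the future instantiated with a 512-byte payload.
fn large_bytes() -> usize {
    size_of_val(&process([0u8; 512]))
}

fn main() {
    println!("process::<[u8; 8]>   future: {} bytes", small_bytes());
    println!("process::<[u8; 512]> future: {} bytes", large_bytes());
}
```

Both sizes are compile-time constants, yet a single fixed-size storage slot declared next to the generic function could not fit every possible `T`.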
This design means that production code will not have recursion. From an engineering perspective, recursion can often be replaced with loops, iteration, and other techniques. Think back to the DOS era with only 64K of memory—we managed things in a similar way.
This approach allows us to determine at compile time whether we have enough memory, which is crucial because stack overflows are notoriously hard to debug in production.
Overall, I think this trade-off makes sense. And because Embassy is a set of crates, engineers are free to implement other executors and allocators. Feel free to implement yours and share it. I guess Embassy will officially stick with static allocation.
Just my two cents. If my understanding is wrong, please feel free to let me know. Thanks!