bevy
bevy copied to clipboard
Fallible systems
Objective
Error handling in bevy is hard. See for reference https://github.com/bevyengine/bevy/issues/11562, https://github.com/bevyengine/bevy/issues/10874 and https://github.com/bevyengine/bevy/issues/12660. The goal of this PR is to make it better, by allowing users to optionally return Result from systems as outlined by Cart in https://github.com/bevyengine/bevy/issues/14275#issuecomment-2223708314.
Solution
This PR introduces a new ScheuleSystem type to represent systems that can be added to schedules. Instances of this type contain either an infallible BoxedSystem<(), ()> or a fallible BoxedSystem<(), Result>. ScheuleSystem implements System<In = (), Out = Result> and replaces all uses of BoxedSystem in schedules. The async executor now receives a result after executing a system, which for infallible systems is always Ok(()). Currently it ignores this result, but more useful error handling could also be implemented.
Aliases for Error and Result have been added to the bevy_ecs prelude, as well as const OK which new users may find more friendly than Ok(()).
Testing
- I have done next to no testing.
Showcase
The following minimal example prints "hello world" once, then completes.
use bevy::prelude::*;
fn main() {
App::new().add_systems(Update, hello_world_system).run();
}
fn hello_world_system() -> Result {
println!("hello world");
Err("string")?;
println!("goodbye world");
OK
}
Migration Guide
This change should be pretty much non-breaking, except for users who have implemented their own custom executors. Those users should use ScheduleSystem in place of BoxedSystem<(), ()> and import the System trait where needed. They can choose to do whatever they wish with the result.
Current Work
- [x] Fix tests & doc comments
- [ ] Write more tests
- [x] Add examples
- [X] Draft release notes
Draft Release Notes
As of this release, systems can now return results.
First a bit of background: Bevy has hisotrically expected systems to return the empty type (). While this makes sense in the context of the ecs, it's at odds with how error handling is typically done in rust: returning Result::Error to indicate failure, and using the short-circuiting ? operator to propagate that error up the call stack to where it can be properly handled. Users of functional languages will tell you this is called "monadic error handling".
Not being able to return Results from systems left bevy users with a quandry. They could add custom error handling logic to every system, or manually pipe every system into an error handler, or perhaps sidestep the issue with some combination of fallible assignents, logging, macros, and early returns. Often, users would just litter their systems with unwraps and possible panics.
While any one of these approaches might be fine for a particular user, each of them has their own drawbacks, and none makes good use of the language. Serious issues could also arrise when two different crates used by the same project made different choices about error handling.
Now, by returning results, systems can defer error handling to the application itself. It looks like this:
// Previous, handling internally
app.add_systems(my_system)
fn my_system(window: Query<&Window>) {
let Ok(window) = query.get_single() else {
return;
};
// ... do something to the window here
}
// Previous, handling externally
app.add_systems(my_system.pipe(my_error_handler))
fn my_system(window: Query<&Window>) -> Result<(), impl Error> {
let window = query.get_single()?;
// ... do something to the window here
Ok(())
}
// Previous, panicking
app.add_systems(my_system)
fn my_system(window: Query<&Window>) {
let window = query.single();
// ... do something to the window here
}
// Now
app.add_systems(my_system)
fn my_system(window: Query<&Window>) -> Result {
let window = query.get_single()?;
// ... do something to the window here
Ok(())
}
There are currently some limitations. Systems must either return () or Result<(), Box<dyn Error + Send + Sync + 'static>>, with no in-between. Results are also ignored by default, and though implementing a custom handler is possible, it involves writing your own custom ecs executor (which is not recomended).
Systems should return errors when they cannot perform their normal behavior. In turn, errors returned to the executor while running the schedule will (eventually) be treated as unexpected. Users and library authors should prefer to return errors for anything that disrupts the normal expected behavior of a system, and should only handle expected cases internally.
We have big plans for improving error handling further:
- Allowing users to change the error handling logic of the default executors.
- Adding source tracking and optional backtraces to errors.
- Possibly adding tracing-levels (Error/Warn/Info/Debug/Trace) to errors.
- Generally making the default error logging more helpful and inteligent.
- Adding monadic system combininators for fallible systems.
- Possibly removing all panicking variants from our api.
Even without the actual error handling benefits this provides, just having a more blessed way to use ? in systems will be really nice. I know we can just pipe the Result with current systems, but this will hide that bit of extra boilerplate. This also pairs nicely with making more APIs return Result instead of Option, and also makes panicking variants less important (possibly even removable TBH).
It was mentioned in Discord, but I'll include it here for posterity: with fallible systems getting first-class treatment, there may be room to consider removing the panicking variants of certain functions (e.g., Query::get_entity and Query::entity), since the choice of behaviour could be controlled by a system error handler. This would be a large DX win, since the "proper" methods would get the shorter names, and it'd reduce the API surface area.
there may be room to consider removing the panicking variants of certain functions
That's in line with the third point Cart proposed in https://github.com/bevyengine/bevy/issues/14275#issuecomment-2223708314. He indicated then that it was important to land all the related changes in a single release cycle, and I agree. This PR provides his (1), what (2) and (3) look like is up to @alice-i-cecile and the other designated ecs experts.
as well as const
OKwhich new users may find more friendly thanOk(()).
Why? This just makes the code more jarring compared to the rest of the Rust ecosystem, and more cognitive load to switch between returning nothing vs. returning some value.
It'd make sense if it's something useful like https://docs.rs/anyhow/latest/anyhow/fn.Ok.html
This just makes the code more jarring compared to the rest of the Rust ecosystem
In this I am trying to defer to my understanding of Cart's preferences. He uses a const in the linked issue, and I believe has expressed that Ok(()) is sort of confusing and cumbersome. No strong preference here from me really.
It'd make sense if it's something useful like https://docs.rs/anyhow/latest/anyhow/fn.Ok.html
Functionally this is identical to anyhow's Ok function. The difference is since in Bevy a ScheduleSystem can only output an (), the function can instead just be a constant.
It's a problem with Rust's syntax at the end of the day. With the addition of the Try trait, perhaps we'll get an automatic wrapping as a bit of syntactic sugar, but that's up to the language team at the end of the day.
A previous version of this feature (https://github.com/bevyengine/bevy/pull/8705) was closed due to incoherent stack traces. This issue was also brought up in discord. It seems similar to https://github.com/bevyengine/bevy/issues/8638 with logs from systems not including the system name properly.
I think these issues can be resolved (just as they were in anyhow) by defining a new error type and hiding instrumentation/stack-tracing/call-tracking in the From implementation.
This isn't an issue for this pr, since we don't do anything with the returned results; it's just control flow. But it would be good to tackle in follow up.
@NthTensor let me know when you're done the outstanding tasks and take this out of draft. I'll do a final pass and then merge + create follow-up issues for this.
as well as const OK which new users may find more friendly than Ok(()).
Can you please cut this from this PR? I kinda like it, but it's pointlessly controversial here :)
@alice-i-cecile requesting removal of the M-Needs-Release-Note label (added in pr description because we don't have a good place for it yet).
@alice-i-cecile requesting removal of the M-Needs-Release-Note label (added in pr description because we don't have a good place for it yet).
Just like migration guides, we should keep this label around even after they're written for searchability and tooling. We might wan to rename that to be more clear though 🤔
@alice-i-cecile requesting removal of the M-Needs-Release-Note label (added in pr description because we don't have a good place for it yet).
Just like migration guides, we should keep this label around even after they're written for searchability and tooling. We might wan to rename that to be more clear though 🤔
M-#[require(ReleaseNote)]
Alright, I added the most basic tests and examples in the world. There will be more to do there when handlers are hooked up. Ready for review.
Might be a bit late on this discussion, but this particular implementation feels a bit intrusive to me. Can we instead implement IntoSystemConfigs on systems that returns Result so that those systems can be added to the schedule like a normal system?
Thank you to everyone involved with the authoring or reviewing of this PR! This work is relatively important and needs release notes! Head over to https://github.com/bevyengine/bevy-website/issues/1967 if you'd like to help out.