bevy icon indicating copy to clipboard operation
bevy copied to clipboard

Optimize param validation through `get_param(...) -> Option<Out>`

Open MiniaczQ opened this issue 1 year ago • 6 comments

Objective

Fixes #15505 through a new solution which came up during development. It changes some of the original assumptions about checking all system params before running the systems, which isn't correct for combinator systems (they can invalidate their own sub-system params). Alternative to #15571

Solution

get_param returns an Option, which makes run_unsafe also return an option. validate_param now defaults to get_param().is_some() and we overwrite it for non-send and grouped params to behave correctly.

The original assumption about validating all params before running a system is no longer valid. Combinator system can invalidate it's own params in a case of: a.pipe(b).pipe(a) where b removes required params for a. Because of that, we will no longer validate all params of a system, but only those of the first one. This allows us to prevent spawning tasks if we cannot start running a system and reduced access overhead, because we only check a single subsystem's params immediately followed by getting them (albeit in another thread).

Testing

Ran parts of CI locally, Ran fallible_params example.

MiniaczQ avatar Oct 02 '24 23:10 MiniaczQ

The steps taken to get here look as following:

  1. We question whether 1.2 (from #15505) is possible to implement. While lifetimes and type erasure can be worked around, we can't fetch all system parameters ahead of running the system, because in case of compound systems (e.g. a.pipe(b)), the respective sub-systems can have mutually exclusive access. While this isn't technically UB, since the parameters will only be used in mutually exclusive contexts, this will lead to multiple im-/mutable references to the same data existing at once.
  2. To get to 1.2, we need proposition 1.1 anyways. It gives better internal ergonomics than current solution. Because it's only applied when running systems, we don't save on tasks in multithreaded executors nor do we retain the original behavior of running only entire compound systems in executors (e.g. a.pipe(b) could run a but stop at b since params were "checked" lazily).
  3. We can fix that by reintroducing validate_param, but default it's implementation as get_param().is_some(). We expect this implementation to be optimized by compiler. Because default implementation breaks for non-send resources, we overwrite it. We also overwrite it for compound resources (SystemParam tuples, ParamSet, DynSystemParam) to rely on their sub-params implementations.

To sum it up, without going the additional step into proposition 1.2 we simply cannot reduce double checks in executors.

An alternative to 1.2 idea could split "fetching" into 2 steps:

  • checking availability and creating an "access key" - validate_param() -> Option<Accessor>
  • using the "access key" to construct system params - get_param(Accessor)
  • to avoid additional boxing in executors, Acccessors can be stored inside the bottom-level Systems (exclusive/function) when fetch succeeds and removed after running the system.

MiniaczQ avatar Oct 03 '24 22:10 MiniaczQ

Marking it as ready for review since code is basically done, just needs benchmarks

MiniaczQ avatar Oct 04 '24 14:10 MiniaczQ

Note for @alice-i-cecile, since I've only made groundwork towards 1.2, but not implemented it, this doesn't have a significant impact on performance and we can delay it. I've had trouble finding all the broken code, but now that I've done it once it should go much quicker

MiniaczQ avatar Oct 04 '24 19:10 MiniaczQ

I found an edge case that asks for change in requirements: https://discord.com/channels/691052431525675048/749335865876021248/1292140022375514145 I'm re-drafting this until I implement it

MiniaczQ avatar Oct 05 '24 15:10 MiniaczQ

Core changes are done, just need to update documentation

MiniaczQ avatar Oct 12 '24 15:10 MiniaczQ

Before (main):

param/combinator_system/8_piped_systems
                        time:   [1.6549 µs 1.6689 µs 1.6852 µs]
                        change: [-13.250% -10.268% -7.3630%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

param/combinator_system/8_dyn_params_system
                        time:   [95.545 ns 95.841 ns 96.172 ns]
                        change: [-2.9593% -1.3542% -0.0548%] (p = 0.06 > 0.05)
                        No change in performance detected.
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe

param/combinator_system/8_variant_param_set_system
                        time:   [91.596 ns 91.962 ns 92.365 ns]
                        change: [+0.1790% +1.0297% +1.8957%] (p = 0.02 < 0.05)
                        Change within noise threshold.
Found 9 outliers among 100 measurements (9.00%)
  6 (6.00%) high mild
  3 (3.00%) high severe

After:

param/combinator_system/8_piped_systems
                        time:   [1.6590 µs 1.6863 µs 1.7149 µs]
                        change: [-3.5415% -1.5778% +0.5378%] (p = 0.15 > 0.05)
                        No change in performance detected.
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild

param/combinator_system/8_dyn_params_system
                        time:   [93.629 ns 94.065 ns 94.551 ns]
                        change: [-2.3199% -1.2717% -0.1379%] (p = 0.01 < 0.05)
                        Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) high mild
  2 (2.00%) high severe

param/combinator_system/8_variant_param_set_system
                        time:   [91.913 ns 92.361 ns 92.842 ns]
                        change: [+0.0434% +0.8115% +1.6509%] (p = 0.05 < 0.05)
                        Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe

There is a lot of noise in the measurements, but at the very least I don't think this is a regression

It'd be good if few more people measured, I use cargo bench -- param/ Here is the specific main commit I used acbed6040eeae99199657e44a9668772738ac344.

MiniaczQ avatar Oct 15 '24 13:10 MiniaczQ

@MiniaczQ, this looks like it should be reviewable right? There's a lot of merge conflicts now though 😅

alice-i-cecile avatar Feb 26 '25 00:02 alice-i-cecile

@MiniaczQ, this looks like it should be reviewable right? There's a lot of merge conflicts now though 😅

It used to be, all the ideas are here, it just affects such basic APIs that everything conflicts. I don't have time to resolve this right now, there is a good chance that starting from scratch will be faster, since most of the changes can be regexed

Now this also needs to mesh with fallible systems, so that's more work

MiniaczQ avatar Feb 26 '25 13:02 MiniaczQ

The changes look reasonable, but it seems like a rebase would change enough that I should wait before clicking Approve.

I don't quite follow the motivation. It seems like the original goal was to avoid a double lookup, but then you determined that wouldn't be possible. Is the current goal to remove duplication between Single::get_param and Single::validate_param and consolidate the calls to try_warn_param in one place? Oh, and to change the behavior of PipeSystem so that the second system's parameters are only validated after the first system runs?

The changes to SystemState::get seem unfortunate. That's a lot of unwrap()s! Is the reasoning that cases like SystemState::<Single>::get might now fail, and we need a way to express that?

Do we expect anyone use fallible parameters with SystemState? The advantage in regular systems is that it gives a way to declare a run condition as part of the system definition, but that doesn't apply to SystemState. I'd vote for leaving SystemState::get as panicking so that uses who don't use fallible parameters don't need to unwrap, and then exposing a try_get for users who do. (Or, if we really want to reduce accidental panics, we could add trait InfallibleSystemParam: SystemParam and bound get on it.)

chescock avatar Feb 26 '25 20:02 chescock

I don't quite follow the motivation. It seems like the original goal was to avoid a double lookup, but then you determined that wouldn't be possible. Is the current goal to remove duplication between Single::get_param and Single::validate_param and consolidate the calls to try_warn_param in one place? Oh, and to change the behavior of PipeSystem so that the second system's parameters are only validated after the first system runs?

The core issue is that we cannot validate params anywhere, but directly before running a system. In case of a.pipe(b) system a can invalidate bs params. With this restraint, double lookup is no longer an issue, since we don't check bs params until we run it.

The changes to SystemState::get seem unfortunate. That's a lot of unwrap()s! Is the reasoning that cases like SystemState::<Single>::get might now fail, and we need a way to express that?

It always could have failed, but we agreed a failure results in a panic. There aren't many unwraps in the core code, since failures are forwarded to System::run. It's mostly the test cases that suffer (fully justifiable) and very few inner API consumers, which unwrap under the contract that all the passes system arguments are available.

Do we expect anyone use fallible parameters with SystemState? The advantage in regular systems is that it gives a way to declare a run condition as part of the system definition, but that doesn't apply to SystemState. I'd vote for leaving SystemState::get as panicking so that uses who don't use fallible parameters don't need to unwrap, and then exposing a try_get for users who do. (Or, if we really want to reduce accidental panics, we could add trait InfallibleSystemParam: SystemParam and bound get on it.)

The exact API wasn't agreed upon, i guess this is a good split point for the rewrite, since it drastically reduces the API changes. If we want non-panicking default we can do a followup. As for SystemState, it's still an API that can fail and users should be able to handle that.

MiniaczQ avatar Feb 27 '25 09:02 MiniaczQ

The core issue is that we cannot validate params anywhere, but directly before running a system. In case of a.pipe(b) system a can invalidate bs params. With this restraint, double lookup is no longer an issue, since we don't check bs params until we run it.

Yup, that makes sense! I think I'm just asking to update the "Objective" section in the PR. It links to an issue that talks about getting rid of all of the double lookups, but this PR doesn't get rid of them for ordinary systems in the multi-threaded executor, so it's not immediately clear what the goal is here.

It always could have failed, but we agreed a failure results in a panic.

It depends on the parameter, though, right? Like, SystemState::<Single>::get can fail, but SystemState::<Query>::get will always return Some. If you know you have a Query, it would feel bad to have to add an extra unwrap()! And my impression is that most uses of SystemState are with infallible parameters like that.

The difference I see between this and things like Query::single and World::resource are that SystemParam fallibility is part of the type rather than an assertion about the runtime state of the program.

In the future, we might even be able to add a InfallibleSystemParam subtrait that returns the param without an Option, similar to how Ord is a subtrait of PartialOrd that returns an Ordering without an Option. We could have SystemState::try_get that works on any param and returns an Option, and SystemState::get that works only on infallible params and returns the actual param. If we do want to go down that path, then forcing users to add an unwrap() now and then remove it later would be extra churn.

I'm not suggesting that any of that should be in this PR, of course! But I think it might be worth making this PR smaller by moving the unwrap() into SystemParam::get so we only need to write it once. It's no worse than what we have now, and it doesn't require updating every caller.

chescock avatar Feb 28 '25 00:02 chescock

It depends on the parameter, though, right? Like, SystemState::<Single>::get can fail, but SystemState::<Query>::get will always return Some. If you know you have a Query, it would feel bad to have to add an extra unwrap()! And my impression is that most uses of SystemState are with infallible parameters like that.

We try to store param's ability to fail in type and design an API around that. Not sure if that's worth the effort and boilerplate though.

Since this PR I've read some discussion about fallibility and few new decisions include removing per-system param fail behavior. This should be a app-global behavior or a wrapper around individual system params. We can account for the latter case in w/e API this ends up being.

Also, since this is a pretty old PR with lots of outdated information and crap ton of merge conflicts, I'd rather have it closed whenever someone picks up the overall issue.

MiniaczQ avatar Feb 28 '25 18:02 MiniaczQ