Queries as entities
Objective
- Close https://github.com/bevyengine/bevy/pull/14668
Solution
- Add a QueryStateWrapper component and use an observer to update its state.
- Add an ArchetypeCreated event. It is emitted right after the archetype is created and before any entities are moved in.
- Only the QueryState from the Query system param is spawned as an entity. I chose this to minimize the changes: a lot of rendering code self-manages its QueryState. We still spawn duplicated QueryStates for now.
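To make the mechanism above concrete, here is a minimal, dependency-free sketch of the idea. It is a toy model, not Bevy's API: `ToyWorld`, `ToyQueryState`, `on_archetype_created`, and the `u32` component ids are hypothetical names chosen for illustration. It only shows the ordering described in the list: query states are stored in the world, and each one is notified when an archetype is created, before any entity is moved in.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for a component id.
type ComponentId = u32;

/// Toy archetype: an id plus the set of component ids it stores.
struct Archetype {
    id: usize,
    components: BTreeSet<ComponentId>,
}

/// Toy query state: required components and a cache of matching archetype ids.
struct ToyQueryState {
    required: BTreeSet<ComponentId>,
    matched_archetypes: Vec<usize>,
}

impl ToyQueryState {
    /// Plays the role of the observer: react to one newly created archetype.
    fn on_archetype_created(&mut self, archetype: &Archetype) {
        if self.required.is_subset(&archetype.components) {
            self.matched_archetypes.push(archetype.id);
        }
    }
}

#[derive(Default)]
struct ToyWorld {
    archetypes: Vec<Archetype>,
    /// Query states are owned by the world (spawned as entities in the PR),
    /// so the world can reach them when an archetype appears.
    query_states: Vec<ToyQueryState>,
}

impl ToyWorld {
    /// Registers a query and returns its index in `query_states`.
    fn register_query(&mut self, required: &[ComponentId]) -> usize {
        let mut state = ToyQueryState {
            required: required.iter().copied().collect(),
            matched_archetypes: Vec::new(),
        };
        // Catch up on archetypes that already exist.
        for archetype in &self.archetypes {
            state.on_archetype_created(archetype);
        }
        self.query_states.push(state);
        self.query_states.len() - 1
    }

    /// Creates an archetype and notifies every query state before the
    /// archetype is stored, i.e. before any entity could be moved in.
    fn create_archetype(&mut self, components: &[ComponentId]) -> usize {
        let archetype = Archetype {
            id: self.archetypes.len(),
            components: components.iter().copied().collect(),
        };
        for state in &mut self.query_states {
            state.on_archetype_created(&archetype);
        }
        self.archetypes.push(archetype);
        self.archetypes.len() - 1
    }
}
```

With this shape, a query state never has to poll the archetype list before iteration; it only reacts to creation notifications.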
Testing
- Existing ECS tests should pass.
Future work
- Maybe make QueryState private so we can easily dedup? How are we going to dedup?
- Clean up the lifetime of Query
- Maybe remove new_archetype? It currently does nothing.
Benchmark
Here is a benchmark run with cargo bench -p benches --bench ecs -- "iter|spawn|system"
group current main
----- ------- ----
busy_systems/01x_entities_03_systems 1.05 51.5±6.26µs ? ?/sec 1.00 48.9±6.08µs ? ?/sec
busy_systems/01x_entities_09_systems 1.03 128.3±9.84µs ? ?/sec 1.00 124.1±10.54µs ? ?/sec
busy_systems/01x_entities_15_systems 1.05 211.4±13.84µs ? ?/sec 1.00 200.7±10.48µs ? ?/sec
busy_systems/03x_entities_03_systems 1.14 88.2±6.99µs ? ?/sec 1.00 77.6±6.38µs ? ?/sec
busy_systems/03x_entities_09_systems 1.03 251.2±17.51µs ? ?/sec 1.00 244.3±10.24µs ? ?/sec
busy_systems/03x_entities_15_systems 1.07 418.7±35.82µs ? ?/sec 1.00 390.8±23.29µs ? ?/sec
busy_systems/05x_entities_03_systems 1.04 115.5±11.97µs ? ?/sec 1.00 110.6±9.84µs ? ?/sec
busy_systems/05x_entities_09_systems 1.10 353.6±22.53µs ? ?/sec 1.00 322.4±28.72µs ? ?/sec
busy_systems/05x_entities_15_systems 1.18 634.3±47.69µs ? ?/sec 1.00 537.2±20.44µs ? ?/sec
contrived/01x_entities_03_systems 1.00 28.7±2.24µs ? ?/sec 1.02 29.3±3.13µs ? ?/sec
contrived/01x_entities_09_systems 1.00 77.7±4.55µs ? ?/sec 1.01 78.8±3.87µs ? ?/sec
contrived/01x_entities_15_systems 1.00 124.8±4.73µs ? ?/sec 1.04 130.1±6.74µs ? ?/sec
contrived/03x_entities_03_systems 1.09 41.8±6.29µs ? ?/sec 1.00 38.5±2.23µs ? ?/sec
contrived/03x_entities_09_systems 1.10 125.7±6.35µs ? ?/sec 1.00 113.9±6.95µs ? ?/sec
contrived/03x_entities_15_systems 1.08 203.8±13.18µs ? ?/sec 1.00 188.5±9.71µs ? ?/sec
contrived/05x_entities_03_systems 1.13 56.4±7.48µs ? ?/sec 1.00 50.0±3.17µs ? ?/sec
contrived/05x_entities_09_systems 1.07 170.8±11.62µs ? ?/sec 1.00 159.6±15.62µs ? ?/sec
contrived/05x_entities_15_systems 1.04 282.5±13.63µs ? ?/sec 1.00 271.2±19.13µs ? ?/sec
despawn_world/10000_entities 1.00 564.7±24.23µs ? ?/sec 1.15 648.3±15.45µs ? ?/sec
despawn_world/100_entities 1.00 6.8±0.34µs ? ?/sec 1.19 8.1±0.34µs ? ?/sec
despawn_world/1_entities 1.08 329.4±25.59ns ? ?/sec 1.00 304.5±37.90ns ? ?/sec
despawn_world_recursive/10000_entities 1.00 2.9±0.07ms ? ?/sec 1.14 3.3±0.02ms ? ?/sec
despawn_world_recursive/100_entities 1.00 31.6±1.36µs ? ?/sec 1.13 35.6±0.73µs ? ?/sec
despawn_world_recursive/1_entities 1.01 770.9±29.13ns ? ?/sec 1.00 766.1±48.00ns ? ?/sec
empty_archetypes/iter/10 1.05 8.9±1.41µs ? ?/sec 1.00 8.5±1.01µs ? ?/sec
empty_archetypes/iter/100 1.04 8.3±0.72µs ? ?/sec 1.00 8.0±0.56µs ? ?/sec
empty_archetypes/iter/1000 1.05 8.7±0.65µs ? ?/sec 1.00 8.3±0.70µs ? ?/sec
empty_archetypes/iter/10000 1.00 14.3±1.43µs ? ?/sec 1.01 14.5±0.95µs ? ?/sec
empty_systems/0_systems 1.00 6.2±0.30ns ? ?/sec 1.21 7.5±0.53ns ? ?/sec
empty_systems/1000_systems 1.03 762.7±42.07µs ? ?/sec 1.00 740.1±31.10µs ? ?/sec
empty_systems/100_systems 1.00 78.9±3.93µs ? ?/sec 1.00 78.6±3.87µs ? ?/sec
empty_systems/10_systems 1.00 11.3±0.83µs ? ?/sec 1.00 11.3±0.85µs ? ?/sec
empty_systems/2_systems 1.09 9.5±1.51µs ? ?/sec 1.00 8.7±1.12µs ? ?/sec
empty_systems/4_systems 1.00 9.5±0.88µs ? ?/sec 1.00 9.5±1.09µs ? ?/sec
events_iter/size_16_events_100 1.00 97.9±4.30ns ? ?/sec 1.28 125.0±1.57ns ? ?/sec
events_iter/size_16_events_1000 1.00 978.8±7.51ns ? ?/sec 1.22 1189.7±18.72ns ? ?/sec
events_iter/size_16_events_10000 1.00 9.2±0.41µs ? ?/sec 1.29 11.8±0.23µs ? ?/sec
events_iter/size_4_events_100 1.00 104.7±0.94ns ? ?/sec 1.20 125.9±5.35ns ? ?/sec
events_iter/size_4_events_1000 1.00 985.7±10.82ns ? ?/sec 1.19 1176.7±18.38ns ? ?/sec
events_iter/size_4_events_10000 1.00 9.2±0.40µs ? ?/sec 1.29 11.9±0.22µs ? ?/sec
events_iter/size_512_events_100 1.30 104.3±0.89ns ? ?/sec 1.00 79.9±2.40ns ? ?/sec
events_iter/size_512_events_1000 1.32 983.6±7.92ns ? ?/sec 1.00 742.4±26.97ns ? ?/sec
events_iter/size_512_events_10000 1.31 9.8±0.10µs ? ?/sec 1.00 7.5±0.24µs ? ?/sec
iter_fragmented(4096)_empty/foreach_sparse 1.00 14.9±0.46µs ? ?/sec 1.16 17.3±0.24µs ? ?/sec
iter_fragmented(4096)_empty/foreach_table 1.00 4.5±0.07µs ? ?/sec 1.01 4.6±0.11µs ? ?/sec
iter_fragmented/base 1.00 570.9±8.72ns ? ?/sec 1.09 625.1±10.87ns ? ?/sec
iter_fragmented/foreach 1.00 216.5±17.43ns ? ?/sec 1.01 217.6±23.03ns ? ?/sec
iter_fragmented/foreach_wide 1.53 7.8±0.16µs ? ?/sec 1.00 5.1±0.15µs ? ?/sec
iter_fragmented/wide 1.05 6.8±0.13µs ? ?/sec 1.00 6.5±0.20µs ? ?/sec
iter_fragmented_sparse/base 1.05 9.9±0.19ns ? ?/sec 1.00 9.5±0.21ns ? ?/sec
iter_fragmented_sparse/foreach 1.10 10.6±0.15ns ? ?/sec 1.00 9.6±0.33ns ? ?/sec
iter_fragmented_sparse/foreach_wide 1.45 79.1±0.87ns ? ?/sec 1.00 54.6±3.64ns ? ?/sec
iter_fragmented_sparse/wide 1.03 75.0±1.62ns ? ?/sec 1.00 72.7±1.78ns ? ?/sec
iter_simple/base 1.33 16.0±0.51µs ? ?/sec 1.00 12.0±0.10µs ? ?/sec
iter_simple/foreach 1.00 11.4±0.09µs ? ?/sec 1.06 12.1±0.10µs ? ?/sec
iter_simple/foreach_hybrid 1.05 16.1±0.25µs ? ?/sec 1.00 15.3±0.23µs ? ?/sec
iter_simple/foreach_sparse_set 1.00 23.1±0.22µs ? ?/sec 1.00 23.1±0.49µs ? ?/sec
iter_simple/foreach_wide 1.34 62.0±0.60µs ? ?/sec 1.00 46.2±0.73µs ? ?/sec
iter_simple/foreach_wide_sparse_set 1.07 129.2±2.19µs ? ?/sec 1.00 120.9±1.86µs ? ?/sec
iter_simple/sparse_set 1.00 23.9±0.22µs ? ?/sec 1.12 26.7±0.38µs ? ?/sec
iter_simple/system 1.02 12.3±0.09µs ? ?/sec 1.00 12.1±0.10µs ? ?/sec
iter_simple/wide 1.02 57.5±0.65µs ? ?/sec 1.00 56.2±0.47µs ? ?/sec
iter_simple/wide_sparse_set 1.00 111.1±1.75µs ? ?/sec 1.12 125.0±1.14µs ? ?/sec
no_archetypes/system_count/0 1.02 12.7±0.27ns ? ?/sec 1.00 12.4±0.20ns ? ?/sec
no_archetypes/system_count/10 1.03 195.6±1.99ns ? ?/sec 1.00 189.1±4.38ns ? ?/sec
no_archetypes/system_count/100 1.06 1781.8±13.96ns ? ?/sec 1.00 1685.5±100.87ns ? ?/sec
par_iter_simple/hybrid 1.00 104.1±6.75µs ? ?/sec 1.01 104.7±6.22µs ? ?/sec
par_iter_simple/with_0_fragment 1.10 58.4±4.20µs ? ?/sec 1.00 52.9±6.37µs ? ?/sec
par_iter_simple/with_1000_fragment 1.09 70.1±5.76µs ? ?/sec 1.00 64.4±9.52µs ? ?/sec
par_iter_simple/with_100_fragment 1.10 62.3±3.66µs ? ?/sec 1.00 56.4±4.75µs ? ?/sec
par_iter_simple/with_10_fragment 1.09 60.3±2.71µs ? ?/sec 1.00 55.5±4.68µs ? ?/sec
param/combinator_system/8_dyn_params_system 1.10 21.2±5.16µs ? ?/sec 1.00 19.2±0.65µs ? ?/sec
param/combinator_system/8_piped_systems 1.09 9.5±1.53µs ? ?/sec 1.00 8.7±0.60µs ? ?/sec
param/combinator_system/8_variant_param_set_system 1.03 9.7±1.27µs ? ?/sec 1.00 9.4±0.73µs ? ?/sec
run_condition/no/1000_systems 1.18 58.9±0.42µs ? ?/sec 1.00 49.8±2.49µs ? ?/sec
run_condition/no/100_systems 1.12 4.1±0.03µs ? ?/sec 1.00 3.6±0.19µs ? ?/sec
run_condition/no/10_systems 1.00 495.6±2.45ns ? ?/sec 1.01 499.8±7.24ns ? ?/sec
run_condition/yes/1000_systems 1.01 739.2±32.26µs ? ?/sec 1.00 732.6±33.87µs ? ?/sec
run_condition/yes/100_systems 1.00 77.9±3.99µs ? ?/sec 1.01 78.5±4.89µs ? ?/sec
run_condition/yes/10_systems 1.00 10.9±0.60µs ? ?/sec 1.01 11.1±0.93µs ? ?/sec
run_condition/yes_using_query/1000_systems 1.01 767.4±28.37µs ? ?/sec 1.00 758.9±30.91µs ? ?/sec
run_condition/yes_using_query/100_systems 1.03 82.1±5.07µs ? ?/sec 1.00 79.4±4.53µs ? ?/sec
run_condition/yes_using_query/10_systems 1.01 11.4±0.99µs ? ?/sec 1.00 11.3±0.97µs ? ?/sec
run_condition/yes_using_resource/1000_systems 1.00 742.5±34.84µs ? ?/sec 1.00 743.2±36.76µs ? ?/sec
run_condition/yes_using_resource/100_systems 1.00 81.5±4.84µs ? ?/sec 1.00 81.4±4.79µs ? ?/sec
run_condition/yes_using_resource/10_systems 1.00 11.1±0.70µs ? ?/sec 1.02 11.3±0.97µs ? ?/sec
spawn_commands/10000_entities 1.02 1267.6±18.85µs ? ?/sec 1.00 1239.6±16.25µs ? ?/sec
spawn_commands/1000_entities 1.00 112.2±5.75µs ? ?/sec 1.09 122.8±1.98µs ? ?/sec
spawn_commands/100_entities 1.04 12.8±0.20µs ? ?/sec 1.00 12.3±0.18µs ? ?/sec
spawn_world/10000_entities 1.00 875.3±196.49µs ? ?/sec 1.06 931.7±219.48µs ? ?/sec
spawn_world/100_entities 1.04 9.0±2.24µs ? ?/sec 1.00 8.6±2.25µs ? ?/sec
spawn_world/1_entities 1.06 90.0±22.41ns ? ?/sec 1.00 84.5±20.96ns ? ?/sec
world_query_iter/50000_entities_sparse 1.00 66.7±0.66µs ? ?/sec 1.02 67.8±0.96µs ? ?/sec
world_query_iter/50000_entities_table 1.00 25.1±0.34µs ? ?/sec 1.00 25.1±0.26µs ? ?/sec
Can you add a test that shows that Query<&mut QueryState> is not causing UB, like via multiple mutable references to the same state? I don't really see a solution to this problem in these changes, though maybe I just missed it.
EDIT: For example, a system like this, while no other such query exists, should not panic, as otherwise it accesses its state twice: first by using the query and second by finding it via the query.
fn system(query: Query<EntityMut<'static>>) {
    for mut entity_mut in query {
        entity_mut
            .get_mut::<QueryState<EntityMut<'static>>>()
            .is_none_or(|_aliased_mut| unreachable!());
    }
}
Thanks for all the feedback. Marking this as a draft while I work on the fix. Also note the unsoundness raised by @urben1680: a query like Query<EntityMut> is a problem. We might need to add a default filter that excludes QueryState<EntityMut> for these types of queries, or somehow turn EntityMut into EntityMutExcept<QueryState<EntityMut>>.
I have updated this to use a wrapper type and immutable components. I will do a final cleanup and run benchmarks after https://github.com/bevyengine/bevy/pull/16885 lands. Thanks, everyone, for helping.
Regarding the request for a test showing that Query<&mut QueryState> does not cause UB via multiple mutable references to the same state (the Query<EntityMut<'static>> system above): making the component immutable solved this.
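As a rough illustration of why an immutable component sidesteps the aliasing concern, the sketch below models mutability as an associated type and offers `get_mut` only for components marked mutable. This is a toy under stated assumptions, not the PR's code: `ToyComponent`, `Slot`, `ToyQueryState`, and `Health` are hypothetical names, and the real engine's trait bounds may differ in detail.

```rust
/// Marker types describing whether a component may be borrowed mutably.
#[allow(dead_code)]
struct Mutable;
#[allow(dead_code)]
struct Immutable;

/// Hypothetical component trait carrying a mutability marker.
trait ToyComponent {
    type Mutability;
}

/// Stand-in for the wrapped query state; marked immutable.
struct ToyQueryState {
    matched_archetypes: Vec<usize>,
}
impl ToyComponent for ToyQueryState {
    type Mutability = Immutable;
}

/// An ordinary gameplay component; marked mutable.
struct Health(u32);
impl ToyComponent for Health {
    type Mutability = Mutable;
}

/// Toy storage slot for a single component value.
struct Slot<T: ToyComponent>(T);

impl<T: ToyComponent> Slot<T> {
    /// Shared access is available for every component.
    fn get(&self) -> &T {
        &self.0
    }
}

impl<T: ToyComponent<Mutability = Mutable>> Slot<T> {
    /// Exclusive access exists only for components marked `Mutable`.
    fn get_mut(&mut self) -> &mut T {
        &mut self.0
    }
}

// Calling `get_mut` on a `Slot<ToyQueryState>` does not compile in this
// model, so a second `&mut` to the query state cannot be obtained through
// the component API.
```

In this model the aliasing case from the example system is rejected at compile time rather than detected at run time.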
Lastly I would like to see a test where you do a bunch of ecs operations and keep count of the number of ArchetypeCreated events and compare that to the number of archetypes in Archetypes. That would also help greatly with brittleness.
I added a debug-only drop check for it.
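A debug-only drop check of this kind could look roughly like the following sketch. It is a self-contained toy, not the PR's implementation: `ToyArchetypes`, `get_or_create`, and `created_events` are hypothetical names. The counter is bumped wherever an archetype is actually created, and `Drop` compares it to the number of archetypes in debug builds only.

```rust
use std::collections::HashMap;

/// Toy archetype registry that counts "archetype created" notifications.
#[derive(Default)]
struct ToyArchetypes {
    /// Maps a normalized component-id list to its archetype id.
    by_components: HashMap<Vec<u32>, usize>,
    /// Number of creation events emitted so far.
    created_events: usize,
}

impl ToyArchetypes {
    /// Returns the archetype id for this component set, creating it if needed.
    fn get_or_create(&mut self, components: &[u32]) -> usize {
        // Normalize so that [1, 2] and [2, 1] name the same archetype.
        let mut key = components.to_vec();
        key.sort_unstable();
        key.dedup();
        if let Some(&id) = self.by_components.get(&key) {
            return id;
        }
        let id = self.by_components.len();
        self.by_components.insert(key, id);
        // The event is emitted exactly once per newly created archetype.
        self.created_events += 1;
        id
    }

    fn len(&self) -> usize {
        self.by_components.len()
    }
}

impl Drop for ToyArchetypes {
    fn drop(&mut self) {
        // Skip the check while unwinding to avoid a double panic.
        if !std::thread::panicking() {
            // Compiled out in release builds.
            debug_assert_eq!(
                self.created_events,
                self.by_components.len(),
                "every archetype must have emitted exactly one creation event"
            );
        }
    }
}
```

Because the check lives in `Drop`, every existing test that builds and tears down the registry exercises it without needing a dedicated assertion.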
I ran the benchmarks on my old laptop, which I don't trust, but the regression on foreach_wide seems real. These benchmarks use QueryState, not Query. QueryState is now slightly bigger and has a branch before iteration, but that shouldn't cause a 30% regression. Help would be appreciated.
Hmm, CI seems stuck or something? Can someone retrigger it? Edit: nvm
Triage: has merge conflicts
It looks very promising. Are there any updates on this PR?
This just needs a rebase and another benchmark run; there will probably be a regression. I won't have time to work on this in the next few weeks, though.
You should be able to remove all of the archetype stuff due to #19143.
I’ll have some free time over the next few weeks. Would you mind if I took over your PR?
Go ahead. I can probably review it.