
Queries as entities

Open notmd opened this issue 8 months ago • 11 comments

Objective

  • Close https://github.com/bevyengine/bevy/pull/14668

Solution

  • Add QueryStateWrapper as a component and use an observer to update its state.
  • Add ArchetypeCreated event. This will be emitted right after the archetype is created and before any entities are moved in.
  • Only the QueryState from the Query system param is spawned as an entity. I chose this to minimize the changes: a lot of rendering code manages its own QueryState. We still spawn duplicate QueryStates for now.
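
The bullets above can be pictured with a toy model that has no Bevy dependency. This is only a sketch under stated assumptions: `ToyWorld`, `ToyQueryState`, `spawn_query_state`, and `create_archetype` are hypothetical names, and the PR itself uses a Bevy observer and the ArchetypeCreated event rather than a plain method call.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for a component id.
type ComponentId = u32;

/// Toy archetype: just the set of components its entities have.
struct ToyArchetype {
    components: BTreeSet<ComponentId>,
}

/// Toy cached query state: the components it requires and the
/// archetypes it has matched so far.
struct ToyQueryState {
    required: BTreeSet<ComponentId>,
    matched_archetypes: Vec<usize>,
}

/// Toy world: query states are stored in the world (as entities would
/// be), not inside each system.
struct ToyWorld {
    archetypes: Vec<ToyArchetype>,
    query_states: Vec<ToyQueryState>,
}

impl ToyWorld {
    fn new() -> Self {
        ToyWorld { archetypes: Vec::new(), query_states: Vec::new() }
    }

    /// Stores a query state and returns its index (the toy "entity id").
    fn spawn_query_state(&mut self, required: &[ComponentId]) -> usize {
        let required: BTreeSet<ComponentId> = required.iter().copied().collect();
        // Match against archetypes that already exist.
        let matched_archetypes: Vec<usize> = self
            .archetypes
            .iter()
            .enumerate()
            .filter(|(_, a)| required.is_subset(&a.components))
            .map(|(i, _)| i)
            .collect();
        self.query_states.push(ToyQueryState { required, matched_archetypes });
        self.query_states.len() - 1
    }

    /// Creates an archetype and notifies every stored query state before
    /// any entity could be moved in, playing the role of the event.
    fn create_archetype(&mut self, components: &[ComponentId]) -> usize {
        let components: BTreeSet<ComponentId> = components.iter().copied().collect();
        let id = self.archetypes.len();
        for state in &mut self.query_states {
            if state.required.is_subset(&components) {
                state.matched_archetypes.push(id);
            }
        }
        self.archetypes.push(ToyArchetype { components });
        id
    }
}
```

In this model a system never has to rescan archetypes itself: creating an archetype pushes the new id into every matching state.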

Testing

  • Existing ECS tests should pass.

Future work

  • Maybe make QueryState private so we can easily dedup? How would we dedup?
  • Clean up the lifetimes of Query
  • Maybe remove new_archetype? It's currently doing nothing.
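
One possible reading of the dedup question is a cache keyed by the query's type, so that two systems asking for the same query share one state entity. The sketch below is an assumption about how that could look, not something this PR implements; `QueryCache`, `get_or_spawn`, and the `u32` entity ids are hypothetical.

```rust
use std::any::TypeId;
use std::collections::HashMap;

/// Hypothetical cache mapping a query's type to the entity that holds
/// its state.
#[derive(Default)]
struct QueryCache {
    by_type: HashMap<TypeId, u32>,
    next_entity: u32,
}

impl QueryCache {
    /// Returns the entity holding the state for query type `Q`,
    /// allocating a new id only the first time `Q` is requested.
    fn get_or_spawn<Q: 'static>(&mut self) -> u32 {
        let next = &mut self.next_entity;
        *self.by_type.entry(TypeId::of::<Q>()).or_insert_with(|| {
            let id = *next;
            *next += 1;
            id
        })
    }
}
```

Keying on the type alone would not distinguish dynamically built queries, which is one reason the question is left open above.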

Benchmark

Here is a benchmark run with cargo bench -p benches --bench ecs -- "iter|spawn|system":

 group                                                 current                                main
-----                                                 -------                                ----
busy_systems/01x_entities_03_systems                  1.05     51.5±6.26µs        ? ?/sec    1.00     48.9±6.08µs        ? ?/sec
busy_systems/01x_entities_09_systems                  1.03    128.3±9.84µs        ? ?/sec    1.00   124.1±10.54µs        ? ?/sec
busy_systems/01x_entities_15_systems                  1.05   211.4±13.84µs        ? ?/sec    1.00   200.7±10.48µs        ? ?/sec
busy_systems/03x_entities_03_systems                  1.14     88.2±6.99µs        ? ?/sec    1.00     77.6±6.38µs        ? ?/sec
busy_systems/03x_entities_09_systems                  1.03   251.2±17.51µs        ? ?/sec    1.00   244.3±10.24µs        ? ?/sec
busy_systems/03x_entities_15_systems                  1.07   418.7±35.82µs        ? ?/sec    1.00   390.8±23.29µs        ? ?/sec
busy_systems/05x_entities_03_systems                  1.04   115.5±11.97µs        ? ?/sec    1.00    110.6±9.84µs        ? ?/sec
busy_systems/05x_entities_09_systems                  1.10   353.6±22.53µs        ? ?/sec    1.00   322.4±28.72µs        ? ?/sec
busy_systems/05x_entities_15_systems                  1.18   634.3±47.69µs        ? ?/sec    1.00   537.2±20.44µs        ? ?/sec
contrived/01x_entities_03_systems                     1.00     28.7±2.24µs        ? ?/sec    1.02     29.3±3.13µs        ? ?/sec
contrived/01x_entities_09_systems                     1.00     77.7±4.55µs        ? ?/sec    1.01     78.8±3.87µs        ? ?/sec
contrived/01x_entities_15_systems                     1.00    124.8±4.73µs        ? ?/sec    1.04    130.1±6.74µs        ? ?/sec
contrived/03x_entities_03_systems                     1.09     41.8±6.29µs        ? ?/sec    1.00     38.5±2.23µs        ? ?/sec
contrived/03x_entities_09_systems                     1.10    125.7±6.35µs        ? ?/sec    1.00    113.9±6.95µs        ? ?/sec
contrived/03x_entities_15_systems                     1.08   203.8±13.18µs        ? ?/sec    1.00    188.5±9.71µs        ? ?/sec
contrived/05x_entities_03_systems                     1.13     56.4±7.48µs        ? ?/sec    1.00     50.0±3.17µs        ? ?/sec
contrived/05x_entities_09_systems                     1.07   170.8±11.62µs        ? ?/sec    1.00   159.6±15.62µs        ? ?/sec
contrived/05x_entities_15_systems                     1.04   282.5±13.63µs        ? ?/sec    1.00   271.2±19.13µs        ? ?/sec
despawn_world/10000_entities                          1.00   564.7±24.23µs        ? ?/sec    1.15   648.3±15.45µs        ? ?/sec
despawn_world/100_entities                            1.00      6.8±0.34µs        ? ?/sec    1.19      8.1±0.34µs        ? ?/sec
despawn_world/1_entities                              1.08   329.4±25.59ns        ? ?/sec    1.00   304.5±37.90ns        ? ?/sec
despawn_world_recursive/10000_entities                1.00      2.9±0.07ms        ? ?/sec    1.14      3.3±0.02ms        ? ?/sec
despawn_world_recursive/100_entities                  1.00     31.6±1.36µs        ? ?/sec    1.13     35.6±0.73µs        ? ?/sec
despawn_world_recursive/1_entities                    1.01   770.9±29.13ns        ? ?/sec    1.00   766.1±48.00ns        ? ?/sec
empty_archetypes/iter/10                              1.05      8.9±1.41µs        ? ?/sec    1.00      8.5±1.01µs        ? ?/sec
empty_archetypes/iter/100                             1.04      8.3±0.72µs        ? ?/sec    1.00      8.0±0.56µs        ? ?/sec
empty_archetypes/iter/1000                            1.05      8.7±0.65µs        ? ?/sec    1.00      8.3±0.70µs        ? ?/sec
empty_archetypes/iter/10000                           1.00     14.3±1.43µs        ? ?/sec    1.01     14.5±0.95µs        ? ?/sec
empty_systems/0_systems                               1.00      6.2±0.30ns        ? ?/sec    1.21      7.5±0.53ns        ? ?/sec
empty_systems/1000_systems                            1.03   762.7±42.07µs        ? ?/sec    1.00   740.1±31.10µs        ? ?/sec
empty_systems/100_systems                             1.00     78.9±3.93µs        ? ?/sec    1.00     78.6±3.87µs        ? ?/sec
empty_systems/10_systems                              1.00     11.3±0.83µs        ? ?/sec    1.00     11.3±0.85µs        ? ?/sec
empty_systems/2_systems                               1.09      9.5±1.51µs        ? ?/sec    1.00      8.7±1.12µs        ? ?/sec
empty_systems/4_systems                               1.00      9.5±0.88µs        ? ?/sec    1.00      9.5±1.09µs        ? ?/sec
events_iter/size_16_events_100                        1.00     97.9±4.30ns        ? ?/sec    1.28    125.0±1.57ns        ? ?/sec
events_iter/size_16_events_1000                       1.00    978.8±7.51ns        ? ?/sec    1.22  1189.7±18.72ns        ? ?/sec
events_iter/size_16_events_10000                      1.00      9.2±0.41µs        ? ?/sec    1.29     11.8±0.23µs        ? ?/sec
events_iter/size_4_events_100                         1.00    104.7±0.94ns        ? ?/sec    1.20    125.9±5.35ns        ? ?/sec
events_iter/size_4_events_1000                        1.00   985.7±10.82ns        ? ?/sec    1.19  1176.7±18.38ns        ? ?/sec
events_iter/size_4_events_10000                       1.00      9.2±0.40µs        ? ?/sec    1.29     11.9±0.22µs        ? ?/sec
events_iter/size_512_events_100                       1.30    104.3±0.89ns        ? ?/sec    1.00     79.9±2.40ns        ? ?/sec
events_iter/size_512_events_1000                      1.32    983.6±7.92ns        ? ?/sec    1.00   742.4±26.97ns        ? ?/sec
events_iter/size_512_events_10000                     1.31      9.8±0.10µs        ? ?/sec    1.00      7.5±0.24µs        ? ?/sec
iter_fragmented(4096)_empty/foreach_sparse            1.00     14.9±0.46µs        ? ?/sec    1.16     17.3±0.24µs        ? ?/sec
iter_fragmented(4096)_empty/foreach_table             1.00      4.5±0.07µs        ? ?/sec    1.01      4.6±0.11µs        ? ?/sec
iter_fragmented/base                                  1.00    570.9±8.72ns        ? ?/sec    1.09   625.1±10.87ns        ? ?/sec
iter_fragmented/foreach                               1.00   216.5±17.43ns        ? ?/sec    1.01   217.6±23.03ns        ? ?/sec
iter_fragmented/foreach_wide                          1.53      7.8±0.16µs        ? ?/sec    1.00      5.1±0.15µs        ? ?/sec
iter_fragmented/wide                                  1.05      6.8±0.13µs        ? ?/sec    1.00      6.5±0.20µs        ? ?/sec
iter_fragmented_sparse/base                           1.05      9.9±0.19ns        ? ?/sec    1.00      9.5±0.21ns        ? ?/sec
iter_fragmented_sparse/foreach                        1.10     10.6±0.15ns        ? ?/sec    1.00      9.6±0.33ns        ? ?/sec
iter_fragmented_sparse/foreach_wide                   1.45     79.1±0.87ns        ? ?/sec    1.00     54.6±3.64ns        ? ?/sec
iter_fragmented_sparse/wide                           1.03     75.0±1.62ns        ? ?/sec    1.00     72.7±1.78ns        ? ?/sec
iter_simple/base                                      1.33     16.0±0.51µs        ? ?/sec    1.00     12.0±0.10µs        ? ?/sec
iter_simple/foreach                                   1.00     11.4±0.09µs        ? ?/sec    1.06     12.1±0.10µs        ? ?/sec
iter_simple/foreach_hybrid                            1.05     16.1±0.25µs        ? ?/sec    1.00     15.3±0.23µs        ? ?/sec
iter_simple/foreach_sparse_set                        1.00     23.1±0.22µs        ? ?/sec    1.00     23.1±0.49µs        ? ?/sec
iter_simple/foreach_wide                              1.34     62.0±0.60µs        ? ?/sec    1.00     46.2±0.73µs        ? ?/sec
iter_simple/foreach_wide_sparse_set                   1.07    129.2±2.19µs        ? ?/sec    1.00    120.9±1.86µs        ? ?/sec
iter_simple/sparse_set                                1.00     23.9±0.22µs        ? ?/sec    1.12     26.7±0.38µs        ? ?/sec
iter_simple/system                                    1.02     12.3±0.09µs        ? ?/sec    1.00     12.1±0.10µs        ? ?/sec
iter_simple/wide                                      1.02     57.5±0.65µs        ? ?/sec    1.00     56.2±0.47µs        ? ?/sec
iter_simple/wide_sparse_set                           1.00    111.1±1.75µs        ? ?/sec    1.12    125.0±1.14µs        ? ?/sec
no_archetypes/system_count/0                          1.02     12.7±0.27ns        ? ?/sec    1.00     12.4±0.20ns        ? ?/sec
no_archetypes/system_count/10                         1.03    195.6±1.99ns        ? ?/sec    1.00    189.1±4.38ns        ? ?/sec
no_archetypes/system_count/100                        1.06  1781.8±13.96ns        ? ?/sec    1.00  1685.5±100.87ns        ? ?/sec
par_iter_simple/hybrid                                1.00    104.1±6.75µs        ? ?/sec    1.01    104.7±6.22µs        ? ?/sec
par_iter_simple/with_0_fragment                       1.10     58.4±4.20µs        ? ?/sec    1.00     52.9±6.37µs        ? ?/sec
par_iter_simple/with_1000_fragment                    1.09     70.1±5.76µs        ? ?/sec    1.00     64.4±9.52µs        ? ?/sec
par_iter_simple/with_100_fragment                     1.10     62.3±3.66µs        ? ?/sec    1.00     56.4±4.75µs        ? ?/sec
par_iter_simple/with_10_fragment                      1.09     60.3±2.71µs        ? ?/sec    1.00     55.5±4.68µs        ? ?/sec
param/combinator_system/8_dyn_params_system           1.10     21.2±5.16µs        ? ?/sec    1.00     19.2±0.65µs        ? ?/sec
param/combinator_system/8_piped_systems               1.09      9.5±1.53µs        ? ?/sec    1.00      8.7±0.60µs        ? ?/sec
param/combinator_system/8_variant_param_set_system    1.03      9.7±1.27µs        ? ?/sec    1.00      9.4±0.73µs        ? ?/sec
run_condition/no/1000_systems                         1.18     58.9±0.42µs        ? ?/sec    1.00     49.8±2.49µs        ? ?/sec
run_condition/no/100_systems                          1.12      4.1±0.03µs        ? ?/sec    1.00      3.6±0.19µs        ? ?/sec
run_condition/no/10_systems                           1.00    495.6±2.45ns        ? ?/sec    1.01    499.8±7.24ns        ? ?/sec
run_condition/yes/1000_systems                        1.01   739.2±32.26µs        ? ?/sec    1.00   732.6±33.87µs        ? ?/sec
run_condition/yes/100_systems                         1.00     77.9±3.99µs        ? ?/sec    1.01     78.5±4.89µs        ? ?/sec
run_condition/yes/10_systems                          1.00     10.9±0.60µs        ? ?/sec    1.01     11.1±0.93µs        ? ?/sec
run_condition/yes_using_query/1000_systems            1.01   767.4±28.37µs        ? ?/sec    1.00   758.9±30.91µs        ? ?/sec
run_condition/yes_using_query/100_systems             1.03     82.1±5.07µs        ? ?/sec    1.00     79.4±4.53µs        ? ?/sec
run_condition/yes_using_query/10_systems              1.01     11.4±0.99µs        ? ?/sec    1.00     11.3±0.97µs        ? ?/sec
run_condition/yes_using_resource/1000_systems         1.00   742.5±34.84µs        ? ?/sec    1.00   743.2±36.76µs        ? ?/sec
run_condition/yes_using_resource/100_systems          1.00     81.5±4.84µs        ? ?/sec    1.00     81.4±4.79µs        ? ?/sec
run_condition/yes_using_resource/10_systems           1.00     11.1±0.70µs        ? ?/sec    1.02     11.3±0.97µs        ? ?/sec
spawn_commands/10000_entities                         1.02  1267.6±18.85µs        ? ?/sec    1.00  1239.6±16.25µs        ? ?/sec
spawn_commands/1000_entities                          1.00    112.2±5.75µs        ? ?/sec    1.09    122.8±1.98µs        ? ?/sec
spawn_commands/100_entities                           1.04     12.8±0.20µs        ? ?/sec    1.00     12.3±0.18µs        ? ?/sec
spawn_world/10000_entities                            1.00  875.3±196.49µs        ? ?/sec    1.06  931.7±219.48µs        ? ?/sec
spawn_world/100_entities                              1.04      9.0±2.24µs        ? ?/sec    1.00      8.6±2.25µs        ? ?/sec
spawn_world/1_entities                                1.06    90.0±22.41ns        ? ?/sec    1.00    84.5±20.96ns        ? ?/sec
world_query_iter/50000_entities_sparse                1.00     66.7±0.66µs        ? ?/sec    1.02     67.8±0.96µs        ? ?/sec
world_query_iter/50000_entities_table                 1.00     25.1±0.34µs        ? ?/sec    1.00     25.1±0.26µs        ? ?/sec
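
For reading the table: the first number in each column appears to be the mean time relative to the faster of the two runs, so 1.00 marks the faster side. A small check of that reading, using only mean times printed above (the helper name `relative` is hypothetical, and because the printed means are themselves rounded, some rows may be off by 0.01):

```rust
/// Ratio of each mean time to the faster of the two, rounded to two
/// decimals. Returns (current, main).
fn relative(current: f64, main: f64) -> (f64, f64) {
    let fastest = current.min(main);
    let round2 = |x: f64| (x * 100.0).round() / 100.0;
    (round2(current / fastest), round2(main / fastest))
}
```

For iter_fragmented/foreach_wide, `relative(7.8, 5.1)` gives 1.53 for current, and for iter_simple/foreach_wide, `relative(62.0, 46.2)` gives 1.34, matching the printed rows.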

notmd avatar Apr 16 '25 18:04 notmd

Can you add a test that shows that Query<&mut QueryState> is not causing UB, like via multiple mutable references to the same state? I don't really see a solution to this problem in these changes, though maybe I just missed it.

EDIT: For example, a system like this, with no other such query in the world, should not panic; if it does, the system is accessing its own state twice: first by using the query and second by finding the state through the query.

fn system(query: Query<EntityMut<'static>>) {
    for mut entity_mut in query {
        entity_mut
            .get_mut::<QueryState<EntityMut<'static>>>()
            .is_none_or(|_aliased_mut| unreachable!());
    }
}
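
The concern can be illustrated outside Bevy with a toy model in which `RefCell` stands in for the exclusivity that the ECS would otherwise have to guarantee. This is only a sketch under that assumption; `ToyWorld`, `state_slot`, and `second_access_refused` are hypothetical names.

```rust
use std::cell::RefCell;

/// Toy world with a single component slot that happens to hold the
/// query's own cached state.
struct ToyWorld {
    state_slot: RefCell<Vec<u32>>,
}

/// Models the system above: the query borrows its state mutably in order
/// to iterate, then the loop body tries to reach the same state through
/// the entity. Returns true if the second mutable access was refused.
fn second_access_refused(world: &ToyWorld) -> bool {
    // First access: the query itself is using its state.
    let _query_state = world.state_slot.borrow_mut();
    // Second access: what `get_mut::<QueryState<..>>()` would amount to.
    world.state_slot.try_borrow_mut().is_err()
}
```

In the toy the conflict is caught at run time. Without such a check, two live mutable references to the same state would be undefined behavior in Rust.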

urben1680 avatar Apr 16 '25 19:04 urben1680

Thanks for all the feedback. Marking this as a draft while I work on the fix. Also note the unsoundness raised by @urben1680: a query like Query<EntityMut> is a problem. We might need to add a default filter to exclude QueryState<EntityMut> for these types of queries, or somehow turn EntityMut into EntityMutExcept<QueryState<EntityMut>>.

notmd avatar Apr 17 '25 13:04 notmd

I have updated this to use a wrapper type and immutable components. I will do a final cleanup and run the benchmarks after https://github.com/bevyengine/bevy/pull/16885 lands. Thanks, everyone, for helping.

Can you add a test that shows that Query<&mut QueryState> is not causing UB, like via multiple mutable references to the same state? I don't really see a solution to this problem in these changes, though maybe I just missed it.

EDIT: For example, a system like this, with no other such query in the world, should not panic; if it does, the system is accessing its own state twice: first by using the query and second by finding the state through the query.

fn system(query: Query<EntityMut<'static>>) {
    for mut entity_mut in query {
        entity_mut
            .get_mut::<QueryState<EntityMut<'static>>>()
            .is_none_or(|_aliased_mut| unreachable!());
    }
}

Immutable component solved this.

Lastly I would like to see a test where you do a bunch of ecs operations and keep count of the number of ArchetypeCreated events and compare that to the number of archetypes in Archetypes. That would also help greatly with brittleness.

I added a debug-only drop check for it.

notmd avatar Apr 18 '25 07:04 notmd

I ran the benchmarks on my old laptop, which I don't trust, but the regression on foreach_wide seems real. These benchmarks use QueryState, not Query. QueryState is now slightly bigger and has a branch before iteration, but that shouldn't cause a 30% regression. Help would be appreciated.

notmd avatar May 06 '25 12:05 notmd

Hmm, CI seems stuck or something? Can someone retrigger it? Edit: nvm

notmd avatar May 06 '25 13:05 notmd

Triage: has merge conflicts

janhohenheim avatar May 17 '25 17:05 janhohenheim

It looks very promising. Are there any updates on this PR?

re0312 avatar May 30 '25 00:05 re0312

It looks very promising. Are there any updates on this PR?

This just needs a rebase and another benchmark run; there will probably be a regression. I won't have time to work on this in the next few weeks, though.

notmd avatar May 30 '25 07:05 notmd

You should be able to remove all of the archetype stuff due to #19143.

ItsDoot avatar May 30 '25 21:05 ItsDoot

This just needs a rebase and another benchmark run; there will probably be a regression. I won't have time to work on this in the next few weeks, though.

I’ll have some free time over the next few weeks. Would you mind if I took over your PR?

re0312 avatar May 31 '25 00:05 re0312

This just needs a rebase and another benchmark run; there will probably be a regression. I won't have time to work on this in the next few weeks, though.

I’ll have some free time over the next few weeks. Would you mind if I took over your PR?

Go ahead. I can probably review it.

notmd avatar May 31 '25 10:05 notmd