Clean up Fetch code

Open james7132 opened this issue 3 years ago • 22 comments

Objective

Clean up code surrounding fetch by pulling out the common parts into the iteration code.

Solution

Merge Fetch::table_fetch and Fetch::archetype_fetch into a single API: Fetch::fetch(&mut self, entity: &Entity, table_row: &usize). This provides everything any fetch requires to internally decide which storage to read from and get the underlying data. All of these functions are marked as #[inline(always)] and the arguments are passed as references to attempt to optimize out the argument that isn't being used.

External to Fetch, Query iteration has been changed to keep track of the table row and entity outside of fetch, which moves a lot of the expensive bookkeeping Fetch structs had previously done internally into the outer loop.
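As a rough illustration of the merged API described above, the sketch below uses simplified stand-in types rather than Bevy's real ones. `Entity`, `TableFetch`, `SparseFetch`, and `iterate` are all hypothetical names made up for this example; only the shape of `fetch(&mut self, entity: &Entity, table_row: &usize)` comes from the description.

```rust
use std::collections::HashMap;

// Simplified stand-in; this is not Bevy's real `Entity`.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Entity(u32);

/// Merged API: a single method receives both keys and reads whichever one
/// its storage needs.
trait Fetch {
    type Item;
    fn fetch(&mut self, entity: &Entity, table_row: &usize) -> Self::Item;
}

/// Hypothetical table-backed fetch: only the row index is used.
struct TableFetch<'a> {
    column: &'a [i32],
}

impl<'a> Fetch for TableFetch<'a> {
    type Item = i32;

    #[inline(always)]
    fn fetch(&mut self, _entity: &Entity, table_row: &usize) -> i32 {
        self.column[*table_row]
    }
}

/// Hypothetical sparse-backed fetch: only the entity is used.
struct SparseFetch<'a> {
    values: &'a HashMap<u32, i32>,
}

impl<'a> Fetch for SparseFetch<'a> {
    type Item = i32;

    #[inline(always)]
    fn fetch(&mut self, entity: &Entity, _table_row: &usize) -> i32 {
        self.values[&entity.0]
    }
}

/// The outer loop owns the entity/row bookkeeping and hands both keys to
/// the fetch on every step.
fn iterate<F: Fetch>(fetch: &mut F, entities: &[Entity], rows: &[usize]) -> Vec<F::Item> {
    entities
        .iter()
        .zip(rows)
        .map(|(entity, row)| fetch.fetch(entity, row))
        .collect()
}
```

Each implementation ignores one argument, which is the case the `#[inline(always)]` annotations are meant to let the compiler optimize away once the call is inlined into the loop.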

~~TODO: Benchmark, docs~~ Done.


Changelog

Changed: Fetch::table_fetch and Fetch::archetype_fetch have been merged into a single Fetch::fetch function.

Migration Guide

TODO

james7132 avatar May 18 '22 23:05 james7132

Did a quick round of benchmarks. Generally looks to be unchanged, though there are some regressions, particularly with Query::iter. I have an idea on how to address it. The main change to sparse iteration is the addition of two slice::get_unchecked calls versus the one before. It may be better to collocate the Entity and table indexes together to get better cache behavior when hitting this section.

The other thing to point out is the giant speedup in Query::get performance, which makes sense since the FetchState::set_archetype calls for sparse components are effectively no-ops now. A significant, though relatively smaller, speedup can be seen in the table benchmark as well. We should verify this with additional tests, as something like this should positively affect engine systems like transform propagation (assuming that's not dominated by memory bandwidth).

Bar the aforementioned regressions, assuming the benchmarks here are consistent, this seems like a workable change.

group                                                    fetch-cleanup                            main
-----                                                    -------------                            ----
busy_systems/01x_entities_03_systems                     1.09     36.3±1.46µs        ? ?/sec      1.00     33.4±1.57µs        ? ?/sec 
busy_systems/01x_entities_06_systems                     1.05     69.6±3.26µs        ? ?/sec      1.00     66.5±3.70µs        ? ?/sec 
busy_systems/01x_entities_09_systems                     1.03    101.5±6.33µs        ? ?/sec      1.00     98.9±3.30µs        ? ?/sec 
busy_systems/01x_entities_12_systems                     1.07    133.5±5.44µs        ? ?/sec      1.00    124.7±6.52µs        ? ?/sec 
busy_systems/01x_entities_15_systems                     1.04    161.6±7.41µs        ? ?/sec      1.00    155.4±5.34µs        ? ?/sec 
busy_systems/02x_entities_03_systems                     1.00     60.6±2.70µs        ? ?/sec      1.05     63.4±2.43µs        ? ?/sec 
busy_systems/02x_entities_06_systems                     1.00    117.1±7.05µs        ? ?/sec      1.06    124.4±6.40µs        ? ?/sec 
busy_systems/02x_entities_09_systems                     1.00   173.6±10.64µs        ? ?/sec      1.04    180.7±7.01µs        ? ?/sec 
busy_systems/02x_entities_12_systems                     1.00   220.4±10.73µs        ? ?/sec      1.10   242.1±11.81µs        ? ?/sec 
busy_systems/02x_entities_15_systems                     1.00    275.5±9.13µs        ? ?/sec      1.06   292.4±10.19µs        ? ?/sec 
busy_systems/03x_entities_03_systems                     1.06     91.4±5.00µs        ? ?/sec      1.00     86.3±5.73µs        ? ?/sec 
busy_systems/03x_entities_06_systems                     1.04    175.5±8.39µs        ? ?/sec      1.00    169.0±8.28µs        ? ?/sec 
busy_systems/03x_entities_09_systems                     1.04    250.4±8.50µs        ? ?/sec      1.00   239.8±11.20µs        ? ?/sec 
busy_systems/03x_entities_12_systems                     1.00    320.7±9.24µs        ? ?/sec      1.03   329.2±16.04µs        ? ?/sec 
busy_systems/03x_entities_15_systems                     1.00   407.7±11.33µs        ? ?/sec      1.00   408.0±12.54µs        ? ?/sec 
busy_systems/04x_entities_03_systems                     1.04    115.8±5.11µs        ? ?/sec      1.00    111.7±5.41µs        ? ?/sec 
busy_systems/04x_entities_06_systems                     1.00    211.5±7.87µs        ? ?/sec      1.03   218.6±10.46µs        ? ?/sec 
busy_systems/04x_entities_09_systems                     1.00   329.6±19.01µs        ? ?/sec      1.00   330.4±16.70µs        ? ?/sec 
busy_systems/04x_entities_12_systems                     1.03   428.7±11.81µs        ? ?/sec      1.00   416.4±14.26µs        ? ?/sec 
busy_systems/04x_entities_15_systems                     1.01   529.4±12.77µs        ? ?/sec      1.00   525.4±16.45µs        ? ?/sec 
busy_systems/05x_entities_03_systems                     1.00    136.4±5.72µs        ? ?/sec      1.15   156.7±15.90µs        ? ?/sec 
busy_systems/05x_entities_06_systems                     1.00   266.8±12.08µs        ? ?/sec      1.07   286.1±11.79µs        ? ?/sec 
busy_systems/05x_entities_09_systems                     1.00   402.7±13.92µs        ? ?/sec      1.06   426.6±20.73µs        ? ?/sec 
busy_systems/05x_entities_12_systems                     1.00   533.6±21.50µs        ? ?/sec      1.05   560.0±17.33µs        ? ?/sec 
busy_systems/05x_entities_15_systems                     1.00   674.8±27.99µs        ? ?/sec      1.04   700.7±44.73µs        ? ?/sec 
contrived/01x_entities_03_systems                        1.30     27.3±2.90µs        ? ?/sec      1.00     21.1±1.24µs        ? ?/sec
contrived/01x_entities_06_systems                        1.03     42.8±3.04µs        ? ?/sec      1.00     41.5±1.58µs        ? ?/sec
contrived/01x_entities_09_systems                        1.01     61.4±4.00µs        ? ?/sec      1.00     60.9±3.89µs        ? ?/sec
contrived/01x_entities_12_systems                        1.01     81.2±4.80µs        ? ?/sec      1.00     80.7±3.33µs        ? ?/sec
contrived/01x_entities_15_systems                        1.00     98.1±5.48µs        ? ?/sec      1.02     99.9±5.96µs        ? ?/sec
contrived/02x_entities_03_systems                        1.08     33.9±2.63µs        ? ?/sec      1.00     31.5±1.46µs        ? ?/sec
contrived/02x_entities_06_systems                        1.00     60.6±2.18µs        ? ?/sec      1.05     63.4±2.95µs        ? ?/sec
contrived/02x_entities_09_systems                        1.00     92.0±5.89µs        ? ?/sec      1.00     91.6±2.73µs        ? ?/sec
contrived/02x_entities_12_systems                        1.06   128.5±11.27µs        ? ?/sec      1.00    121.3±3.12µs        ? ?/sec
contrived/02x_entities_15_systems                        1.00    151.2±7.89µs        ? ?/sec      1.02    153.5±7.67µs        ? ?/sec
contrived/03x_entities_03_systems                        1.02     43.9±2.27µs        ? ?/sec      1.00     43.2±1.46µs        ? ?/sec
contrived/03x_entities_06_systems                        1.05     86.7±5.83µs        ? ?/sec      1.00     82.3±4.09µs        ? ?/sec
contrived/03x_entities_09_systems                        1.01    125.5±8.38µs        ? ?/sec      1.00    124.7±5.31µs        ? ?/sec
contrived/03x_entities_12_systems                        1.00    160.4±3.97µs        ? ?/sec      1.02    164.0±4.07µs        ? ?/sec
contrived/03x_entities_15_systems                        1.03   208.9±12.52µs        ? ?/sec      1.00    202.8±8.56µs        ? ?/sec
contrived/04x_entities_03_systems                        1.02     54.5±2.76µs        ? ?/sec      1.00     53.1±4.04µs        ? ?/sec
contrived/04x_entities_06_systems                        1.03    106.4±6.15µs        ? ?/sec      1.00    103.6±5.81µs        ? ?/sec
contrived/04x_entities_09_systems                        1.03   160.1±10.74µs        ? ?/sec      1.00    154.7±5.95µs        ? ?/sec
contrived/04x_entities_12_systems                        1.00    205.8±9.71µs        ? ?/sec      1.00    205.2±8.46µs        ? ?/sec
contrived/04x_entities_15_systems                        1.01    251.5±9.87µs        ? ?/sec      1.00    248.4±6.81µs        ? ?/sec
contrived/05x_entities_03_systems                        1.00     62.7±3.15µs        ? ?/sec      1.01     63.4±3.52µs        ? ?/sec
contrived/05x_entities_06_systems                        1.00    126.9±5.49µs        ? ?/sec      1.00    127.2±6.37µs        ? ?/sec
contrived/05x_entities_09_systems                        1.00    179.5±5.15µs        ? ?/sec      1.05    187.5±6.80µs        ? ?/sec
contrived/05x_entities_12_systems                        1.00    244.4±7.21µs        ? ?/sec      1.04   254.6±15.85µs        ? ?/sec
contrived/05x_entities_15_systems                        1.01    312.3±9.83µs        ? ?/sec      1.00   310.5±10.72µs        ? ?/sec
fragmented_iter/base                                     1.17    478.1±5.89ns        ? ?/sec      1.00   410.1±18.16ns        ? ?/sec
fragmented_iter/foreach                                  1.00   236.3±25.69ns        ? ?/sec      1.02   241.6±29.92ns        ? ?/sec
heavy_compute/base                                       1.00    355.8±4.53µs        ? ?/sec      1.02    364.5±5.74µs        ? ?/sec
query_get/50000_entities_sparse                          1.00   589.0±31.37µs        ? ?/sec      1.91  1127.2±55.93µs        ? ?/sec
query_get/50000_entities_table                           1.00   457.1±27.07µs        ? ?/sec      1.32   601.6±12.68µs        ? ?/sec
query_get_component/50000_entities_sparse                1.00  1244.8±50.69µs        ? ?/sec      1.03  1287.5±52.23µs        ? ?/sec
query_get_component/50000_entities_table                 1.01  1247.2±108.41µs        ? ?/sec     1.00  1236.8±21.18µs        ? ?/sec
simple_iter/base                                         1.01     13.9±0.74µs        ? ?/sec      1.00     13.7±0.17µs        ? ?/sec
simple_iter/foreach                                      1.00     11.6±0.12µs        ? ?/sec      1.00     11.6±0.18µs        ? ?/sec
simple_iter/sparse                                       1.00     52.0±0.22µs        ? ?/sec      1.18     61.3±0.32µs        ? ?/sec
simple_iter/sparse_foreach                               1.00     45.2±0.19µs        ? ?/sec      1.12     50.4±0.76µs        ? ?/sec
simple_iter/system                                       1.00     13.7±0.29µs        ? ?/sec      1.01     13.8±0.49µs        ? ?/sec
sparse_fragmented_iter/base                              1.00     10.9±0.24ns        ? ?/sec      1.18     12.8±0.86ns        ? ?/sec
sparse_fragmented_iter/foreach                           1.00      8.9±0.22ns        ? ?/sec      1.00      8.9±0.14ns        ? ?/sec
world_query_for_each/50000_entities_sparse               1.03     99.0±1.47µs        ? ?/sec      1.00     95.8±0.91µs        ? ?/sec
world_query_for_each/50000_entities_table                1.00     27.2±0.24µs        ? ?/sec      1.00     27.2±0.10µs        ? ?/sec
world_query_get/50000_entities_sparse                    1.20   478.6±11.07µs        ? ?/sec      1.00   398.2±10.82µs        ? ?/sec
world_query_get/50000_entities_table                     1.00    274.3±4.77µs        ? ?/sec      1.00    273.4±4.26µs        ? ?/sec
world_query_iter/50000_entities_sparse                   1.12    114.9±0.65µs        ? ?/sec      1.00    102.8±3.31µs        ? ?/sec
world_query_iter/50000_entities_table                    1.00     27.3±0.78µs        ? ?/sec      1.00     27.2±0.26µs        ? ?/sec

james7132 avatar May 19 '22 05:05 james7132

I'd consider taking those changes to the performance characteristics as is. Query::get is in the hot path for a lot of things too, and those are awesome improvements.

That said, I'm excited to see how your mitigation ideas work.

alice-i-cecile avatar May 19 '22 13:05 alice-i-cecile

Attempted to merge the entities and rows into one Vec to make it easier for sparse iteration. It seems to address the sparse iteration issues.
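To make the layout change concrete, here is a minimal sketch under assumed names. `SplitArchetype`, `MergedArchetype`, and `ArchetypeEntity` are hypothetical types for illustration and may not match the actual code on the branch. It contrasts two parallel Vecs with a single Vec of combined records, so one indexed read yields both keys.

```rust
// Simplified stand-in; this is not Bevy's real `Entity`.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Entity(u32);

/// Split layout: two parallel Vecs, so each iteration step indexes into two
/// separate allocations.
struct SplitArchetype {
    entities: Vec<Entity>,
    table_rows: Vec<usize>,
}

/// Hypothetical combined record: the entity and its table row sit next to
/// each other in memory.
#[derive(Clone, Copy, Debug, PartialEq)]
struct ArchetypeEntity {
    entity: Entity,
    table_row: usize,
}

/// Merged layout: one Vec, one indexed read per step.
struct MergedArchetype {
    entities: Vec<ArchetypeEntity>,
}

impl From<SplitArchetype> for MergedArchetype {
    fn from(split: SplitArchetype) -> Self {
        let entities = split
            .entities
            .into_iter()
            .zip(split.table_rows)
            .map(|(entity, table_row)| ArchetypeEntity { entity, table_row })
            .collect();
        MergedArchetype { entities }
    }
}

impl MergedArchetype {
    /// Yields both keys for each slot from a single slice walk.
    fn keys(&self) -> impl Iterator<Item = (Entity, usize)> + '_ {
        self.entities.iter().map(|e| (e.entity, e.table_row))
    }
}
```

Whether this helps in practice depends on access patterns; the benchmark numbers in this thread are the actual evidence, and the sketch only shows the data layout.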

group                                                    fetch-cleanup                            fetch-cleanup-with-archetype-entity      main
-----                                                    -------------                            -----------------------------------      ----
busy_systems/01x_entities_03_systems                     1.09     36.3±1.46µs        ? ?/sec      1.14     38.0±1.67µs        ? ?/sec      1.00     33.4±1.57µs        ? ?/sec
busy_systems/01x_entities_06_systems                     1.05     69.6±3.26µs        ? ?/sec      1.17     78.0±5.66µs        ? ?/sec      1.00     66.5±3.70µs        ? ?/sec
busy_systems/01x_entities_09_systems                     1.03    101.5±6.33µs        ? ?/sec      1.10    108.4±5.60µs        ? ?/sec      1.00     98.9±3.30µs        ? ?/sec
busy_systems/01x_entities_12_systems                     1.07    133.5±5.44µs        ? ?/sec      1.16   145.0±10.70µs        ? ?/sec      1.00    124.7±6.52µs        ? ?/sec
busy_systems/01x_entities_15_systems                     1.04    161.6±7.41µs        ? ?/sec      1.17   181.1±11.18µs        ? ?/sec      1.00    155.4±5.34µs        ? ?/sec
busy_systems/02x_entities_03_systems                     1.03     60.6±2.70µs        ? ?/sec      1.00     59.1±2.39µs        ? ?/sec      1.07     63.4±2.43µs        ? ?/sec
busy_systems/02x_entities_06_systems                     1.00    117.1±7.05µs        ? ?/sec      1.01    117.8±6.11µs        ? ?/sec      1.06    124.4±6.40µs        ? ?/sec
busy_systems/02x_entities_09_systems                     1.00   173.6±10.64µs        ? ?/sec      1.00    173.9±7.70µs        ? ?/sec      1.04    180.7±7.01µs        ? ?/sec
busy_systems/02x_entities_12_systems                     1.00   220.4±10.73µs        ? ?/sec      1.05   230.5±11.22µs        ? ?/sec      1.10   242.1±11.81µs        ? ?/sec
busy_systems/02x_entities_15_systems                     1.00    275.5±9.13µs        ? ?/sec      1.03   284.8±15.50µs        ? ?/sec      1.06   292.4±10.19µs        ? ?/sec
busy_systems/03x_entities_03_systems                     1.09     91.4±5.00µs        ? ?/sec      1.00     83.7±3.08µs        ? ?/sec      1.03     86.3±5.73µs        ? ?/sec
busy_systems/03x_entities_06_systems                     1.08    175.5±8.39µs        ? ?/sec      1.00    162.6±5.64µs        ? ?/sec      1.04    169.0±8.28µs        ? ?/sec
busy_systems/03x_entities_09_systems                     1.04    250.4±8.50µs        ? ?/sec      1.02    244.2±9.76µs        ? ?/sec      1.00   239.8±11.20µs        ? ?/sec
busy_systems/03x_entities_12_systems                     1.00    320.7±9.24µs        ? ?/sec      1.02   327.5±18.62µs        ? ?/sec      1.03   329.2±16.04µs        ? ?/sec
busy_systems/03x_entities_15_systems                     1.01   407.7±11.33µs        ? ?/sec      1.00   404.9±16.51µs        ? ?/sec      1.01   408.0±12.54µs        ? ?/sec
busy_systems/04x_entities_03_systems                     1.04    115.8±5.11µs        ? ?/sec      1.06   118.9±11.55µs        ? ?/sec      1.00    111.7±5.41µs        ? ?/sec
busy_systems/04x_entities_06_systems                     1.00    211.5±7.87µs        ? ?/sec      1.02   214.7±12.11µs        ? ?/sec      1.03   218.6±10.46µs        ? ?/sec
busy_systems/04x_entities_09_systems                     1.04   329.6±19.01µs        ? ?/sec      1.00   317.0±20.76µs        ? ?/sec      1.04   330.4±16.70µs        ? ?/sec
busy_systems/04x_entities_12_systems                     1.03   428.7±11.81µs        ? ?/sec      1.02   425.0±15.03µs        ? ?/sec      1.00   416.4±14.26µs        ? ?/sec
busy_systems/04x_entities_15_systems                     1.01   529.4±12.77µs        ? ?/sec      1.00   527.9±20.09µs        ? ?/sec      1.00   525.4±16.45µs        ? ?/sec
busy_systems/05x_entities_03_systems                     1.02    136.4±5.72µs        ? ?/sec      1.00    133.3±4.60µs        ? ?/sec      1.18   156.7±15.90µs        ? ?/sec
busy_systems/05x_entities_06_systems                     1.00   266.8±12.08µs        ? ?/sec      1.03   273.8±14.66µs        ? ?/sec      1.07   286.1±11.79µs        ? ?/sec
busy_systems/05x_entities_09_systems                     1.03   402.7±13.92µs        ? ?/sec      1.00   391.8±14.55µs        ? ?/sec      1.09   426.6±20.73µs        ? ?/sec
busy_systems/05x_entities_12_systems                     1.01   533.6±21.50µs        ? ?/sec      1.00   528.2±27.11µs        ? ?/sec      1.06   560.0±17.33µs        ? ?/sec
busy_systems/05x_entities_15_systems                     1.02   674.8±27.99µs        ? ?/sec      1.00   664.1±34.89µs        ? ?/sec      1.06   700.7±44.73µs        ? ?/sec
contrived/01x_entities_03_systems                        1.30     27.3±2.90µs        ? ?/sec      1.17     24.6±2.05µs        ? ?/sec      1.00     21.1±1.24µs        ? ?/sec
contrived/01x_entities_06_systems                        1.03     42.8±3.04µs        ? ?/sec      1.11     46.1±4.04µs        ? ?/sec      1.00     41.5±1.58µs        ? ?/sec
contrived/01x_entities_09_systems                        1.01     61.4±4.00µs        ? ?/sec      1.06     64.3±4.45µs        ? ?/sec      1.00     60.9±3.89µs        ? ?/sec
contrived/01x_entities_12_systems                        1.01     81.2±4.80µs        ? ?/sec      1.06     85.3±6.08µs        ? ?/sec      1.00     80.7±3.33µs        ? ?/sec
contrived/01x_entities_15_systems                        1.00     98.1±5.48µs        ? ?/sec      1.11    108.7±7.74µs        ? ?/sec      1.02     99.9±5.96µs        ? ?/sec
contrived/02x_entities_03_systems                        1.08     33.9±2.63µs        ? ?/sec      1.14     35.9±3.57µs        ? ?/sec      1.00     31.5±1.46µs        ? ?/sec
contrived/02x_entities_06_systems                        1.00     60.6±2.18µs        ? ?/sec      1.06     64.0±3.42µs        ? ?/sec      1.05     63.4±2.95µs        ? ?/sec
contrived/02x_entities_09_systems                        1.00     92.0±5.89µs        ? ?/sec      1.04     95.0±2.91µs        ? ?/sec      1.00     91.6±2.73µs        ? ?/sec
contrived/02x_entities_12_systems                        1.06   128.5±11.27µs        ? ?/sec      1.03    124.6±8.30µs        ? ?/sec      1.00    121.3±3.12µs        ? ?/sec
contrived/02x_entities_15_systems                        1.00    151.2±7.89µs        ? ?/sec      1.01    153.3±9.82µs        ? ?/sec      1.02    153.5±7.67µs        ? ?/sec
contrived/03x_entities_03_systems                        1.04     43.9±2.27µs        ? ?/sec      1.00     42.0±2.54µs        ? ?/sec      1.03     43.2±1.46µs        ? ?/sec
contrived/03x_entities_06_systems                        1.05     86.7±5.83µs        ? ?/sec      1.01     83.2±4.75µs        ? ?/sec      1.00     82.3±4.09µs        ? ?/sec
contrived/03x_entities_09_systems                        1.01    125.5±8.38µs        ? ?/sec      1.03    128.5±9.84µs        ? ?/sec      1.00    124.7±5.31µs        ? ?/sec
contrived/03x_entities_12_systems                        1.00    160.4±3.97µs        ? ?/sec      1.05    167.7±8.68µs        ? ?/sec      1.02    164.0±4.07µs        ? ?/sec
contrived/03x_entities_15_systems                        1.03   208.9±12.52µs        ? ?/sec      1.02    206.3±9.65µs        ? ?/sec      1.00    202.8±8.56µs        ? ?/sec
contrived/04x_entities_03_systems                        1.02     54.5±2.76µs        ? ?/sec      1.02     54.1±4.25µs        ? ?/sec      1.00     53.1±4.04µs        ? ?/sec
contrived/04x_entities_06_systems                        1.05    106.4±6.15µs        ? ?/sec      1.00    101.5±3.27µs        ? ?/sec      1.02    103.6±5.81µs        ? ?/sec
contrived/04x_entities_09_systems                        1.04   160.1±10.74µs        ? ?/sec      1.00    153.2±8.36µs        ? ?/sec      1.01    154.7±5.95µs        ? ?/sec
contrived/04x_entities_12_systems                        1.00    205.8±9.71µs        ? ?/sec      1.01    206.5±6.03µs        ? ?/sec      1.00    205.2±8.46µs        ? ?/sec
contrived/04x_entities_15_systems                        1.01    251.5±9.87µs        ? ?/sec      1.08   268.6±11.00µs        ? ?/sec      1.00    248.4±6.81µs        ? ?/sec
contrived/05x_entities_03_systems                        1.00     62.7±3.15µs        ? ?/sec      1.00     62.9±2.73µs        ? ?/sec      1.01     63.4±3.52µs        ? ?/sec
contrived/05x_entities_06_systems                        1.00    126.9±5.49µs        ? ?/sec      1.03    130.6±6.22µs        ? ?/sec      1.00    127.2±6.37µs        ? ?/sec
contrived/05x_entities_09_systems                        1.00    179.5±5.15µs        ? ?/sec      1.05    188.5±8.31µs        ? ?/sec      1.05    187.5±6.80µs        ? ?/sec
contrived/05x_entities_12_systems                        1.00    244.4±7.21µs        ? ?/sec      1.00    245.2±7.75µs        ? ?/sec      1.04   254.6±15.85µs        ? ?/sec
contrived/05x_entities_15_systems                        1.01    312.3±9.83µs        ? ?/sec      1.01   315.1±11.95µs        ? ?/sec      1.00   310.5±10.72µs        ? ?/sec
fragmented_iter/base                                     1.17    478.1±5.89ns        ? ?/sec      1.02   416.4±18.28ns        ? ?/sec      1.00   410.1±18.16ns        ? ?/sec
fragmented_iter/foreach                                  1.00   236.3±25.69ns        ? ?/sec      1.00   236.0±24.93ns        ? ?/sec      1.02   241.6±29.92ns        ? ?/sec
heavy_compute/base                                       1.00    355.8±4.53µs        ? ?/sec      1.01    359.0±5.15µs        ? ?/sec      1.02    364.5±5.74µs        ? ?/sec
insert_commands/insert                                   1.02   783.8±34.26µs        ? ?/sec      1.00   772.2±30.11µs        ? ?/sec      1.00   774.8±33.14µs        ? ?/sec
insert_commands/insert_batch                             1.00   394.6±44.37µs        ? ?/sec      1.03   406.7±39.73µs        ? ?/sec      1.04   410.3±48.26µs        ? ?/sec
query_get/50000_entities_sparse                          1.11   589.0±31.37µs        ? ?/sec      1.00   530.2±36.38µs        ? ?/sec      2.13  1127.2±55.93µs        ? ?/sec
query_get/50000_entities_table                           1.00   457.1±27.07µs        ? ?/sec      1.01    463.0±6.55µs        ? ?/sec      1.32   601.6±12.68µs        ? ?/sec
query_get_component/50000_entities_sparse                1.00  1244.8±50.69µs        ? ?/sec      1.04  1289.7±74.86µs        ? ?/sec      1.03  1287.5±52.23µs        ? ?/sec
query_get_component/50000_entities_table                 1.01  1247.2±108.41µs        ? ?/sec     1.03  1273.3±90.25µs        ? ?/sec      1.00  1236.8±21.18µs        ? ?/sec
schedule/base                                            1.01     30.6±2.49µs        ? ?/sec      1.03     31.2±2.25µs        ? ?/sec      1.00     30.2±1.93µs        ? ?/sec
simple_iter/base                                         1.01     13.9±0.74µs        ? ?/sec      1.00     13.7±0.19µs        ? ?/sec      1.00     13.7±0.17µs        ? ?/sec
simple_iter/foreach                                      1.00     11.6±0.12µs        ? ?/sec      1.00     11.6±0.15µs        ? ?/sec      1.00     11.6±0.18µs        ? ?/sec
simple_iter/sparse                                       1.00     52.0±0.22µs        ? ?/sec      1.00     51.8±0.26µs        ? ?/sec      1.18     61.3±0.32µs        ? ?/sec
simple_iter/sparse_foreach                               1.00     45.2±0.19µs        ? ?/sec      1.04     46.9±0.41µs        ? ?/sec      1.12     50.4±0.76µs        ? ?/sec
simple_iter/system                                       1.00     13.7±0.29µs        ? ?/sec      1.00     13.7±0.07µs        ? ?/sec      1.01     13.8±0.49µs        ? ?/sec
sparse_fragmented_iter/base                              1.00     10.9±0.24ns        ? ?/sec      1.22     13.3±0.62ns        ? ?/sec      1.18     12.8±0.86ns        ? ?/sec
sparse_fragmented_iter/foreach                           1.00      8.9±0.22ns        ? ?/sec      1.00      8.9±0.15ns        ? ?/sec      1.00      8.9±0.14ns        ? ?/sec
world_entity/50000_entities                              1.01    426.6±0.70µs        ? ?/sec      1.00    424.3±1.23µs        ? ?/sec      1.00    424.3±1.15µs        ? ?/sec
world_get/50000_entities_sparse                          1.00    548.5±6.27µs        ? ?/sec      1.04   570.1±12.14µs        ? ?/sec      1.00    548.2±8.05µs        ? ?/sec
world_get/50000_entities_table                           1.00   916.9±13.20µs        ? ?/sec      1.04    951.5±5.31µs        ? ?/sec      1.01    930.5±7.69µs        ? ?/sec
world_query_for_each/50000_entities_sparse               1.03     99.0±1.47µs        ? ?/sec      1.03     99.0±1.27µs        ? ?/sec      1.00     95.8±0.91µs        ? ?/sec
world_query_for_each/50000_entities_table                1.00     27.2±0.24µs        ? ?/sec      1.00     27.2±0.11µs        ? ?/sec      1.00     27.2±0.10µs        ? ?/sec
world_query_get/50000_entities_sparse                    1.29   478.6±11.07µs        ? ?/sec      1.00    372.1±6.47µs        ? ?/sec      1.07   398.2±10.82µs        ? ?/sec
world_query_get/50000_entities_table                     1.06    274.3±4.77µs        ? ?/sec      1.00    259.8±2.52µs        ? ?/sec      1.05    273.4±4.26µs        ? ?/sec
world_query_iter/50000_entities_sparse                   1.16    114.9±0.65µs        ? ?/sec      1.00     99.4±2.03µs        ? ?/sec      1.03    102.8±3.31µs        ? ?/sec
world_query_iter/50000_entities_table                    1.00     27.3±0.78µs        ? ?/sec      1.00     27.3±0.17µs        ? ?/sec      1.00     27.2±0.26µs        ? ?/sec

james7132 avatar May 20 '22 05:05 james7132

@james7132 are the Todo comments from the PR description addressed now?

alice-i-cecile avatar May 30 '22 21:05 alice-i-cecile

> @james7132 are the Todo comments from the PR description addressed now?

Yep more or less ready now.

james7132 avatar May 30 '22 21:05 james7132

@bevyengine/ecs-team reviews please!

alice-i-cecile avatar May 30 '22 21:05 alice-i-cecile

Can we do some iter/get/frag_iter benchmarks of larger / more complicated queries? This (potentially) adds a branch to each Fetch impl, instead of branching once for the entire query. These redundant branches might get optimized out, but I intentionally moved that branch out to remove the (logical) O(FETCHED_ITEMS) branches. It will be hard to compare that vs main though, given the other optimizations in this pr.

cart avatar May 30 '22 23:05 cart

It branches on a constant, just like in set_table/set_archetype, so it should mark the unmatched branch unreachable and completely remove it at compile time. Even with just a singular fetched component type, this would have seen significant perf regression if those optimizations were not present. I'll see if I can extend the existing benchmarks to use more components/filters in the queries.
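A minimal sketch of what "branching on a constant" can look like, assuming a hypothetical `IS_DENSE` associated constant and simplified stand-in types (none of this is Bevy's actual code). After monomorphization each instantiation sees a fixed `true` or `false`, so the optimizer is expected, though not guaranteed, to drop the untaken arm.

```rust
use std::collections::HashMap;

// Simplified stand-in; this is not Bevy's real `Entity`.
#[derive(Clone, Copy)]
struct Entity(u32);

/// Hypothetical stand-in for a component's storage choice.
trait Component: Copy {
    /// Known at compile time, so `if T::IS_DENSE` is a branch on a constant.
    const IS_DENSE: bool;
}

#[derive(Clone, Copy, Debug, PartialEq)]
struct Position(i32);
impl Component for Position {
    const IS_DENSE: bool = true;
}

#[derive(Clone, Copy, Debug, PartialEq)]
struct Tag(i32);
impl Component for Tag {
    const IS_DENSE: bool = false;
}

/// One fetch type that serves both storages.
struct ReadFetch<'a, T: Component> {
    table: &'a [T],
    sparse: &'a HashMap<u32, T>,
}

impl<'a, T: Component> ReadFetch<'a, T> {
    #[inline(always)]
    fn fetch(&self, entity: &Entity, table_row: &usize) -> T {
        if T::IS_DENSE {
            // Table storage: only the row index matters.
            self.table[*table_row]
        } else {
            // Sparse storage: only the entity matters.
            self.sparse[&entity.0]
        }
    }
}
```

The branch is written once per `Fetch` implementation, but for any concrete `T` it is resolved at compile time rather than evaluated per fetched item.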

As an alternative, we could make the &T and &mut T Fetch types be reliant on an associated type on Component::Storage and completely remove the need for the internal branch. However, this might be reliant on the removal of FetchState first so that the backing state of all component fetches can be ComponentId, so we could try to do the following:

pub trait ComponentStorage {
    type ReadFetch: for<'a> Fetch<'a, State = ComponentId>;
    type WriteFetch: for<'a> Fetch<'a, State = ComponentId>;
    type ReadOnlyWriteFetch: for<'a> Fetch<'a, State = ComponentId>;
}

impl<'a, T: Component> WorldQueryGats<'a> for &T {
    type ReadFetch = T::Storage::ReadFetch;
    type WriteFetch = T::Storage::ReadFetch;
    type ReadOnlyWriteFetch = T::Storage::ReadOnlyWriteFetch;
}

impl<'a, T> Fetch<'a> for TableReadFetch<'a, T> {
    ...
}

james7132 avatar May 31 '22 00:05 james7132

Why were so many #[inline] changed to #[inline(always)]? Did you benchmark this, and did it improve things?

BoxyUwU avatar May 31 '22 13:05 BoxyUwU

> Why were so many #[inline] changed to #[inline(always)]? Did you benchmark this, and did it improve things?

The optimization strategy here strictly relies on having the fetch/filter_fetch calls inlined so that the compiler can discover that one or more of the parameters are not being used. I just didn't want to take chances there, particularly with some of these already having inlined sparse set and table accesses which could make the generated code larger and fall above the inlining threshold. I can test that if need be.

james7132 avatar May 31 '22 16:05 james7132

I have not run a microbenchmark with larger queries, but I know we have some really big ones in rendering and other parts of the engine, so as a sanity check I tested it against many_cubes, which has a mix of both normal iteration and heavy Query::get usage via the render phase. I also tested a variant of this PR that passes Entity and usize directly instead of references. Here are the stage timings for comparison:

| stage | main | this PR | this PR (copy over reference) |
|---|---|---|---|
| Full Frame | 20.52ms | 19.7ms | 19.55ms |
| First | 411.71us | 401.28us | 408.13us |
| LoadAssets | 193.32us | 186.95us | 189.87us |
| PreUpdate | 95.6us | 91.18us | 94.98us |
| Update | 53.52us | 54.41us | 54.47us |
| PostUpdate | 3.32ms | 3.05ms | 3.09ms |
| AssetEvents | 185.73us | 180us | 184.05us |
| Last | 27.11us | 26.05us | 27.29us |
| Extract | 3.7ms | 3.69ms | 3.31ms |
| Prepare | 2.64ms | 2.54ms | 2.56ms |
| Queue | 956.72us | 911.44us | 929.13us |
| Sort | 993.08us | 987.89us | 984.32us |
| Render | 7.49ms | 7.12ms | 7.28ms |

The biggest wins here are in PostUpdate, which has a heavy parallel iteration via check_visibility, and Render, where every visible entity has multiple Query::get calls made. Everything else is likely within the margin of error, but generally doesn't show any significant regression in perf. For comparison, the primary query being run, the visible entities query, is defined as:

    mut visible_entity_query: Query<(
        Entity,
        &Visibility,
        &mut ComputedVisibility,
        Option<&RenderLayers>,
        Option<&Aabb>,
        Option<&NoFrustumCulling>,
        Option<&GlobalTransform>,
    )>,

This query is running in parallel, so task spawn overhead and contention notwithstanding, this query is ~10% faster with this change.

james7132 avatar Jun 18 '22 06:06 james7132

As for a more detailed explanation of why this seems to work, see #5064. In particular, this replaces a bunch of the otherwise unavoidable unwrap_or_else(|| debug_checked_unreachable()) calls with get_unchecked calls outside the fetch call. Those unwrap_or_else calls seem to add quite a few more instructions, including two jumps that were otherwise supposed to be optimized out.
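The before/after shapes can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code from #5064 or this PR: `OldFetch`, `NewFetch`, and `iterate` are hypothetical, and `std::hint::unreachable_unchecked` stands in for `debug_checked_unreachable`.

```rust
use std::hint::unreachable_unchecked;

/// "Before" shape: the row lookup state lives inside the fetch as an Option
/// that has to be unwrapped on every call.
struct OldFetch<'a> {
    entity_table_rows: Option<&'a [usize]>,
    column: &'a [i32],
}

impl<'a> OldFetch<'a> {
    /// # Safety
    /// `entity_table_rows` must be `Some`, and both `archetype_index` and the
    /// row it maps to must be in bounds.
    #[inline(always)]
    unsafe fn archetype_fetch(&self, archetype_index: usize) -> i32 {
        unsafe {
            let rows = self
                .entity_table_rows
                .unwrap_or_else(|| unreachable_unchecked());
            let row = *rows.get_unchecked(archetype_index);
            *self.column.get_unchecked(row)
        }
    }
}

/// "After" shape: the fetch only knows about its column.
struct NewFetch<'a> {
    column: &'a [i32],
}

impl<'a> NewFetch<'a> {
    /// # Safety
    /// `table_row` must be in bounds for `column`.
    #[inline(always)]
    unsafe fn fetch(&self, table_row: &usize) -> i32 {
        unsafe { *self.column.get_unchecked(*table_row) }
    }
}

/// The outer loop performs the row lookup with `get_unchecked` and passes the
/// result into the fetch.
fn iterate(fetch: &NewFetch<'_>, entity_table_rows: &[usize]) -> Vec<i32> {
    // Validate once up front so the unchecked accesses below are sound.
    assert!(entity_table_rows.iter().all(|&row| row < fetch.column.len()));
    (0..entity_table_rows.len())
        .map(|i| unsafe {
            // SAFETY: `i` is within the slice length, and every row was
            // checked against the column length above.
            let row = entity_table_rows.get_unchecked(i);
            fetch.fetch(row)
        })
        .collect()
}
```

The up-front `assert!` exists only to keep this standalone example sound; it is not meant to suggest how the engine validates its indices.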

james7132 avatar Jun 21 '22 11:06 james7132

@cart, this is tricky but well-motivated, reviewed and benchmarked. Do you want to do a review pass on this?

alice-i-cecile avatar Jun 27 '22 16:06 alice-i-cecile

I want to redo those stage timing measurements. There have been quite a few optimizations merged in since that was last measured, and I'm sure this is still not a regression, but I'd still like to double check before pulling the trigger on this.

james7132 avatar Jun 27 '22 17:06 james7132

Yeah I'd like to do a pass. I'd also still like to see a microbenchmark of large queries with many fetch calls. The microbenchmarks in this PR seem to show that main is "slightly" faster in many cases. If that "slightly" scales with query size, we'll want to weigh the Query::get wins against that cost. We can't do that without "clean" numbers.

The "many cubes" benchmark also seems roughly compatible with the "this regresses fetch for iteration" interpretation. We see nice improvements in some areas, which apparently align with heavy Query::get calls. But then we tend to see small-ish regressions everywhere else.

For the "many cubes" numbers, it's hard to say how big the (potential) wins and losses are, because we're constantly interleaving query iteration (which might have regressed for large queries) and query gets (which have good evidence suggesting they got a perf boost).

cart avatar Jun 27 '22 20:06 cart

Redid the many_cubes measurements. Looks to be a net gain across the board here. Included a few of the systems that are strictly iteration bound as well.

| stage/system | main | this PR |
|---|---|---|
| First | 355.95us | 333.55us |
| LoadAssets | 171.93us | 158.57us |
| PreUpdate | 92.43us | 84.61us |
| Update | 52.09us | 50.09us |
| PostUpdate | 2.06ms | 1.79ms |
| AssetEvents | 158.36us | 151.17us |
| Last | 27.79us | 25.95us |
| Extract | 3.54ms | 3.47ms |
| Prepare | 2.57ms | 2.33ms |
| Queue | 868.45us | 810.47us |
| Sort | 218.88us | 206.89us |
| Render | 7.56ms | 7.29ms |
| check_visibility | 1.3ms | 1.22ms |
| check_visibility par_for_each (1024 entities) | 14.53us | 14.51us |
| extract_meshes | 1.55ms | 1.44ms |
| extract_visible_components | 530.61us | 473.62us |
| prepare_uniform_components | 1.13ms | 1.05ms |
| full frame | 18.12ms | 17.1ms |

james7132 avatar Jun 28 '22 02:06 james7132

Redid the microbenchmarks, including the ones in #5123. The results are odd. They do indeed show that even the wider queries benefit from this change. However, both the busy_systems and contrived benchmarks consistently regressed further. I'm not sure if this is due to the parallel scheduler in the mix or some other influence, because all of the other iteration/get benchmarks show either no regression or substantially better results.
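For reading the comparison output, the leading number in each column is that branch's time relative to the fastest entry in the row, so 1.00 marks the winner. A minimal sketch of that arithmetic, where `relative_to_fastest` is a hypothetical helper and not part of any benchmarking tool:

```rust
/// Divide each timing by the fastest timing in the row.
/// The fastest entry maps to exactly 1.0.
fn relative_to_fastest(times: &[f64]) -> Vec<f64> {
    let fastest = times.iter().copied().fold(f64::INFINITY, f64::min);
    times.iter().map(|t| t / fastest).collect()
}
```

For the busy_systems/01x_entities_03_systems row, passing `&[40.4, 34.9]` (cleanup-fetch, main, in µs) gives roughly 1.16 and 1.00, matching the reported 16% regression.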

Updated Benchmarks
group                                                    cleanup-fetch                            main
-----                                                    -------------                            ----
add_remove_component/sparse_set                          1.02  1322.0±76.98µs        ? ?/sec      1.00  1301.0±82.37µs        ? ?/sec
add_remove_component/table                               1.03  1682.7±49.97µs        ? ?/sec      1.00  1629.8±35.87µs        ? ?/sec
add_remove_component_big/sparse_set                      1.00  1435.7±299.23µs        ? ?/sec     1.03  1476.2±296.28µs        ? ?/sec
add_remove_component_big/table                           1.01      2.9±0.05ms        ? ?/sec      1.00      2.9±0.23ms        ? ?/sec
added_archetypes/archetype_count/100                     1.00   186.2±10.35µs        ? ?/sec      1.00    185.6±9.29µs        ? ?/sec
added_archetypes/archetype_count/1000                    1.00   688.2±20.12µs        ? ?/sec      1.06   728.0±50.14µs        ? ?/sec
added_archetypes/archetype_count/10000                   1.00     14.2±1.39ms        ? ?/sec      1.03     14.6±2.00ms        ? ?/sec
added_archetypes/archetype_count/200                     1.03   234.2±10.67µs        ? ?/sec      1.00   226.6±12.16µs        ? ?/sec
added_archetypes/archetype_count/2000                    1.00  1355.2±33.30µs        ? ?/sec      1.08  1465.4±125.18µs        ? ?/sec
added_archetypes/archetype_count/500                     1.00   403.4±48.12µs        ? ?/sec      1.02   413.3±29.36µs        ? ?/sec
added_archetypes/archetype_count/5000                    1.00      4.8±0.56ms        ? ?/sec      1.15      5.5±0.83ms        ? ?/sec
busy_systems/01x_entities_03_systems                     1.16     40.4±2.18µs        ? ?/sec      1.00     34.9±1.19µs        ? ?/sec
busy_systems/01x_entities_06_systems                     1.17     78.1±2.53µs        ? ?/sec      1.00     66.6±2.78µs        ? ?/sec
busy_systems/01x_entities_09_systems                     1.26    118.6±4.26µs        ? ?/sec      1.00     94.0±2.86µs        ? ?/sec
busy_systems/01x_entities_12_systems                     1.20    147.7±4.45µs        ? ?/sec      1.00    123.1±5.81µs        ? ?/sec
busy_systems/01x_entities_15_systems                     1.16    178.7±4.28µs        ? ?/sec      1.00    153.6±4.88µs        ? ?/sec
busy_systems/02x_entities_03_systems                     1.26     75.5±4.10µs        ? ?/sec      1.00     59.9±2.85µs        ? ?/sec
busy_systems/02x_entities_06_systems                     1.15    137.2±3.85µs        ? ?/sec      1.00    119.0±7.39µs        ? ?/sec
busy_systems/02x_entities_09_systems                     1.27    217.6±6.59µs        ? ?/sec      1.00    171.9±4.08µs        ? ?/sec
busy_systems/02x_entities_12_systems                     1.17    270.3±6.97µs        ? ?/sec      1.00    230.6±8.73µs        ? ?/sec
busy_systems/02x_entities_15_systems                     1.20   334.5±11.35µs        ? ?/sec      1.00    278.8±6.34µs        ? ?/sec
busy_systems/03x_entities_03_systems                     1.02    102.5±4.59µs        ? ?/sec      1.00    100.3±4.70µs        ? ?/sec
busy_systems/03x_entities_06_systems                     1.15    201.1±8.64µs        ? ?/sec      1.00    174.8±5.38µs        ? ?/sec
busy_systems/03x_entities_09_systems                     1.30   323.3±13.82µs        ? ?/sec      1.00    247.9±6.90µs        ? ?/sec
busy_systems/03x_entities_12_systems                     1.21   389.5±14.12µs        ? ?/sec      1.00    320.9±9.25µs        ? ?/sec
busy_systems/03x_entities_15_systems                     1.18   482.4±12.17µs        ? ?/sec      1.00    407.1±8.99µs        ? ?/sec
busy_systems/04x_entities_03_systems                     1.25    138.9±7.21µs        ? ?/sec      1.00    111.1±4.86µs        ? ?/sec
busy_systems/04x_entities_06_systems                     1.22   273.1±12.83µs        ? ?/sec      1.00    223.3±6.85µs        ? ?/sec
busy_systems/04x_entities_09_systems                     1.24   416.1±12.86µs        ? ?/sec      1.00   336.6±12.67µs        ? ?/sec
busy_systems/04x_entities_12_systems                     1.22   513.7±16.52µs        ? ?/sec      1.00   421.3±12.56µs        ? ?/sec
busy_systems/04x_entities_15_systems                     1.16   615.1±26.75µs        ? ?/sec      1.00   532.2±15.00µs        ? ?/sec
busy_systems/05x_entities_03_systems                     1.32   176.5±10.36µs        ? ?/sec      1.00    133.6±4.76µs        ? ?/sec
busy_systems/05x_entities_06_systems                     1.40   366.8±14.17µs        ? ?/sec      1.00    262.7±8.66µs        ? ?/sec
busy_systems/05x_entities_09_systems                     1.25   514.7±20.64µs        ? ?/sec      1.00   410.9±12.45µs        ? ?/sec
busy_systems/05x_entities_12_systems                     1.20   659.0±22.17µs        ? ?/sec      1.00   547.0±13.86µs        ? ?/sec
busy_systems/05x_entities_15_systems                     1.27   838.4±30.31µs        ? ?/sec      1.00   659.7±18.72µs        ? ?/sec
contrived/01x_entities_03_systems                        1.21     27.2±0.52µs        ? ?/sec      1.00     22.5±1.62µs        ? ?/sec
contrived/01x_entities_06_systems                        1.25     52.8±1.32µs        ? ?/sec      1.00     42.4±2.02µs        ? ?/sec
contrived/01x_entities_09_systems                        1.21     75.1±2.46µs        ? ?/sec      1.00     62.0±2.42µs        ? ?/sec
contrived/01x_entities_12_systems                        1.21     98.9±1.88µs        ? ?/sec      1.00     81.6±4.21µs        ? ?/sec
contrived/01x_entities_15_systems                        1.25    125.8±3.28µs        ? ?/sec      1.00    100.6±6.48µs        ? ?/sec
contrived/02x_entities_03_systems                        1.37     45.4±1.72µs        ? ?/sec      1.00     33.1±2.56µs        ? ?/sec
contrived/02x_entities_06_systems                        1.21     79.0±1.47µs        ? ?/sec      1.00     65.2±4.73µs        ? ?/sec
contrived/02x_entities_09_systems                        1.22    116.3±2.60µs        ? ?/sec      1.00     95.1±4.00µs        ? ?/sec
contrived/02x_entities_12_systems                        1.22    152.8±4.98µs        ? ?/sec      1.00    125.2±3.68µs        ? ?/sec
contrived/02x_entities_15_systems                        1.23    186.0±3.24µs        ? ?/sec      1.00    150.7±7.38µs        ? ?/sec
contrived/03x_entities_03_systems                        1.31     54.8±1.61µs        ? ?/sec      1.00     41.9±2.10µs        ? ?/sec
contrived/03x_entities_06_systems                        1.28    105.3±2.34µs        ? ?/sec      1.00     82.0±2.44µs        ? ?/sec
contrived/03x_entities_09_systems                        1.25    157.9±3.23µs        ? ?/sec      1.00    126.7±5.21µs        ? ?/sec
contrived/03x_entities_12_systems                        1.17    196.6±5.02µs        ? ?/sec      1.00    167.6±5.79µs        ? ?/sec
contrived/03x_entities_15_systems                        1.13    238.8±5.80µs        ? ?/sec      1.00   212.2±10.71µs        ? ?/sec
contrived/04x_entities_03_systems                        1.39     73.1±2.13µs        ? ?/sec      1.00     52.7±2.80µs        ? ?/sec
contrived/04x_entities_06_systems                        1.20    133.8±3.18µs        ? ?/sec      1.00    111.8±8.56µs        ? ?/sec
contrived/04x_entities_09_systems                        1.23    189.2±5.33µs        ? ?/sec      1.00    154.0±4.34µs        ? ?/sec
contrived/04x_entities_12_systems                        1.17    241.6±5.61µs        ? ?/sec      1.00    206.2±8.28µs        ? ?/sec
contrived/04x_entities_15_systems                        1.12    295.4±7.79µs        ? ?/sec      1.00    262.9±9.39µs        ? ?/sec
contrived/05x_entities_03_systems                        1.38     84.0±2.42µs        ? ?/sec      1.00     60.9±1.98µs        ? ?/sec
contrived/05x_entities_06_systems                        1.32    159.1±3.13µs        ? ?/sec      1.00    120.1±2.42µs        ? ?/sec
contrived/05x_entities_09_systems                        1.32    241.3±5.63µs        ? ?/sec      1.00    182.5±5.08µs        ? ?/sec
contrived/05x_entities_12_systems                        1.20   299.3±12.07µs        ? ?/sec      1.00   248.8±10.79µs        ? ?/sec
contrived/05x_entities_15_systems                        1.18    362.3±8.39µs        ? ?/sec      1.00   306.5±16.78µs        ? ?/sec
empty_commands/0_entities                                1.00      5.2±0.27ns        ? ?/sec      1.00      5.2±0.27ns        ? ?/sec
fake_commands/2000_commands                              1.06      7.2±0.23µs        ? ?/sec      1.00      6.9±0.15µs        ? ?/sec
fake_commands/4000_commands                              1.13     15.2±0.62µs        ? ?/sec      1.00     13.4±0.12µs        ? ?/sec
fake_commands/6000_commands                              1.13     22.6±0.80µs        ? ?/sec      1.00     20.0±0.15µs        ? ?/sec
fake_commands/8000_commands                              1.07     28.8±0.29µs        ? ?/sec      1.00     26.8±0.19µs        ? ?/sec
fragmented_iter/base                                     1.00   352.2±10.92ns        ? ?/sec      1.34   470.4±31.26ns        ? ?/sec
fragmented_iter/foreach                                  1.01   245.8±26.37ns        ? ?/sec      1.00   242.8±23.09ns        ? ?/sec
fragmented_iter/foreach_wide                             1.00      4.0±0.23µs        ? ?/sec      1.02      4.1±0.54µs        ? ?/sec
fragmented_iter/wide                                     1.00      4.5±0.19µs        ? ?/sec      1.34      6.1±0.21µs        ? ?/sec
get_component/base                                       1.00  1051.9±13.21µs        ? ?/sec      1.06  1112.2±43.40µs        ? ?/sec
get_component/system                                     1.00   763.4±33.24µs        ? ?/sec      1.06   806.3±22.00µs        ? ?/sec
get_or_spawn/batched                                     1.00   419.0±53.91µs        ? ?/sec      1.01   421.2±47.27µs        ? ?/sec
get_or_spawn/individual                                  1.01   948.7±77.17µs        ? ?/sec      1.00   937.8±78.09µs        ? ?/sec
heavy_compute/base                                       1.01    361.4±3.93µs        ? ?/sec      1.00    357.9±3.21µs        ? ?/sec
insert_commands/insert                                   1.00   800.2±31.11µs        ? ?/sec      1.04   830.3±99.12µs        ? ?/sec
insert_commands/insert_batch                             1.00   403.5±40.45µs        ? ?/sec      1.02   410.4±38.46µs        ? ?/sec
query_get/50000_entities_sparse                          1.00   643.1±45.04µs        ? ?/sec      1.99  1280.2±34.58µs        ? ?/sec
query_get/50000_entities_table                           1.00   577.9±10.12µs        ? ?/sec      1.28   737.7±12.53µs        ? ?/sec
query_get_component/50000_entities_sparse                1.01  1225.8±95.93µs        ? ?/sec      1.00  1209.1±65.14µs        ? ?/sec
query_get_component/50000_entities_table                 1.04  1244.6±112.45µs        ? ?/sec     1.00  1192.1±17.73µs        ? ?/sec
simple_insert/base                                       1.08   619.4±94.59µs        ? ?/sec      1.00   576.0±19.23µs        ? ?/sec
simple_insert/unbatched                                  1.00  1407.8±60.24µs        ? ?/sec      1.01  1419.9±33.00µs        ? ?/sec
simple_iter/base                                         1.00     11.0±0.06µs        ? ?/sec      1.25     13.7±0.10µs        ? ?/sec
simple_iter/foreach                                      1.01     10.9±0.06µs        ? ?/sec      1.00     10.8±0.10µs        ? ?/sec
simple_iter/foreach_wide                                 1.00     44.2±0.66µs        ? ?/sec      1.10     48.4±4.06µs        ? ?/sec
simple_iter/sparse                                       1.00     47.8±0.44µs        ? ?/sec      1.15     54.8±0.58µs        ? ?/sec
simple_iter/sparse_foreach                               1.00     43.7±2.89µs        ? ?/sec      1.12     49.1±0.33µs        ? ?/sec
simple_iter/sparse_foreach_wide                          1.00    241.9±3.14µs        ? ?/sec      1.08    262.1±4.61µs        ? ?/sec
simple_iter/sparse_wide                                  1.00    252.6±2.16µs        ? ?/sec      1.11    281.6±4.89µs        ? ?/sec
simple_iter/system                                       1.00     11.0±0.13µs        ? ?/sec      1.25     13.7±0.15µs        ? ?/sec
simple_iter/wide                                         1.00     55.5±0.22µs        ? ?/sec      1.16     64.3±0.52µs        ? ?/sec
sized_commands_0_bytes/2000_commands                     1.13      5.1±0.08µs        ? ?/sec      1.00      4.5±0.03µs        ? ?/sec
sized_commands_0_bytes/4000_commands                     1.11     10.2±0.09µs        ? ?/sec      1.00      9.1±0.06µs        ? ?/sec
sized_commands_0_bytes/6000_commands                     1.13     15.4±0.20µs        ? ?/sec      1.00     13.7±0.11µs        ? ?/sec
sized_commands_0_bytes/8000_commands                     1.12     20.4±0.28µs        ? ?/sec      1.00     18.3±0.22µs        ? ?/sec
sized_commands_12_bytes/2000_commands                    1.00      7.1±0.06µs        ? ?/sec      1.00      7.1±0.04µs        ? ?/sec
sized_commands_12_bytes/4000_commands                    1.00     14.4±0.07µs        ? ?/sec      1.01     14.6±0.23µs        ? ?/sec
sized_commands_12_bytes/6000_commands                    1.01     22.1±0.33µs        ? ?/sec      1.00     21.8±0.31µs        ? ?/sec
sized_commands_12_bytes/8000_commands                    1.00     28.9±0.40µs        ? ?/sec      1.00     28.9±0.44µs        ? ?/sec
sized_commands_512_bytes/2000_commands                   1.00    108.9±2.50µs        ? ?/sec      1.02    110.8±2.61µs        ? ?/sec
sized_commands_512_bytes/4000_commands                   1.00   224.9±23.23µs        ? ?/sec      1.01   226.1±12.77µs        ? ?/sec
sized_commands_512_bytes/6000_commands                   1.00   344.3±44.37µs        ? ?/sec      1.01   347.7±45.95µs        ? ?/sec
sized_commands_512_bytes/8000_commands                   1.01   471.0±74.12µs        ? ?/sec      1.00   466.3±58.84µs        ? ?/sec
sparse_fragmented_iter/base                              1.02     11.7±1.03ns        ? ?/sec      1.00     11.6±0.57ns        ? ?/sec
sparse_fragmented_iter/foreach                           1.00      9.0±0.32ns        ? ?/sec      1.03      9.2±0.24ns        ? ?/sec
sparse_fragmented_iter/foreach_wide                      1.00     42.1±1.16ns        ? ?/sec      1.06     44.5±5.25ns        ? ?/sec
sparse_fragmented_iter/wide                              1.00     72.8±5.68ns        ? ?/sec      1.18     85.9±5.85ns        ? ?/sec
spawn_commands/2000_entities                             1.06   275.0±38.92µs        ? ?/sec      1.00   259.1±22.15µs        ? ?/sec
spawn_commands/4000_entities                             1.00   518.1±42.46µs        ? ?/sec      1.00   518.6±30.19µs        ? ?/sec
spawn_commands/6000_entities                             1.03   777.7±83.67µs        ? ?/sec      1.00   753.6±54.53µs        ? ?/sec
spawn_commands/8000_entities                             1.01  1003.0±97.98µs        ? ?/sec      1.00   996.2±96.49µs        ? ?/sec
world_entity/50000_entities                              1.00    425.2±2.51µs        ? ?/sec      1.01    427.5±0.97µs        ? ?/sec
world_get/50000_entities_sparse                          1.00    562.1±9.75µs        ? ?/sec      1.03    578.7±5.37µs        ? ?/sec
world_get/50000_entities_table                           1.00   911.2±15.13µs        ? ?/sec      1.03    943.0±8.83µs        ? ?/sec
world_query_for_each/50000_entities_sparse               1.00     84.5±0.66µs        ? ?/sec      1.00     84.1±2.73µs        ? ?/sec
world_query_for_each/50000_entities_table                1.00     27.2±0.14µs        ? ?/sec      1.00     27.2±0.13µs        ? ?/sec
world_query_get/50000_entities_sparse                    1.00    464.8±8.55µs        ? ?/sec      1.04    483.1±4.63µs        ? ?/sec
world_query_get/50000_entities_sparse_wide               1.00  1420.2±53.48µs        ? ?/sec      1.07  1518.6±28.91µs        ? ?/sec
world_query_get/50000_entities_table                     1.00    402.2±3.20µs        ? ?/sec      1.09    437.5±4.37µs        ? ?/sec
world_query_get/50000_entities_table_wide                1.01    811.5±6.02µs        ? ?/sec      1.00    804.2±8.17µs        ? ?/sec
world_query_iter/50000_entities_sparse                   1.01     97.9±8.14µs        ? ?/sec      1.00     96.6±1.41µs        ? ?/sec
world_query_iter/50000_entities_table                    1.00     27.2±0.45µs        ? ?/sec      1.01     27.5±0.42µs        ? ?/sec

james7132 avatar Jun 29 '22 05:06 james7132

Cross-checking this further against the results from before and after switching to values over references (4d27afc), the regression does seem to line up with that change, and it's the only other substantive change since the last microbenchmark. I'm rerunning the microbenchmarks with that change reverted.
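For context, the two variants being compared differ only in how `entity` and `table_row` reach the fetch call. A minimal sketch under simplifying assumptions: `Entity` here is a hypothetical one-field stand-in, `SparseColumn` is a made-up storage indexed directly by entity id, and the methods take `&self` rather than `&mut self`.

```rust
/// Hypothetical stand-in for an entity id; small and `Copy`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Entity(u32);

/// Hypothetical sparse storage indexed directly by entity id.
struct SparseColumn<'a> {
    by_entity: &'a [f32],
}

impl<'a> SparseColumn<'a> {
    /// By-reference shape, mirroring `fetch(entity: &Entity, table_row: &usize)`.
    /// A sparse fetch ignores the table row.
    #[inline(always)]
    fn fetch_by_ref(&self, entity: &Entity, _table_row: &usize) -> f32 {
        self.by_entity[entity.0 as usize]
    }

    /// By-value shape: both arguments are copied into the call.
    #[inline(always)]
    fn fetch_by_value(&self, entity: Entity, _table_row: usize) -> f32 {
        self.by_entity[entity.0 as usize]
    }
}
```

Both methods return the same value. The open question in this thread is purely which one the compiler optimizes better once everything is inlined, which only the benchmarks can answer.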

james7132 avatar Jun 29 '22 05:06 james7132

Completed both benchmarks and a timing test against many_cubes and it does seem like it's coming from that change. The stage timings seem to suggest that Query::get is indeed faster with copied values over references, which is making Render faster (it's where the 0.3ms difference is likely coming from), but the microbenchmark seems to show the aforementioned regression. I think we should stick with the current pass-by-value results given the more practical stage timings, but I'll leave the decision up to @cart as to which one we should trust.

`many_cubes` stage timings

| stage/system | main | this PR | this PR (reference) |
|---|---|---|---|
| First | 355.95us | 344.28us | 342.84us |
| LoadAssets | 171.93us | 163.25us | 163.09us |
| PreUpdate | 92.43us | 85.53us | 84.4us |
| Update | 52.09us | 50.1us | 48.45us |
| PostUpdate | 2.06ms | 1.83ms | 1.83ms |
| AssetEvents | 158.36us | 153.82us | 153.35us |
| Last | 27.79us | 27.83us | 25.3us |
| Extract | 3.54ms | 3.4ms | 3.43ms |
| Prepare | 2.57ms | 2.34ms | 2.31ms |
| Queue | 868.45us | 825.1us | 827.37us |
| Sort | 218.88us | 209.41us | 206.23us |
| Render | 7.56ms | 7.27ms | 7.59ms |
| check_visibility | 1.3ms | 1.24ms | 1.25ms |
| check_visibility par_for_each (1024 entities) | 14.53us | 14.59us | 14.31us |
| extract_meshes | 1.55ms | 1.48ms | 1.42ms |
| extract_visible_components | 530.61us | 478.62us | 469.55us |
| prepare_uniform_components | 1.13ms | 1.03ms | 1.04ms |
| full frame | 18.12ms | 17.08ms | 17.42ms |

Microbenchmark Results
group                                                    cleanup-fetch                            cleanup-fetch-reference                  main
-----                                                    -------------                            -----------------------                  ----
add_remove_component/sparse_set                          1.02  1322.0±76.98µs        ? ?/sec      1.01  1309.9±74.91µs        ? ?/sec      1.00  1301.0±82.37µs        ? ?/sec
add_remove_component/table                               1.03  1682.7±49.97µs        ? ?/sec      1.05  1711.4±52.17µs        ? ?/sec      1.00  1629.8±35.87µs        ? ?/sec
add_remove_component_big/sparse_set                      1.00  1435.7±299.23µs        ? ?/sec     1.01  1453.0±256.37µs        ? ?/sec     1.03  1476.2±296.28µs        ? ?/sec
add_remove_component_big/table                           1.01      2.9±0.05ms        ? ?/sec      1.01      2.9±0.19ms        ? ?/sec      1.00      2.9±0.23ms        ? ?/sec
added_archetypes/archetype_count/100                     1.10   186.2±10.35µs        ? ?/sec      1.00   169.6±12.51µs        ? ?/sec      1.09    185.6±9.29µs        ? ?/sec
added_archetypes/archetype_count/1000                    1.00   688.2±20.12µs        ? ?/sec      1.04   714.3±15.19µs        ? ?/sec      1.06   728.0±50.14µs        ? ?/sec
added_archetypes/archetype_count/10000                   1.05     14.2±1.39ms        ? ?/sec      1.00     13.6±1.15ms        ? ?/sec      1.07     14.6±2.00ms        ? ?/sec
added_archetypes/archetype_count/200                     1.03   234.2±10.67µs        ? ?/sec      1.05   236.9±10.99µs        ? ?/sec      1.00   226.6±12.16µs        ? ?/sec
added_archetypes/archetype_count/2000                    1.00  1355.2±33.30µs        ? ?/sec      1.03  1391.9±51.10µs        ? ?/sec      1.08  1465.4±125.18µs        ? ?/sec
added_archetypes/archetype_count/500                     1.00   403.4±48.12µs        ? ?/sec      1.03   414.0±10.68µs        ? ?/sec      1.02   413.3±29.36µs        ? ?/sec
added_archetypes/archetype_count/5000                    1.01      4.8±0.56ms        ? ?/sec      1.00      4.8±0.37ms        ? ?/sec      1.16      5.5±0.83ms        ? ?/sec
busy_systems/01x_entities_03_systems                     1.27     40.4±2.18µs        ? ?/sec      1.00     31.9±1.04µs        ? ?/sec      1.09     34.9±1.19µs        ? ?/sec
busy_systems/01x_entities_06_systems                     1.19     78.1±2.53µs        ? ?/sec      1.00     65.7±2.16µs        ? ?/sec      1.01     66.6±2.78µs        ? ?/sec
busy_systems/01x_entities_09_systems                     1.26    118.6±4.26µs        ? ?/sec      1.06     99.3±2.69µs        ? ?/sec      1.00     94.0±2.86µs        ? ?/sec
busy_systems/01x_entities_12_systems                     1.20    147.7±4.45µs        ? ?/sec      1.03    127.0±6.14µs        ? ?/sec      1.00    123.1±5.81µs        ? ?/sec
busy_systems/01x_entities_15_systems                     1.16    178.7±4.28µs        ? ?/sec      1.03    158.3±5.73µs        ? ?/sec      1.00    153.6±4.88µs        ? ?/sec
busy_systems/02x_entities_03_systems                     1.26     75.5±4.10µs        ? ?/sec      1.04     62.2±3.28µs        ? ?/sec      1.00     59.9±2.85µs        ? ?/sec
busy_systems/02x_entities_06_systems                     1.21    137.2±3.85µs        ? ?/sec      1.00    113.2±4.10µs        ? ?/sec      1.05    119.0±7.39µs        ? ?/sec
busy_systems/02x_entities_09_systems                     1.30    217.6±6.59µs        ? ?/sec      1.00    167.0±4.14µs        ? ?/sec      1.03    171.9±4.08µs        ? ?/sec
busy_systems/02x_entities_12_systems                     1.18    270.3±6.97µs        ? ?/sec      1.00   228.4±10.32µs        ? ?/sec      1.01    230.6±8.73µs        ? ?/sec
busy_systems/02x_entities_15_systems                     1.20   334.5±11.35µs        ? ?/sec      1.02    283.8±7.82µs        ? ?/sec      1.00    278.8±6.34µs        ? ?/sec
busy_systems/03x_entities_03_systems                     1.17    102.5±4.59µs        ? ?/sec      1.00     88.0±5.52µs        ? ?/sec      1.14    100.3±4.70µs        ? ?/sec
busy_systems/03x_entities_06_systems                     1.21    201.1±8.64µs        ? ?/sec      1.00    166.1±5.74µs        ? ?/sec      1.05    174.8±5.38µs        ? ?/sec
busy_systems/03x_entities_09_systems                     1.30   323.3±13.82µs        ? ?/sec      1.01    250.8±7.15µs        ? ?/sec      1.00    247.9±6.90µs        ? ?/sec
busy_systems/03x_entities_12_systems                     1.21   389.5±14.12µs        ? ?/sec      1.05   336.7±11.48µs        ? ?/sec      1.00    320.9±9.25µs        ? ?/sec
busy_systems/03x_entities_15_systems                     1.18   482.4±12.17µs        ? ?/sec      1.00    408.0±9.53µs        ? ?/sec      1.00    407.1±8.99µs        ? ?/sec
busy_systems/04x_entities_03_systems                     1.25    138.9±7.21µs        ? ?/sec      1.03    114.1±3.90µs        ? ?/sec      1.00    111.1±4.86µs        ? ?/sec
busy_systems/04x_entities_06_systems                     1.22   273.1±12.83µs        ? ?/sec      1.01   225.6±12.09µs        ? ?/sec      1.00    223.3±6.85µs        ? ?/sec
busy_systems/04x_entities_09_systems                     1.27   416.1±12.86µs        ? ?/sec      1.00   327.7±11.90µs        ? ?/sec      1.03   336.6±12.67µs        ? ?/sec
busy_systems/04x_entities_12_systems                     1.22   513.7±16.52µs        ? ?/sec      1.04   439.9±15.52µs        ? ?/sec      1.00   421.3±12.56µs        ? ?/sec
busy_systems/04x_entities_15_systems                     1.16   615.1±26.75µs        ? ?/sec      1.04   551.2±23.65µs        ? ?/sec      1.00   532.2±15.00µs        ? ?/sec
busy_systems/05x_entities_03_systems                     1.32   176.5±10.36µs        ? ?/sec      1.17    156.4±8.72µs        ? ?/sec      1.00    133.6±4.76µs        ? ?/sec
busy_systems/05x_entities_06_systems                     1.40   366.8±14.17µs        ? ?/sec      1.06   279.3±14.23µs        ? ?/sec      1.00    262.7±8.66µs        ? ?/sec
busy_systems/05x_entities_09_systems                     1.25   514.7±20.64µs        ? ?/sec      1.03   422.6±29.00µs        ? ?/sec      1.00   410.9±12.45µs        ? ?/sec
busy_systems/05x_entities_12_systems                     1.20   659.0±22.17µs        ? ?/sec      1.02   556.9±22.44µs        ? ?/sec      1.00   547.0±13.86µs        ? ?/sec
busy_systems/05x_entities_15_systems                     1.27   838.4±30.31µs        ? ?/sec      1.02   675.9±21.57µs        ? ?/sec      1.00   659.7±18.72µs        ? ?/sec
contrived/01x_entities_03_systems                        1.21     27.2±0.52µs        ? ?/sec      1.04     23.5±1.38µs        ? ?/sec      1.00     22.5±1.62µs        ? ?/sec
contrived/01x_entities_06_systems                        1.25     52.8±1.32µs        ? ?/sec      1.09     46.4±3.15µs        ? ?/sec      1.00     42.4±2.02µs        ? ?/sec
contrived/01x_entities_09_systems                        1.21     75.1±2.46µs        ? ?/sec      1.07     66.4±2.19µs        ? ?/sec      1.00     62.0±2.42µs        ? ?/sec
contrived/01x_entities_12_systems                        1.21     98.9±1.88µs        ? ?/sec      1.06     86.7±5.42µs        ? ?/sec      1.00     81.6±4.21µs        ? ?/sec
contrived/01x_entities_15_systems                        1.25    125.8±3.28µs        ? ?/sec      1.02    103.0±5.06µs        ? ?/sec      1.00    100.6±6.48µs        ? ?/sec
contrived/02x_entities_03_systems                        1.37     45.4±1.72µs        ? ?/sec      1.09     36.1±1.23µs        ? ?/sec      1.00     33.1±2.56µs        ? ?/sec
contrived/02x_entities_06_systems                        1.21     79.0±1.47µs        ? ?/sec      1.05     68.2±3.08µs        ? ?/sec      1.00     65.2±4.73µs        ? ?/sec
contrived/02x_entities_09_systems                        1.22    116.3±2.60µs        ? ?/sec      1.06    101.3±4.39µs        ? ?/sec      1.00     95.1±4.00µs        ? ?/sec
contrived/02x_entities_12_systems                        1.22    152.8±4.98µs        ? ?/sec      1.03    128.7±9.98µs        ? ?/sec      1.00    125.2±3.68µs        ? ?/sec
contrived/02x_entities_15_systems                        1.23    186.0±3.24µs        ? ?/sec      1.05    157.7±9.87µs        ? ?/sec      1.00    150.7±7.38µs        ? ?/sec
contrived/03x_entities_03_systems                        1.31     54.8±1.61µs        ? ?/sec      1.03     43.0±3.19µs        ? ?/sec      1.00     41.9±2.10µs        ? ?/sec
contrived/03x_entities_06_systems                        1.28    105.3±2.34µs        ? ?/sec      1.09     89.5±4.53µs        ? ?/sec      1.00     82.0±2.44µs        ? ?/sec
contrived/03x_entities_09_systems                        1.25    157.9±3.23µs        ? ?/sec      1.12    142.2±2.78µs        ? ?/sec      1.00    126.7±5.21µs        ? ?/sec
contrived/03x_entities_12_systems                        1.17    196.6±5.02µs        ? ?/sec      1.09    182.7±7.39µs        ? ?/sec      1.00    167.6±5.79µs        ? ?/sec
contrived/03x_entities_15_systems                        1.13    238.8±5.80µs        ? ?/sec      1.00    211.0±4.92µs        ? ?/sec      1.01   212.2±10.71µs        ? ?/sec
contrived/04x_entities_03_systems                        1.39     73.1±2.13µs        ? ?/sec      1.10     58.1±2.11µs        ? ?/sec      1.00     52.7±2.80µs        ? ?/sec
contrived/04x_entities_06_systems                        1.23    133.8±3.18µs        ? ?/sec      1.00    109.1±8.09µs        ? ?/sec      1.03    111.8±8.56µs        ? ?/sec
contrived/04x_entities_09_systems                        1.23    189.2±5.33µs        ? ?/sec      1.04    160.0±5.09µs        ? ?/sec      1.00    154.0±4.34µs        ? ?/sec
contrived/04x_entities_12_systems                        1.17    241.6±5.61µs        ? ?/sec      1.04    214.5±6.91µs        ? ?/sec      1.00    206.2±8.28µs        ? ?/sec
contrived/04x_entities_15_systems                        1.12    295.4±7.79µs        ? ?/sec      1.02    268.6±5.91µs        ? ?/sec      1.00    262.9±9.39µs        ? ?/sec
contrived/05x_entities_03_systems                        1.38     84.0±2.42µs        ? ?/sec      1.07     65.1±3.21µs        ? ?/sec      1.00     60.9±1.98µs        ? ?/sec
contrived/05x_entities_06_systems                        1.32    159.1±3.13µs        ? ?/sec      1.07    128.8±5.30µs        ? ?/sec      1.00    120.1±2.42µs        ? ?/sec
contrived/05x_entities_09_systems                        1.32    241.3±5.63µs        ? ?/sec      1.04    190.3±6.25µs        ? ?/sec      1.00    182.5±5.08µs        ? ?/sec
contrived/05x_entities_12_systems                        1.20   299.3±12.07µs        ? ?/sec      1.01    251.3±6.89µs        ? ?/sec      1.00   248.8±10.79µs        ? ?/sec
contrived/05x_entities_15_systems                        1.18    362.3±8.39µs        ? ?/sec      1.00    306.2±9.39µs        ? ?/sec      1.00   306.5±16.78µs        ? ?/sec
fake_commands/2000_commands                              1.06      7.2±0.23µs        ? ?/sec      1.10      7.6±0.10µs        ? ?/sec      1.00      6.9±0.15µs        ? ?/sec
fake_commands/4000_commands                              1.13     15.2±0.62µs        ? ?/sec      1.08     14.4±0.15µs        ? ?/sec      1.00     13.4±0.12µs        ? ?/sec
fake_commands/6000_commands                              1.13     22.6±0.80µs        ? ?/sec      1.09     21.7±0.27µs        ? ?/sec      1.00     20.0±0.15µs        ? ?/sec
fake_commands/8000_commands                              1.07     28.8±0.29µs        ? ?/sec      1.08     28.9±0.48µs        ? ?/sec      1.00     26.8±0.19µs        ? ?/sec
fragmented_iter/base                                     1.00   352.2±10.92ns        ? ?/sec      1.00   350.7±11.30ns        ? ?/sec      1.34   470.4±31.26ns        ? ?/sec
fragmented_iter/foreach                                  1.02   245.8±26.37ns        ? ?/sec      1.00   241.9±23.40ns        ? ?/sec      1.00   242.8±23.09ns        ? ?/sec
fragmented_iter/foreach_wide                             1.01      4.0±0.23µs        ? ?/sec      1.00      4.0±0.13µs        ? ?/sec      1.03      4.1±0.54µs        ? ?/sec
fragmented_iter/wide                                     1.02      4.5±0.19µs        ? ?/sec      1.00      4.5±0.10µs        ? ?/sec      1.37      6.1±0.21µs        ? ?/sec
get_component/base                                       1.00  1051.9±13.21µs        ? ?/sec      1.02  1070.2±32.42µs        ? ?/sec      1.06  1112.2±43.40µs        ? ?/sec
get_component/system                                     1.01   763.4±33.24µs        ? ?/sec      1.00    753.0±7.72µs        ? ?/sec      1.07   806.3±22.00µs        ? ?/sec
get_or_spawn/batched                                     1.01   419.0±53.91µs        ? ?/sec      1.00   414.8±37.75µs        ? ?/sec      1.02   421.2±47.27µs        ? ?/sec
get_or_spawn/individual                                  1.03   948.7±77.17µs        ? ?/sec      1.00   922.5±69.75µs        ? ?/sec      1.02   937.8±78.09µs        ? ?/sec
heavy_compute/base                                       1.01    361.4±3.93µs        ? ?/sec      1.00    356.5±2.61µs        ? ?/sec      1.00    357.9±3.21µs        ? ?/sec
insert_commands/insert                                   1.03   800.2±31.11µs        ? ?/sec      1.00   775.3±34.02µs        ? ?/sec      1.07   830.3±99.12µs        ? ?/sec
insert_commands/insert_batch                             1.00   403.5±40.45µs        ? ?/sec      1.00   401.7±30.43µs        ? ?/sec      1.02   410.4±38.46µs        ? ?/sec
query_get/50000_entities_sparse                          1.00   643.1±45.04µs        ? ?/sec      1.01   647.3±56.03µs        ? ?/sec      1.99  1280.2±34.58µs        ? ?/sec
query_get/50000_entities_table                           1.03   577.9±10.12µs        ? ?/sec      1.00   558.9±14.07µs        ? ?/sec      1.32   737.7±12.53µs        ? ?/sec
query_get_component/50000_entities_sparse                1.02  1225.8±95.93µs        ? ?/sec      1.00  1205.9±41.28µs        ? ?/sec      1.00  1209.1±65.14µs        ? ?/sec
query_get_component/50000_entities_table                 1.04  1244.6±112.45µs        ? ?/sec     1.01  1205.1±21.94µs        ? ?/sec      1.00  1192.1±17.73µs        ? ?/sec
simple_iter/base                                         1.00     11.0±0.06µs        ? ?/sec      1.03     11.3±0.20µs        ? ?/sec      1.25     13.7±0.10µs        ? ?/sec
simple_iter/foreach                                      1.01     10.9±0.06µs        ? ?/sec      1.01     10.9±0.11µs        ? ?/sec      1.00     10.8±0.10µs        ? ?/sec
simple_iter/foreach_wide                                 1.00     44.2±0.66µs        ? ?/sec      1.00     44.1±0.15µs        ? ?/sec      1.10     48.4±4.06µs        ? ?/sec
simple_iter/sparse                                       1.00     47.8±0.44µs        ? ?/sec      1.00     47.6±0.55µs        ? ?/sec      1.15     54.8±0.58µs        ? ?/sec
simple_iter/sparse_foreach                               1.01     43.7±2.89µs        ? ?/sec      1.00     43.0±0.51µs        ? ?/sec      1.14     49.1±0.33µs        ? ?/sec
simple_iter/sparse_foreach_wide                          1.00    241.9±3.14µs        ? ?/sec      1.00    241.9±1.55µs        ? ?/sec      1.08    262.1±4.61µs        ? ?/sec
simple_iter/sparse_wide                                  1.00    252.6±2.16µs        ? ?/sec      1.04   263.6±59.57µs        ? ?/sec      1.11    281.6±4.89µs        ? ?/sec
simple_iter/system                                       1.00     11.0±0.13µs        ? ?/sec      1.00     11.0±0.14µs        ? ?/sec      1.25     13.7±0.15µs        ? ?/sec
simple_iter/wide                                         1.01     55.5±0.22µs        ? ?/sec      1.00     54.8±0.34µs        ? ?/sec      1.17     64.3±0.52µs        ? ?/sec
sized_commands_0_bytes/2000_commands                     1.13      5.1±0.08µs        ? ?/sec      1.11      5.1±0.04µs        ? ?/sec      1.00      4.5±0.03µs        ? ?/sec
sized_commands_0_bytes/4000_commands                     1.11     10.2±0.09µs        ? ?/sec      1.11     10.1±0.25µs        ? ?/sec      1.00      9.1±0.06µs        ? ?/sec
sized_commands_0_bytes/6000_commands                     1.13     15.4±0.20µs        ? ?/sec      1.11     15.2±0.05µs        ? ?/sec      1.00     13.7±0.11µs        ? ?/sec
sized_commands_0_bytes/8000_commands                     1.12     20.4±0.28µs        ? ?/sec      1.11     20.3±0.08µs        ? ?/sec      1.00     18.3±0.22µs        ? ?/sec
sized_commands_12_bytes/2000_commands                    1.00      7.1±0.06µs        ? ?/sec      1.00      7.1±0.03µs        ? ?/sec      1.00      7.1±0.04µs        ? ?/sec
sized_commands_12_bytes/4000_commands                    1.00     14.4±0.07µs        ? ?/sec      1.00     14.5±0.06µs        ? ?/sec      1.01     14.6±0.23µs        ? ?/sec
sized_commands_12_bytes/6000_commands                    1.02     22.1±0.33µs        ? ?/sec      1.00     21.7±0.09µs        ? ?/sec      1.00     21.8±0.31µs        ? ?/sec
sized_commands_12_bytes/8000_commands                    1.01     28.9±0.40µs        ? ?/sec      1.00     28.8±0.11µs        ? ?/sec      1.00     28.9±0.44µs        ? ?/sec
sized_commands_512_bytes/2000_commands                   1.07    108.9±2.50µs        ? ?/sec      1.00    101.8±3.16µs        ? ?/sec      1.09    110.8±2.61µs        ? ?/sec
sized_commands_512_bytes/4000_commands                   1.08   224.9±23.23µs        ? ?/sec      1.00   208.8±17.81µs        ? ?/sec      1.08   226.1±12.77µs        ? ?/sec
sized_commands_512_bytes/6000_commands                   1.08   344.3±44.37µs        ? ?/sec      1.00   318.5±36.79µs        ? ?/sec      1.09   347.7±45.95µs        ? ?/sec
sized_commands_512_bytes/8000_commands                   1.09   471.0±74.12µs        ? ?/sec      1.00   431.1±60.04µs        ? ?/sec      1.08   466.3±58.84µs        ? ?/sec
sparse_fragmented_iter/base                              1.08     11.7±1.03ns        ? ?/sec      1.00     10.8±0.60ns        ? ?/sec      1.07     11.6±0.57ns        ? ?/sec
sparse_fragmented_iter/foreach                           1.00      9.0±0.32ns        ? ?/sec      1.01      9.1±0.61ns        ? ?/sec      1.03      9.2±0.24ns        ? ?/sec
sparse_fragmented_iter/foreach_wide                      1.00     42.1±1.16ns        ? ?/sec      1.01     42.5±5.03ns        ? ?/sec      1.06     44.5±5.25ns        ? ?/sec
sparse_fragmented_iter/wide                              1.00     72.8±5.68ns        ? ?/sec      1.00     72.7±3.58ns        ? ?/sec      1.18     85.9±5.85ns        ? ?/sec
spawn_commands/2000_entities                             1.09   275.0±38.92µs        ? ?/sec      1.00   253.0±17.34µs        ? ?/sec      1.02   259.1±22.15µs        ? ?/sec
spawn_commands/4000_entities                             1.02   518.1±42.46µs        ? ?/sec      1.00   507.5±33.03µs        ? ?/sec      1.02   518.6±30.19µs        ? ?/sec
spawn_commands/6000_entities                             1.03   777.7±83.67µs        ? ?/sec      1.03   774.4±80.24µs        ? ?/sec      1.00   753.6±54.53µs        ? ?/sec
spawn_commands/8000_entities                             1.02  1003.0±97.98µs        ? ?/sec      1.00   983.7±74.44µs        ? ?/sec      1.01   996.2±96.49µs        ? ?/sec
world_entity/50000_entities                              1.00    425.2±2.51µs        ? ?/sec      1.00    424.8±0.65µs        ? ?/sec      1.01    427.5±0.97µs        ? ?/sec
world_get/50000_entities_sparse                          1.00    562.1±9.75µs        ? ?/sec      1.03   581.6±19.74µs        ? ?/sec      1.03    578.7±5.37µs        ? ?/sec
world_get/50000_entities_table                           1.00   911.2±15.13µs        ? ?/sec      1.00    912.7±7.54µs        ? ?/sec      1.03    943.0±8.83µs        ? ?/sec
world_query_for_each/50000_entities_sparse               1.00     84.5±0.66µs        ? ?/sec      1.01     84.6±1.09µs        ? ?/sec      1.00     84.1±2.73µs        ? ?/sec
world_query_for_each/50000_entities_table                1.00     27.2±0.14µs        ? ?/sec      1.00     27.2±0.13µs        ? ?/sec      1.00     27.2±0.13µs        ? ?/sec
world_query_get/50000_entities_sparse                    1.00    464.8±8.55µs        ? ?/sec      1.00    464.5±3.51µs        ? ?/sec      1.04    483.1±4.63µs        ? ?/sec
world_query_get/50000_entities_sparse_wide               1.00  1420.2±53.48µs        ? ?/sec      1.01  1437.3±37.08µs        ? ?/sec      1.07  1518.6±28.91µs        ? ?/sec
world_query_get/50000_entities_table                     1.00    402.2±3.20µs        ? ?/sec      1.00    403.1±9.38µs        ? ?/sec      1.09    437.5±4.37µs        ? ?/sec
world_query_get/50000_entities_table_wide                1.01    811.5±6.02µs        ? ?/sec      1.01    812.6±5.39µs        ? ?/sec      1.00    804.2±8.17µs        ? ?/sec
world_query_iter/50000_entities_sparse                   1.02     97.9±8.14µs        ? ?/sec      1.00     96.1±1.48µs        ? ?/sec      1.00     96.6±1.41µs        ? ?/sec
world_query_iter/50000_entities_table                    1.00     27.2±0.45µs        ? ?/sec      1.00     27.2±0.26µs        ? ?/sec      1.01     27.5±0.42µs        ? ?/sec

james7132 avatar Jun 29 '22 06:06 james7132

One more thing I noticed is that this PR, regardless of which version (reference or value), substantially closes the gap between Query::iter and Query::for_each on the fragmented iteration cases. This does seem to suggest that there are just a few extra blockers before the two are effectively equivalent, which would make #4060 viable to merge.

james7132 avatar Jun 29 '22 07:06 james7132

Seeing how eagerly fetching entities/rows speeds up iteration, I noticed the same pattern in how set_archetype fetches the table: we're passing in a &Tables reference, and each Fetch individually re-queries it for the same table. I changed set_archetype to take a Table instead of Tables, and predictably, both Query::get and fragmented iteration saw another jump in perf.
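
To make the shape of that change concrete, here is a minimal sketch of hoisting the table lookup out of each fetch. The `Table`, `Tables`, `Archetype`, `FetchBefore`, and `FetchAfter` types are simplified, hypothetical stand-ins written for illustration only; they are not bevy_ecs's real types or signatures.

```rust
// Hypothetical, simplified stand-ins; NOT the real bevy_ecs types.
struct Table {
    column: Vec<u32>,
}

struct Tables {
    tables: Vec<Table>,
}

struct Archetype {
    table_id: usize,
}

struct FetchBefore<'w> {
    column: &'w [u32],
}

impl<'w> FetchBefore<'w> {
    // Before: each fetch gets the whole collection and indexes it itself,
    // so N fetches in one query repeat the same lookup N times.
    fn set_archetype(&mut self, archetype: &Archetype, tables: &'w Tables) {
        self.column = &tables.tables[archetype.table_id].column;
    }
}

struct FetchAfter<'w> {
    column: &'w [u32],
}

impl<'w> FetchAfter<'w> {
    // After: the caller resolves the table once and hands it to every fetch.
    fn set_archetype(&mut self, _archetype: &Archetype, table: &'w Table) {
        self.column = &table.column;
    }
}

fn main() {
    let tables = Tables {
        tables: vec![
            Table { column: vec![1, 2, 3] },
            Table { column: vec![10, 20] },
        ],
    };
    let archetype = Archetype { table_id: 1 };

    // Old shape: the lookup happens inside the fetch.
    let mut old = FetchBefore { column: &[] };
    old.set_archetype(&archetype, &tables);

    // New shape: one lookup in the outer code, shared by all fetches.
    let table = &tables.tables[archetype.table_id];
    let mut a = FetchAfter { column: &[] };
    let mut b = FetchAfter { column: &[] };
    a.set_archetype(&archetype, table);
    b.set_archetype(&archetype, table);

    assert_eq!(old.column, a.column);
    assert_eq!(a.column, b.column);
    println!("column for archetype: {:?}", a.column);
}
```

Both shapes end up pointing at the same column; the only difference is where the indexing into the table collection happens, which is presumably why the gain shows up in Query::get and fragmented iteration, where set_archetype is called often.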

Microbenchmark Results
group                                                    cleanup-fetch                            main                                     more-cleanup
-----                                                    -------------                            ----                                     ------------
add_remove_component/sparse_set                          1.02  1322.0±76.98µs        ? ?/sec      1.00  1301.0±82.37µs        ? ?/sec      1.01  1311.7±76.75µs        ? ?/sec
add_remove_component/table                               1.03  1682.7±49.97µs        ? ?/sec      1.00  1629.8±35.87µs        ? ?/sec      1.04  1688.4±48.99µs        ? ?/sec
add_remove_component_big/sparse_set                      1.00  1435.7±299.23µs        ? ?/sec     1.03  1476.2±296.28µs        ? ?/sec     1.03  1472.5±317.71µs        ? ?/sec
add_remove_component_big/table                           1.01      2.9±0.05ms        ? ?/sec      1.00      2.9±0.23ms        ? ?/sec      1.00      2.9±0.19ms        ? ?/sec
added_archetypes/archetype_count/100                     1.00   186.2±10.35µs        ? ?/sec      1.00    185.6±9.29µs        ? ?/sec      1.00    185.7±9.51µs        ? ?/sec
added_archetypes/archetype_count/1000                    1.00   688.2±20.12µs        ? ?/sec      1.06   728.0±50.14µs        ? ?/sec      1.01   692.8±10.88µs        ? ?/sec
added_archetypes/archetype_count/10000                   1.01     14.2±1.39ms        ? ?/sec      1.04     14.6±2.00ms        ? ?/sec      1.00     14.0±1.53ms        ? ?/sec
added_archetypes/archetype_count/200                     1.03   234.2±10.67µs        ? ?/sec      1.00   226.6±12.16µs        ? ?/sec      1.01   228.8±11.26µs        ? ?/sec
added_archetypes/archetype_count/2000                    1.01  1355.2±33.30µs        ? ?/sec      1.09  1465.4±125.18µs        ? ?/sec     1.00  1344.3±31.27µs        ? ?/sec
added_archetypes/archetype_count/500                     1.01   403.4±48.12µs        ? ?/sec      1.03   413.3±29.36µs        ? ?/sec      1.00   400.8±26.20µs        ? ?/sec
added_archetypes/archetype_count/5000                    1.09      4.8±0.56ms        ? ?/sec      1.25      5.5±0.83ms        ? ?/sec      1.00      4.4±0.50ms        ? ?/sec
busy_systems/01x_entities_03_systems                     1.16     40.4±2.18µs        ? ?/sec      1.00     34.9±1.19µs        ? ?/sec      1.01     35.2±3.01µs        ? ?/sec
busy_systems/01x_entities_06_systems                     1.22     78.1±2.53µs        ? ?/sec      1.04     66.6±2.78µs        ? ?/sec      1.00     64.1±2.56µs        ? ?/sec
busy_systems/01x_entities_09_systems                     1.26    118.6±4.26µs        ? ?/sec      1.00     94.0±2.86µs        ? ?/sec      1.04     97.8±2.52µs        ? ?/sec
busy_systems/01x_entities_12_systems                     1.20    147.7±4.45µs        ? ?/sec      1.00    123.1±5.81µs        ? ?/sec      1.02    125.9±5.54µs        ? ?/sec
busy_systems/01x_entities_15_systems                     1.16    178.7±4.28µs        ? ?/sec      1.00    153.6±4.88µs        ? ?/sec      1.03    158.3±7.83µs        ? ?/sec
busy_systems/02x_entities_03_systems                     1.26     75.5±4.10µs        ? ?/sec      1.00     59.9±2.85µs        ? ?/sec      1.02     61.3±3.69µs        ? ?/sec
busy_systems/02x_entities_06_systems                     1.17    137.2±3.85µs        ? ?/sec      1.02    119.0±7.39µs        ? ?/sec      1.00    116.8±5.85µs        ? ?/sec
busy_systems/02x_entities_09_systems                     1.27    217.6±6.59µs        ? ?/sec      1.00    171.9±4.08µs        ? ?/sec      1.03    176.5±4.26µs        ? ?/sec
busy_systems/02x_entities_12_systems                     1.17    270.3±6.97µs        ? ?/sec      1.00    230.6±8.73µs        ? ?/sec      1.00    230.4±7.77µs        ? ?/sec
busy_systems/02x_entities_15_systems                     1.20   334.5±11.35µs        ? ?/sec      1.00    278.8±6.34µs        ? ?/sec      1.03    287.7±7.33µs        ? ?/sec
busy_systems/03x_entities_03_systems                     1.13    102.5±4.59µs        ? ?/sec      1.10    100.3±4.70µs        ? ?/sec      1.00     90.9±5.51µs        ? ?/sec
busy_systems/03x_entities_06_systems                     1.18    201.1±8.64µs        ? ?/sec      1.02    174.8±5.38µs        ? ?/sec      1.00    170.7±6.05µs        ? ?/sec
busy_systems/03x_entities_09_systems                     1.32   323.3±13.82µs        ? ?/sec      1.01    247.9±6.90µs        ? ?/sec      1.00    245.4±9.12µs        ? ?/sec
busy_systems/03x_entities_12_systems                     1.21   389.5±14.12µs        ? ?/sec      1.00    320.9±9.25µs        ? ?/sec      1.01   324.5±11.89µs        ? ?/sec
busy_systems/03x_entities_15_systems                     1.18   482.4±12.17µs        ? ?/sec      1.00    407.1±8.99µs        ? ?/sec      1.01   410.3±16.19µs        ? ?/sec
busy_systems/04x_entities_03_systems                     1.25    138.9±7.21µs        ? ?/sec      1.00    111.1±4.86µs        ? ?/sec      1.02    112.8±8.70µs        ? ?/sec
busy_systems/04x_entities_06_systems                     1.22   273.1±12.83µs        ? ?/sec      1.00    223.3±6.85µs        ? ?/sec      1.01   226.2±10.57µs        ? ?/sec
busy_systems/04x_entities_09_systems                     1.28   416.1±12.86µs        ? ?/sec      1.03   336.6±12.67µs        ? ?/sec      1.00   326.1±21.51µs        ? ?/sec
busy_systems/04x_entities_12_systems                     1.24   513.7±16.52µs        ? ?/sec      1.02   421.3±12.56µs        ? ?/sec      1.00   414.8±10.25µs        ? ?/sec
busy_systems/04x_entities_15_systems                     1.17   615.1±26.75µs        ? ?/sec      1.01   532.2±15.00µs        ? ?/sec      1.00   525.6±15.36µs        ? ?/sec
busy_systems/05x_entities_03_systems                     1.32   176.5±10.36µs        ? ?/sec      1.00    133.6±4.76µs        ? ?/sec      1.04    138.4±5.93µs        ? ?/sec
busy_systems/05x_entities_06_systems                     1.40   366.8±14.17µs        ? ?/sec      1.00    262.7±8.66µs        ? ?/sec      1.05   275.5±14.39µs        ? ?/sec
busy_systems/05x_entities_09_systems                     1.27   514.7±20.64µs        ? ?/sec      1.01   410.9±12.45µs        ? ?/sec      1.00   406.5±12.22µs        ? ?/sec
busy_systems/05x_entities_12_systems                     1.20   659.0±22.17µs        ? ?/sec      1.00   547.0±13.86µs        ? ?/sec      1.01   550.9±17.99µs        ? ?/sec
busy_systems/05x_entities_15_systems                     1.27   838.4±30.31µs        ? ?/sec      1.00   659.7±18.72µs        ? ?/sec      1.05   693.9±49.48µs        ? ?/sec
contrived/01x_entities_03_systems                        1.21     27.2±0.52µs        ? ?/sec      1.00     22.5±1.62µs        ? ?/sec      1.00     22.5±1.44µs        ? ?/sec
contrived/01x_entities_06_systems                        1.25     52.8±1.32µs        ? ?/sec      1.00     42.4±2.02µs        ? ?/sec      1.03     43.7±1.75µs        ? ?/sec
contrived/01x_entities_09_systems                        1.21     75.1±2.46µs        ? ?/sec      1.00     62.0±2.42µs        ? ?/sec      1.01     62.4±2.97µs        ? ?/sec
contrived/01x_entities_12_systems                        1.21     98.9±1.88µs        ? ?/sec      1.00     81.6±4.21µs        ? ?/sec      1.01     82.5±5.39µs        ? ?/sec
contrived/01x_entities_15_systems                        1.25    125.8±3.28µs        ? ?/sec      1.00    100.6±6.48µs        ? ?/sec      1.05    106.0±6.78µs        ? ?/sec
contrived/02x_entities_03_systems                        1.37     45.4±1.72µs        ? ?/sec      1.00     33.1±2.56µs        ? ?/sec      1.09     36.1±1.85µs        ? ?/sec
contrived/02x_entities_06_systems                        1.21     79.0±1.47µs        ? ?/sec      1.00     65.2±4.73µs        ? ?/sec      1.05     68.2±2.51µs        ? ?/sec
contrived/02x_entities_09_systems                        1.22    116.3±2.60µs        ? ?/sec      1.00     95.1±4.00µs        ? ?/sec      1.02     97.4±4.48µs        ? ?/sec
contrived/02x_entities_12_systems                        1.22    152.8±4.98µs        ? ?/sec      1.00    125.2±3.68µs        ? ?/sec      1.02    128.2±3.54µs        ? ?/sec
contrived/02x_entities_15_systems                        1.25    186.0±3.24µs        ? ?/sec      1.01    150.7±7.38µs        ? ?/sec      1.00    149.1±6.81µs        ? ?/sec
contrived/03x_entities_03_systems                        1.31     54.8±1.61µs        ? ?/sec      1.00     41.9±2.10µs        ? ?/sec      1.05     44.2±2.75µs        ? ?/sec
contrived/03x_entities_06_systems                        1.28    105.3±2.34µs        ? ?/sec      1.00     82.0±2.44µs        ? ?/sec      1.00     82.3±3.33µs        ? ?/sec
contrived/03x_entities_09_systems                        1.25    157.9±3.23µs        ? ?/sec      1.00    126.7±5.21µs        ? ?/sec      1.01    128.4±6.95µs        ? ?/sec
contrived/03x_entities_12_systems                        1.17    196.6±5.02µs        ? ?/sec      1.00    167.6±5.79µs        ? ?/sec      1.02    171.2±6.28µs        ? ?/sec
contrived/03x_entities_15_systems                        1.13    238.8±5.80µs        ? ?/sec      1.00   212.2±10.71µs        ? ?/sec      1.03    219.6±6.77µs        ? ?/sec
contrived/04x_entities_03_systems                        1.39     73.1±2.13µs        ? ?/sec      1.00     52.7±2.80µs        ? ?/sec      1.24     65.3±3.78µs        ? ?/sec
contrived/04x_entities_06_systems                        1.31    133.8±3.18µs        ? ?/sec      1.10    111.8±8.56µs        ? ?/sec      1.00    101.9±3.49µs        ? ?/sec
contrived/04x_entities_09_systems                        1.23    189.2±5.33µs        ? ?/sec      1.00    154.0±4.34µs        ? ?/sec      1.01    154.8±5.08µs        ? ?/sec
contrived/04x_entities_12_systems                        1.18    241.6±5.61µs        ? ?/sec      1.01    206.2±8.28µs        ? ?/sec      1.00    204.4±6.92µs        ? ?/sec
contrived/04x_entities_15_systems                        1.12    295.4±7.79µs        ? ?/sec      1.00    262.9±9.39µs        ? ?/sec      1.04    272.1±9.51µs        ? ?/sec
contrived/05x_entities_03_systems                        1.38     84.0±2.42µs        ? ?/sec      1.00     60.9±1.98µs        ? ?/sec      1.18     71.8±4.78µs        ? ?/sec
contrived/05x_entities_06_systems                        1.32    159.1±3.13µs        ? ?/sec      1.00    120.1±2.42µs        ? ?/sec      1.11    132.9±6.89µs        ? ?/sec
contrived/05x_entities_09_systems                        1.32    241.3±5.63µs        ? ?/sec      1.00    182.5±5.08µs        ? ?/sec      1.12   205.1±11.71µs        ? ?/sec
contrived/05x_entities_12_systems                        1.20   299.3±12.07µs        ? ?/sec      1.00   248.8±10.79µs        ? ?/sec      1.06    263.8±6.77µs        ? ?/sec
contrived/05x_entities_15_systems                        1.18    362.3±8.39µs        ? ?/sec      1.00   306.5±16.78µs        ? ?/sec      1.06   324.8±13.37µs        ? ?/sec
fragmented_iter/base                                     1.00   352.2±10.92ns        ? ?/sec      1.34   470.4±31.26ns        ? ?/sec      1.00   350.6±12.09ns        ? ?/sec
fragmented_iter/foreach                                  1.08   245.8±26.37ns        ? ?/sec      1.07   242.8±23.09ns        ? ?/sec      1.00   226.7±18.01ns        ? ?/sec
fragmented_iter/foreach_wide                             1.01      4.0±0.23µs        ? ?/sec      1.03      4.1±0.54µs        ? ?/sec      1.00      4.0±0.23µs        ? ?/sec
fragmented_iter/wide                                     1.00      4.5±0.19µs        ? ?/sec      1.34      6.1±0.21µs        ? ?/sec      1.18      5.3±0.15µs        ? ?/sec
query_get/50000_entities_sparse                          1.00   643.1±45.04µs        ? ?/sec      1.99  1280.2±34.58µs        ? ?/sec      1.04   670.6±11.62µs        ? ?/sec
query_get/50000_entities_table                           1.26   577.9±10.12µs        ? ?/sec      1.60   737.7±12.53µs        ? ?/sec      1.00   460.1±34.59µs        ? ?/sec
query_get_component/50000_entities_sparse                1.06  1225.8±95.93µs        ? ?/sec      1.04  1209.1±65.14µs        ? ?/sec      1.00  1159.8±29.57µs        ? ?/sec
query_get_component/50000_entities_table                 1.04  1244.6±112.45µs        ? ?/sec     1.00  1192.1±17.73µs        ? ?/sec      1.00  1195.1±34.28µs        ? ?/sec
simple_iter/base                                         1.00     11.0±0.06µs        ? ?/sec      1.25     13.7±0.10µs        ? ?/sec      1.00     10.9±0.07µs        ? ?/sec
simple_iter/foreach                                      1.01     10.9±0.06µs        ? ?/sec      1.00     10.8±0.10µs        ? ?/sec      1.01     10.9±0.04µs        ? ?/sec
simple_iter/foreach_wide                                 1.00     44.2±0.66µs        ? ?/sec      1.10     48.4±4.06µs        ? ?/sec      1.00     44.1±0.38µs        ? ?/sec
simple_iter/sparse                                       1.00     47.8±0.44µs        ? ?/sec      1.15     54.8±0.58µs        ? ?/sec      1.00     47.6±1.33µs        ? ?/sec
simple_iter/sparse_foreach                               1.03     43.7±2.89µs        ? ?/sec      1.16     49.1±0.33µs        ? ?/sec      1.00     42.3±0.45µs        ? ?/sec
simple_iter/sparse_foreach_wide                          1.06    241.9±3.14µs        ? ?/sec      1.15    262.1±4.61µs        ? ?/sec      1.00    228.9±2.71µs        ? ?/sec
simple_iter/sparse_wide                                  1.04    252.6±2.16µs        ? ?/sec      1.16    281.6±4.89µs        ? ?/sec      1.00    243.6±2.95µs        ? ?/sec
simple_iter/system                                       1.00     11.0±0.13µs        ? ?/sec      1.25     13.7±0.15µs        ? ?/sec      1.00     11.0±0.06µs        ? ?/sec
simple_iter/wide                                         1.09     55.5±0.22µs        ? ?/sec      1.26     64.3±0.52µs        ? ?/sec      1.00     51.0±1.06µs        ? ?/sec
sized_commands_0_bytes/2000_commands                     1.13      5.1±0.08µs        ? ?/sec      1.00      4.5±0.03µs        ? ?/sec      1.24      5.6±0.09µs        ? ?/sec
sized_commands_0_bytes/4000_commands                     1.11     10.2±0.09µs        ? ?/sec      1.00      9.1±0.06µs        ? ?/sec      1.23     11.2±0.07µs        ? ?/sec
sized_commands_0_bytes/6000_commands                     1.13     15.4±0.20µs        ? ?/sec      1.00     13.7±0.11µs        ? ?/sec      1.24     16.9±0.63µs        ? ?/sec
sized_commands_0_bytes/8000_commands                     1.12     20.4±0.28µs        ? ?/sec      1.00     18.3±0.22µs        ? ?/sec      1.23     22.5±0.10µs        ? ?/sec
sized_commands_12_bytes/2000_commands                    1.00      7.1±0.06µs        ? ?/sec      1.00      7.1±0.04µs        ? ?/sec      1.01      7.2±0.02µs        ? ?/sec
sized_commands_12_bytes/4000_commands                    1.01     14.4±0.07µs        ? ?/sec      1.02     14.6±0.23µs        ? ?/sec      1.00     14.3±0.09µs        ? ?/sec
sized_commands_12_bytes/6000_commands                    1.01     22.1±0.33µs        ? ?/sec      1.00     21.8±0.31µs        ? ?/sec      1.02     22.2±0.27µs        ? ?/sec
sized_commands_12_bytes/8000_commands                    1.00     28.9±0.40µs        ? ?/sec      1.00     28.9±0.44µs        ? ?/sec      1.00     28.9±0.16µs        ? ?/sec
sized_commands_512_bytes/2000_commands                   1.06    108.9±2.50µs        ? ?/sec      1.08    110.8±2.61µs        ? ?/sec      1.00    102.7±2.75µs        ? ?/sec
sized_commands_512_bytes/4000_commands                   1.07   224.9±23.23µs        ? ?/sec      1.07   226.1±12.77µs        ? ?/sec      1.00   210.4±15.70µs        ? ?/sec
sized_commands_512_bytes/6000_commands                   1.07   344.3±44.37µs        ? ?/sec      1.08   347.7±45.95µs        ? ?/sec      1.00   322.5±36.53µs        ? ?/sec
sized_commands_512_bytes/8000_commands                   1.08   471.0±74.12µs        ? ?/sec      1.07   466.3±58.84µs        ? ?/sec      1.00   436.6±70.12µs        ? ?/sec
sparse_fragmented_iter/base                              1.16     11.7±1.03ns        ? ?/sec      1.14     11.6±0.57ns        ? ?/sec      1.00     10.2±0.52ns        ? ?/sec
sparse_fragmented_iter/foreach                           1.03      9.0±0.32ns        ? ?/sec      1.05      9.2±0.24ns        ? ?/sec      1.00      8.7±0.32ns        ? ?/sec
sparse_fragmented_iter/foreach_wide                      1.00     42.1±1.16ns        ? ?/sec      1.06     44.5±5.25ns        ? ?/sec      1.06    44.7±17.05ns        ? ?/sec
sparse_fragmented_iter/wide                              1.12     72.8±5.68ns        ? ?/sec      1.32     85.9±5.85ns        ? ?/sec      1.00     65.1±4.36ns        ? ?/sec
world_entity/50000_entities                              1.00    425.2±2.51µs        ? ?/sec      1.01    427.5±0.97µs        ? ?/sec      1.00    424.3±0.68µs        ? ?/sec
world_get/50000_entities_sparse                          1.02    562.1±9.75µs        ? ?/sec      1.05    578.7±5.37µs        ? ?/sec      1.00   550.8±18.57µs        ? ?/sec
world_get/50000_entities_table                           1.05   911.2±15.13µs        ? ?/sec      1.08    943.0±8.83µs        ? ?/sec      1.00    869.4±2.27µs        ? ?/sec
world_query_for_each/50000_entities_sparse               1.00     84.5±0.66µs        ? ?/sec      1.00     84.1±2.73µs        ? ?/sec      1.14     96.1±2.58µs        ? ?/sec
world_query_for_each/50000_entities_table                1.00     27.2±0.14µs        ? ?/sec      1.00     27.2±0.13µs        ? ?/sec      1.00     27.2±0.12µs        ? ?/sec
world_query_get/50000_entities_sparse                    1.00    464.8±8.55µs        ? ?/sec      1.04    483.1±4.63µs        ? ?/sec      1.01   468.4±23.38µs        ? ?/sec
world_query_get/50000_entities_sparse_wide               1.06  1420.2±53.48µs        ? ?/sec      1.14  1518.6±28.91µs        ? ?/sec      1.00  1336.0±13.60µs        ? ?/sec
world_query_get/50000_entities_table                     1.54    402.2±3.20µs        ? ?/sec      1.68    437.5±4.37µs        ? ?/sec      1.00    260.9±7.64µs        ? ?/sec
world_query_get/50000_entities_table_wide                1.09    811.5±6.02µs        ? ?/sec      1.08    804.2±8.17µs        ? ?/sec      1.00   745.8±33.71µs        ? ?/sec
world_query_iter/50000_entities_sparse                   1.02     97.9±8.14µs        ? ?/sec      1.01     96.6±1.41µs        ? ?/sec      1.00     95.8±0.83µs        ? ?/sec
world_query_iter/50000_entities_table                    1.00     27.2±0.45µs        ? ?/sec      1.01     27.5±0.42µs        ? ?/sec      1.00     27.3±0.81µs        ? ?/sec
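
For anyone reading the table above: the leading number in each column appears to be that branch's mean time divided by the fastest mean in the same row, so 1.00 marks the fastest branch. The sketch below recomputes this for the query_get/50000_entities_table row; `relative_to_fastest` is a hypothetical helper written for this note, not part of any benchmarking tool.

```rust
/// Ratio of each mean to the fastest (smallest) mean in the row.
/// Hypothetical helper for illustration; assumes a non-empty row of positive means.
fn relative_to_fastest(means: &[f64]) -> Vec<f64> {
    let fastest = means.iter().copied().fold(f64::INFINITY, f64::min);
    means.iter().map(|m| m / fastest).collect()
}

fn main() {
    // Means in microseconds, copied from the query_get/50000_entities_table row.
    let branches = ["cleanup-fetch", "main", "more-cleanup"];
    let means = [577.9, 737.7, 460.1];

    for (name, ratio) in branches.iter().zip(relative_to_fastest(&means)) {
        println!("{name}: {ratio:.2}");
    }
}
```

For this row the recomputed ratios come out as 1.26, 1.60, and 1.00, matching the table. Ratios recomputed from the rounded means shown here can differ in the last digit from the printed ones, since the tool presumably works from unrounded measurements.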

james7132 avatar Jul 02 '22 10:07 james7132

I've given these changes a pretty thorough review (and I'm on board). I just merged #5205, so if you adapt to the flatter format / resolve conflicts, I'll merge this in short order.

cart avatar Aug 04 '22 21:08 cart

Another sanity check microbenchmark to ensure nothing since the rebase has affected the perf changes seen earlier:

group                                                    fetch-cleanup                            main
-----                                                    -------------                            ----
add_remove/sparse_set                                    1.08  1268.9±87.23µs        ? ?/sec      1.00  1176.5±66.08µs        ? ?/sec
add_remove/table                                         1.06  1579.0±11.37µs        ? ?/sec      1.00   1488.3±9.01µs        ? ?/sec
add_remove_big/sparse_set                                1.00  1454.4±304.11µs        ? ?/sec     1.00  1454.3±254.56µs        ? ?/sec
add_remove_big/table                                     1.00      2.8±0.05ms        ? ?/sec      1.02      2.8±0.03ms        ? ?/sec
added_archetypes/archetype_count/100                     1.00    133.0±5.77µs        ? ?/sec      1.08    144.1±6.04µs        ? ?/sec
added_archetypes/archetype_count/1000                    1.00   685.6±30.33µs        ? ?/sec      1.07   737.0±54.32µs        ? ?/sec
added_archetypes/archetype_count/10000                   1.00     11.1±0.99ms        ? ?/sec      1.09     12.0±1.47ms        ? ?/sec
added_archetypes/archetype_count/200                     1.00    202.5±8.23µs        ? ?/sec      1.04    210.4±8.46µs        ? ?/sec
added_archetypes/archetype_count/2000                    1.00  1301.8±13.39µs        ? ?/sec      1.04  1353.8±38.92µs        ? ?/sec
added_archetypes/archetype_count/500                     1.00   414.2±19.10µs        ? ?/sec      1.08   446.7±46.44µs        ? ?/sec
added_archetypes/archetype_count/5000                    1.00      3.6±0.23ms        ? ?/sec      1.03      3.7±0.29ms        ? ?/sec
busy_systems/01x_entities_03_systems                     1.00     32.3±1.66µs        ? ?/sec      1.03     33.3±1.11µs        ? ?/sec
busy_systems/01x_entities_06_systems                     1.00     63.9±1.63µs        ? ?/sec      1.20     76.9±5.06µs        ? ?/sec
busy_systems/01x_entities_09_systems                     1.00     94.7±2.91µs        ? ?/sec      1.02     96.8±2.30µs        ? ?/sec
busy_systems/01x_entities_12_systems                     1.00    128.1±6.51µs        ? ?/sec      1.00    128.2±4.45µs        ? ?/sec
busy_systems/01x_entities_15_systems                     1.00   165.3±11.14µs        ? ?/sec      1.02    168.6±5.11µs        ? ?/sec
busy_systems/02x_entities_03_systems                     1.01     62.6±2.75µs        ? ?/sec      1.00     61.9±3.22µs        ? ?/sec
busy_systems/02x_entities_06_systems                     1.00    110.7±2.89µs        ? ?/sec      1.08    119.2±5.08µs        ? ?/sec
busy_systems/02x_entities_09_systems                     1.00    172.9±8.28µs        ? ?/sec      1.13   196.1±13.30µs        ? ?/sec
busy_systems/02x_entities_12_systems                     1.00   233.6±13.24µs        ? ?/sec      1.04    241.8±6.45µs        ? ?/sec
busy_systems/02x_entities_15_systems                     1.00    292.1±7.97µs        ? ?/sec      1.14   333.2±28.27µs        ? ?/sec
busy_systems/03x_entities_03_systems                     1.00     85.3±6.08µs        ? ?/sec      1.10     93.5±5.50µs        ? ?/sec
busy_systems/03x_entities_06_systems                     1.08   188.8±20.73µs        ? ?/sec      1.00    175.1±8.34µs        ? ?/sec
busy_systems/03x_entities_09_systems                     1.00    244.2±8.30µs        ? ?/sec      1.04    255.1±6.97µs        ? ?/sec
busy_systems/03x_entities_12_systems                     1.00    316.7±6.50µs        ? ?/sec      1.14   361.2±23.45µs        ? ?/sec
busy_systems/03x_entities_15_systems                     1.00    390.8±9.45µs        ? ?/sec      1.09   427.5±24.81µs        ? ?/sec
busy_systems/04x_entities_03_systems                     1.00    103.6±2.70µs        ? ?/sec      1.07    111.0±6.01µs        ? ?/sec
busy_systems/04x_entities_06_systems                     1.05    217.9±8.40µs        ? ?/sec      1.00    206.7±3.61µs        ? ?/sec
busy_systems/04x_entities_09_systems                     1.00    322.9±8.09µs        ? ?/sec      1.02   328.4±13.50µs        ? ?/sec
busy_systems/04x_entities_12_systems                     1.00   421.1±10.97µs        ? ?/sec      1.04   437.5±16.77µs        ? ?/sec
busy_systems/04x_entities_15_systems                     1.00   538.2±26.41µs        ? ?/sec      1.03   554.1±17.64µs        ? ?/sec
busy_systems/05x_entities_03_systems                     1.00    152.0±5.04µs        ? ?/sec      1.01   153.1±13.03µs        ? ?/sec
busy_systems/05x_entities_06_systems                     1.06   313.4±13.09µs        ? ?/sec      1.00   296.0±14.11µs        ? ?/sec
busy_systems/05x_entities_09_systems                     1.16   501.8±26.77µs        ? ?/sec      1.00   430.8±15.88µs        ? ?/sec
busy_systems/05x_entities_12_systems                     1.15   638.7±34.35µs        ? ?/sec      1.00   554.3±14.60µs        ? ?/sec
busy_systems/05x_entities_15_systems                     1.08   751.6±19.56µs        ? ?/sec      1.00   696.6±30.10µs        ? ?/sec
contrived/01x_entities_03_systems                        1.12     20.9±0.76µs        ? ?/sec      1.00     18.6±1.04µs        ? ?/sec
contrived/01x_entities_06_systems                        1.00     36.9±1.71µs        ? ?/sec      1.12     41.1±3.16µs        ? ?/sec
contrived/01x_entities_09_systems                        1.00     57.7±2.12µs        ? ?/sec      1.07     61.8±4.89µs        ? ?/sec
contrived/01x_entities_12_systems                        1.00     77.7±3.41µs        ? ?/sec      1.00     77.9±4.83µs        ? ?/sec
contrived/01x_entities_15_systems                        1.03     95.8±4.90µs        ? ?/sec      1.00     93.1±6.13µs        ? ?/sec
contrived/02x_entities_03_systems                        1.01     37.6±2.77µs        ? ?/sec      1.00     37.4±3.59µs        ? ?/sec
contrived/02x_entities_06_systems                        1.01     65.4±3.02µs        ? ?/sec      1.00     64.9±4.52µs        ? ?/sec
contrived/02x_entities_09_systems                        1.00     90.4±2.87µs        ? ?/sec      1.05     94.8±5.57µs        ? ?/sec
contrived/02x_entities_12_systems                        1.07    120.9±7.75µs        ? ?/sec      1.00    112.8±1.95µs        ? ?/sec
contrived/02x_entities_15_systems                        1.00    149.8±9.17µs        ? ?/sec      1.03    153.8±9.61µs        ? ?/sec
contrived/03x_entities_03_systems                        1.00     41.6±2.99µs        ? ?/sec      1.12     46.6±4.03µs        ? ?/sec
contrived/03x_entities_06_systems                        1.06     93.9±8.80µs        ? ?/sec      1.00     88.5±5.52µs        ? ?/sec
contrived/03x_entities_09_systems                        1.00    119.9±5.72µs        ? ?/sec      1.06    127.5±5.72µs        ? ?/sec
contrived/03x_entities_12_systems                        1.02    175.7±9.57µs        ? ?/sec      1.00    172.0±8.75µs        ? ?/sec
contrived/03x_entities_15_systems                        1.00    194.6±7.28µs        ? ?/sec      1.10   213.2±11.33µs        ? ?/sec
contrived/04x_entities_03_systems                        1.00     49.3±1.82µs        ? ?/sec      1.11     54.8±3.03µs        ? ?/sec
contrived/04x_entities_06_systems                        1.00    106.0±9.04µs        ? ?/sec      1.02    107.8±5.70µs        ? ?/sec
contrived/04x_entities_09_systems                        1.00    154.2±9.32µs        ? ?/sec      1.06   162.9±10.30µs        ? ?/sec
contrived/04x_entities_12_systems                        1.00    201.2±9.26µs        ? ?/sec      1.05   212.1±10.46µs        ? ?/sec
contrived/04x_entities_15_systems                        1.00    249.2±9.01µs        ? ?/sec      1.07   266.0±13.07µs        ? ?/sec
contrived/05x_entities_03_systems                        1.00     59.9±1.77µs        ? ?/sec      1.11     66.8±5.35µs        ? ?/sec
contrived/05x_entities_06_systems                        1.00   130.6±11.41µs        ? ?/sec      1.08   141.2±12.92µs        ? ?/sec
contrived/05x_entities_09_systems                        1.00   194.0±14.56µs        ? ?/sec      1.08   208.7±19.26µs        ? ?/sec
contrived/05x_entities_12_systems                        1.00   252.7±16.08µs        ? ?/sec      1.01    256.3±9.52µs        ? ?/sec
contrived/05x_entities_15_systems                        1.00   306.5±14.65µs        ? ?/sec      1.08   330.4±16.47µs        ? ?/sec
get_or_spawn/batched                                     1.00   411.2±20.19µs        ? ?/sec      1.01   415.6±19.61µs        ? ?/sec
get_or_spawn/individual                                  1.01   925.7±72.21µs        ? ?/sec      1.00   915.3±89.30µs        ? ?/sec
heavy_compute/base                                       1.03    360.7±3.97µs        ? ?/sec      1.00    351.3±1.89µs        ? ?/sec
insert_commands/insert                                   1.00   802.9±34.72µs        ? ?/sec      1.04   832.4±76.71µs        ? ?/sec
insert_commands/insert_batch                             1.02   416.6±46.70µs        ? ?/sec      1.00   407.8±18.01µs        ? ?/sec
insert_simple/base                                       1.00    554.1±1.94µs        ? ?/sec      1.03    570.7±3.33µs        ? ?/sec
insert_simple/unbatched                                  1.00  1207.9±31.83µs        ? ?/sec      1.03  1242.6±16.31µs        ? ?/sec
iter_fragmented/base                                     1.00    344.4±5.64ns        ? ?/sec      1.39   477.6±25.28ns        ? ?/sec
iter_fragmented/foreach                                  1.01   246.2±26.31ns        ? ?/sec      1.00   243.4±23.85ns        ? ?/sec
iter_fragmented/foreach_wide                             1.01      3.9±0.24µs        ? ?/sec      1.00      3.9±0.10µs        ? ?/sec
iter_fragmented/wide                                     1.00      4.5±0.22µs        ? ?/sec      1.16      5.3±0.15µs        ? ?/sec
iter_fragmented_sparse/base                              1.00     10.6±0.60ns        ? ?/sec      1.09     11.6±0.98ns        ? ?/sec
iter_fragmented_sparse/foreach                           1.00      9.0±0.23ns        ? ?/sec      1.15     10.3±0.68ns        ? ?/sec
iter_fragmented_sparse/foreach_wide                      1.00     43.1±7.11ns        ? ?/sec      1.04    45.1±10.53ns        ? ?/sec
iter_fragmented_sparse/wide                              1.00    55.3±16.08ns        ? ?/sec      1.21     66.9±0.51ns        ? ?/sec
iter_simple/base                                         1.00     11.0±0.05µs        ? ?/sec      1.25     13.7±0.11µs        ? ?/sec
iter_simple/foreach                                      1.01     10.9±0.03µs        ? ?/sec      1.00     10.8±0.04µs        ? ?/sec
iter_simple/foreach_sparse_set                           1.00     42.4±0.24µs        ? ?/sec      1.13     47.9±0.21µs        ? ?/sec
iter_simple/foreach_wide                                 1.00     45.7±1.21µs        ? ?/sec      1.09     50.0±2.51µs        ? ?/sec
iter_simple/foreach_wide_sparse_set                      1.00    230.9±1.44µs        ? ?/sec      1.14    264.0±1.42µs        ? ?/sec
iter_simple/sparse_set                                   1.00     49.4±0.16µs        ? ?/sec      1.12     55.1±0.25µs        ? ?/sec
iter_simple/system                                       1.00     11.0±0.02µs        ? ?/sec      1.24     13.6±0.04µs        ? ?/sec
iter_simple/wide                                         1.00     59.8±0.67µs        ? ?/sec      1.12     66.9±0.50µs        ? ?/sec
iter_simple/wide_sparse_set                              1.00    232.9±0.87µs        ? ?/sec      1.19    277.0±1.00µs        ? ?/sec
query_get/50000_entities_sparse                          1.00   717.5±19.04µs        ? ?/sec      2.01  1440.5±31.43µs        ? ?/sec
query_get/50000_entities_table                           1.00    491.9±4.48µs        ? ?/sec      1.51   741.8±25.19µs        ? ?/sec
query_get_component/50000_entities_sparse                1.00  1163.5±24.56µs        ? ?/sec      1.01  1174.2±32.50µs        ? ?/sec
query_get_component/50000_entities_table                 1.02  1087.6±11.38µs        ? ?/sec      1.00  1069.5±24.82µs        ? ?/sec
query_get_component_simple/system                        1.00    752.6±4.44µs        ? ?/sec      1.04    786.1±6.48µs        ? ?/sec
query_get_component_simple/unchecked                     1.00    974.7±9.31µs        ? ?/sec      1.03  1000.7±51.16µs        ? ?/sec
run_criteria/no/001_systems                              1.00     93.4±0.45ns        ? ?/sec      1.02     95.0±0.25ns        ? ?/sec
run_criteria/no/006_systems                              1.03    175.1±1.06ns        ? ?/sec      1.00    169.6±0.82ns        ? ?/sec
run_criteria/no/011_systems                              1.02    259.5±0.67ns        ? ?/sec      1.00    254.8±1.08ns        ? ?/sec
run_criteria/no/016_systems                              1.02    337.8±0.84ns        ? ?/sec      1.00    331.6±1.39ns        ? ?/sec
run_criteria/no/021_systems                              1.05    429.1±2.19ns        ? ?/sec      1.00    410.2±1.76ns        ? ?/sec
run_criteria/no/026_systems                              1.03    506.0±2.04ns        ? ?/sec      1.00    488.9±2.74ns        ? ?/sec
run_criteria/no/031_systems                              1.06    605.8±3.28ns        ? ?/sec      1.00    573.4±2.03ns        ? ?/sec
run_criteria/no/036_systems                              1.03    708.6±2.56ns        ? ?/sec      1.00    685.7±1.56ns        ? ?/sec
run_criteria/no/041_systems                              1.03    785.3±4.47ns        ? ?/sec      1.00    764.5±1.32ns        ? ?/sec
run_criteria/no/046_systems                              1.07   927.1±10.20ns        ? ?/sec      1.00    868.2±2.36ns        ? ?/sec
run_criteria/no/051_systems                              1.06  1018.8±10.58ns        ? ?/sec      1.00    962.7±1.84ns        ? ?/sec
run_criteria/no/056_systems                              1.10   1141.4±6.27ns        ? ?/sec      1.00   1040.1±5.40ns        ? ?/sec
run_criteria/no/061_systems                              1.10   1259.0±8.06ns        ? ?/sec      1.00   1144.3±1.38ns        ? ?/sec
run_criteria/no/066_systems                              1.08  1337.4±11.57ns        ? ?/sec      1.00   1237.5±3.75ns        ? ?/sec
run_criteria/no/071_systems                              1.00  1412.3±11.45ns        ? ?/sec      1.00   1409.2±3.30ns        ? ?/sec
run_criteria/no/076_systems                              1.04  1490.5±15.69ns        ? ?/sec      1.00   1426.7±2.75ns        ? ?/sec
run_criteria/no/081_systems                              1.03  1572.9±11.18ns        ? ?/sec      1.00  1524.3±15.76ns        ? ?/sec
run_criteria/no/086_systems                              1.02  1645.0±15.40ns        ? ?/sec      1.00   1608.3±2.82ns        ? ?/sec
run_criteria/no/091_systems                              1.05  1766.6±26.71ns        ? ?/sec      1.00   1690.2±2.33ns        ? ?/sec
run_criteria/no/096_systems                              1.03  1824.9±24.77ns        ? ?/sec      1.00   1773.4±3.61ns        ? ?/sec
run_criteria/no/101_systems                              1.06  1958.2±18.92ns        ? ?/sec      1.00   1847.7±6.74ns        ? ?/sec
run_criteria/no_with_labels/001_systems                  1.00     91.0±0.49ns        ? ?/sec      1.00     91.1±0.19ns        ? ?/sec
run_criteria/no_with_labels/006_systems                  1.09    161.3±1.04ns        ? ?/sec      1.00    148.6±1.28ns        ? ?/sec
run_criteria/no_with_labels/011_systems                  1.09    227.6±2.47ns        ? ?/sec      1.00    208.2±1.64ns        ? ?/sec
run_criteria/no_with_labels/016_systems                  1.07    282.4±1.93ns        ? ?/sec      1.00    264.5±0.97ns        ? ?/sec
run_criteria/no_with_labels/021_systems                  1.09    344.6±1.79ns        ? ?/sec      1.00    317.4±1.06ns        ? ?/sec
run_criteria/no_with_labels/026_systems                  1.09    410.8±3.93ns        ? ?/sec      1.00    378.5±2.32ns        ? ?/sec
run_criteria/no_with_labels/031_systems                  1.07    475.1±4.42ns        ? ?/sec      1.00    444.0±3.90ns        ? ?/sec
run_criteria/no_with_labels/036_systems                  1.10    558.2±2.91ns        ? ?/sec      1.00    506.1±1.93ns        ? ?/sec
run_criteria/no_with_labels/041_systems                  1.07    600.8±2.73ns        ? ?/sec      1.00    561.3±1.96ns        ? ?/sec
run_criteria/no_with_labels/046_systems                  1.08    667.6±9.71ns        ? ?/sec      1.00    619.7±6.54ns        ? ?/sec
run_criteria/no_with_labels/051_systems                  1.08    728.5±6.24ns        ? ?/sec      1.00    671.9±6.71ns        ? ?/sec
run_criteria/no_with_labels/056_systems                  1.11    804.1±7.67ns        ? ?/sec      1.00    727.5±3.71ns        ? ?/sec
run_criteria/no_with_labels/061_systems                  1.11    871.1±8.84ns        ? ?/sec      1.00    786.9±2.64ns        ? ?/sec
run_criteria/no_with_labels/066_systems                  1.08    925.2±5.77ns        ? ?/sec      1.00    860.1±2.39ns        ? ?/sec
run_criteria/no_with_labels/071_systems                  1.07   995.8±12.16ns        ? ?/sec      1.00    930.2±6.50ns        ? ?/sec
run_criteria/no_with_labels/076_systems                  1.07   1057.9±6.51ns        ? ?/sec      1.00    986.2±9.76ns        ? ?/sec
run_criteria/no_with_labels/081_systems                  1.11   1156.6±7.87ns        ? ?/sec      1.00  1046.6±23.81ns        ? ?/sec
run_criteria/no_with_labels/086_systems                  1.11   1219.4±8.10ns        ? ?/sec      1.00   1103.3±6.99ns        ? ?/sec
run_criteria/no_with_labels/091_systems                  1.11   1279.2±4.54ns        ? ?/sec      1.00   1148.5±5.92ns        ? ?/sec
run_criteria/no_with_labels/096_systems                  1.12  1344.0±30.91ns        ? ?/sec      1.00   1198.0±4.55ns        ? ?/sec
run_criteria/no_with_labels/101_systems                  1.09  1391.9±65.04ns        ? ?/sec      1.00   1273.2±4.81ns        ? ?/sec
run_criteria/yes/001_systems                             1.00      4.8±0.11µs        ? ?/sec      1.08      5.2±0.04µs        ? ?/sec
run_criteria/yes/006_systems                             1.00      9.1±0.10µs        ? ?/sec      1.12     10.2±0.14µs        ? ?/sec
run_criteria/yes/011_systems                             1.00     13.5±1.22µs        ? ?/sec      1.07     14.4±0.86µs        ? ?/sec
run_criteria/yes/016_systems                             1.00     17.8±0.90µs        ? ?/sec      1.05     18.7±1.22µs        ? ?/sec
run_criteria/yes/021_systems                             1.00     20.8±1.32µs        ? ?/sec      1.10     23.0±1.46µs        ? ?/sec
run_criteria/yes/026_systems                             1.00     24.1±1.78µs        ? ?/sec      1.07     25.9±1.27µs        ? ?/sec
run_criteria/yes/031_systems                             1.00     26.7±1.39µs        ? ?/sec      1.11     29.6±1.57µs        ? ?/sec
run_criteria/yes/036_systems                             1.00     30.1±1.85µs        ? ?/sec      1.04     31.4±1.79µs        ? ?/sec
run_criteria/yes/041_systems                             1.00     34.3±1.31µs        ? ?/sec      1.04     35.8±2.16µs        ? ?/sec
run_criteria/yes/046_systems                             1.00     36.9±1.45µs        ? ?/sec      1.07     39.5±2.97µs        ? ?/sec
run_criteria/yes/051_systems                             1.00     40.4±2.12µs        ? ?/sec      1.08     43.7±1.72µs        ? ?/sec
run_criteria/yes/056_systems                             1.00     43.3±1.46µs        ? ?/sec      1.07     46.2±1.67µs        ? ?/sec
run_criteria/yes/061_systems                             1.00     46.5±2.16µs        ? ?/sec      1.02     47.5±2.39µs        ? ?/sec
run_criteria/yes/066_systems                             1.00     48.4±2.68µs        ? ?/sec      1.08     52.3±2.00µs        ? ?/sec
run_criteria/yes/071_systems                             1.00     53.6±4.02µs        ? ?/sec      1.04     56.0±1.99µs        ? ?/sec
run_criteria/yes/076_systems                             1.00     56.1±2.53µs        ? ?/sec      1.06     59.4±2.35µs        ? ?/sec
run_criteria/yes/081_systems                             1.00     60.1±3.47µs        ? ?/sec      1.03     62.0±2.33µs        ? ?/sec
run_criteria/yes/086_systems                             1.00     63.1±3.02µs        ? ?/sec      1.07     67.8±2.47µs        ? ?/sec
run_criteria/yes/091_systems                             1.00     67.4±2.30µs        ? ?/sec      1.09     73.3±2.98µs        ? ?/sec
run_criteria/yes/096_systems                             1.00     74.3±3.02µs        ? ?/sec      1.06     79.0±2.76µs        ? ?/sec
run_criteria/yes/101_systems                             1.00     79.2±3.62µs        ? ?/sec      1.08     85.2±3.33µs        ? ?/sec
run_criteria/yes_using_query/001_systems                 1.00      4.7±0.12µs        ? ?/sec      1.03      4.8±0.27µs        ? ?/sec
run_criteria/yes_using_query/006_systems                 1.00      8.9±0.15µs        ? ?/sec      1.13     10.1±0.15µs        ? ?/sec
run_criteria/yes_using_query/011_systems                 1.00     13.6±0.56µs        ? ?/sec      1.11     15.1±0.44µs        ? ?/sec
run_criteria/yes_using_query/016_systems                 1.00     17.8±0.65µs        ? ?/sec      1.10     19.5±1.13µs        ? ?/sec
run_criteria/yes_using_query/021_systems                 1.00     21.6±1.19µs        ? ?/sec      1.10     23.7±1.54µs        ? ?/sec
run_criteria/yes_using_query/026_systems                 1.00     24.8±1.14µs        ? ?/sec      1.07     26.5±1.95µs        ? ?/sec
run_criteria/yes_using_query/031_systems                 1.00     28.2±1.16µs        ? ?/sec      1.05     29.5±1.74µs        ? ?/sec
run_criteria/yes_using_query/036_systems                 1.00     31.6±1.48µs        ? ?/sec      1.06     33.5±2.08µs        ? ?/sec
run_criteria/yes_using_query/041_systems                 1.00     34.1±1.46µs        ? ?/sec      1.03     35.3±2.08µs        ? ?/sec
run_criteria/yes_using_query/046_systems                 1.00     38.2±1.81µs        ? ?/sec      1.05     40.2±2.76µs        ? ?/sec
run_criteria/yes_using_query/051_systems                 1.00     41.0±1.54µs        ? ?/sec      1.00     41.1±2.83µs        ? ?/sec
run_criteria/yes_using_query/056_systems                 1.00     44.1±1.77µs        ? ?/sec      1.02     45.1±2.23µs        ? ?/sec
run_criteria/yes_using_query/061_systems                 1.00     47.3±2.22µs        ? ?/sec      1.02     48.1±2.82µs        ? ?/sec
run_criteria/yes_using_query/066_systems                 1.00     49.6±2.23µs        ? ?/sec      1.03     51.1±2.02µs        ? ?/sec
run_criteria/yes_using_query/071_systems                 1.04     55.9±2.61µs        ? ?/sec      1.00     53.6±2.70µs        ? ?/sec
run_criteria/yes_using_query/076_systems                 1.00     58.1±2.40µs        ? ?/sec      1.00     58.4±3.93µs        ? ?/sec
run_criteria/yes_using_query/081_systems                 1.00     64.6±2.78µs        ? ?/sec      1.03     66.3±2.45µs        ? ?/sec
run_criteria/yes_using_query/086_systems                 1.00     67.7±3.38µs        ? ?/sec      1.04     70.4±2.84µs        ? ?/sec
run_criteria/yes_using_query/091_systems                 1.00     70.8±2.75µs        ? ?/sec      1.07     75.6±4.41µs        ? ?/sec
run_criteria/yes_using_query/096_systems                 1.00     76.5±3.67µs        ? ?/sec      1.05     80.0±4.03µs        ? ?/sec
run_criteria/yes_using_query/101_systems                 1.00     82.8±2.71µs        ? ?/sec      1.06     87.4±3.20µs        ? ?/sec
run_criteria/yes_using_resource/001_systems              1.00      4.7±0.12µs        ? ?/sec      1.01      4.7±0.24µs        ? ?/sec
run_criteria/yes_using_resource/006_systems              1.00      9.0±0.27µs        ? ?/sec      1.09      9.8±0.19µs        ? ?/sec
run_criteria/yes_using_resource/011_systems              1.00     14.0±0.50µs        ? ?/sec      1.07     15.0±0.76µs        ? ?/sec
run_criteria/yes_using_resource/016_systems              1.00     18.7±1.30µs        ? ?/sec      1.02     19.0±1.09µs        ? ?/sec
run_criteria/yes_using_resource/021_systems              1.00     21.7±1.23µs        ? ?/sec      1.05     22.7±1.29µs        ? ?/sec
run_criteria/yes_using_resource/026_systems              1.00     26.3±1.81µs        ? ?/sec      1.01     26.7±1.00µs        ? ?/sec
run_criteria/yes_using_resource/031_systems              1.00     27.8±1.44µs        ? ?/sec      1.07     29.8±1.15µs        ? ?/sec
run_criteria/yes_using_resource/036_systems              1.00     31.2±2.00µs        ? ?/sec      1.06     33.1±1.41µs        ? ?/sec
run_criteria/yes_using_resource/041_systems              1.00     34.6±2.49µs        ? ?/sec      1.09     37.5±1.72µs        ? ?/sec
run_criteria/yes_using_resource/046_systems              1.00     37.1±1.88µs        ? ?/sec      1.10     40.8±1.43µs        ? ?/sec
run_criteria/yes_using_resource/051_systems              1.00     41.3±2.98µs        ? ?/sec      1.10     45.4±2.14µs        ? ?/sec
run_criteria/yes_using_resource/056_systems              1.00     44.6±3.48µs        ? ?/sec      1.11     49.6±2.84µs        ? ?/sec
run_criteria/yes_using_resource/061_systems              1.00     47.4±3.13µs        ? ?/sec      1.09     51.8±2.21µs        ? ?/sec
run_criteria/yes_using_resource/066_systems              1.00     50.5±2.81µs        ? ?/sec      1.11     56.3±3.21µs        ? ?/sec
run_criteria/yes_using_resource/071_systems              1.00     53.2±2.89µs        ? ?/sec      1.12     59.6±2.23µs        ? ?/sec
run_criteria/yes_using_resource/076_systems              1.00     55.7±2.77µs        ? ?/sec      1.11     61.6±2.58µs        ? ?/sec
run_criteria/yes_using_resource/081_systems              1.00     57.0±2.42µs        ? ?/sec      1.11     63.6±2.15µs        ? ?/sec
run_criteria/yes_using_resource/086_systems              1.00     66.9±3.37µs        ? ?/sec      1.02     68.3±3.61µs        ? ?/sec
run_criteria/yes_using_resource/091_systems              1.00     70.2±3.03µs        ? ?/sec      1.06     74.1±3.17µs        ? ?/sec
run_criteria/yes_using_resource/096_systems              1.00     75.6±4.54µs        ? ?/sec      1.09     82.4±4.85µs        ? ?/sec
run_criteria/yes_using_resource/101_systems              1.00     79.0±3.44µs        ? ?/sec      1.14     90.2±3.63µs        ? ?/sec
sized_commands_0_bytes/2000_commands                     1.00      4.5±0.03µs        ? ?/sec      1.12      5.1±0.01µs        ? ?/sec
sized_commands_0_bytes/4000_commands                     1.00      9.1±0.04µs        ? ?/sec      1.11     10.2±0.04µs        ? ?/sec
sized_commands_0_bytes/6000_commands                     1.00     13.7±0.07µs        ? ?/sec      1.12     15.3±0.05µs        ? ?/sec
sized_commands_0_bytes/8000_commands                     1.00     18.3±0.08µs        ? ?/sec      1.11     20.3±0.06µs        ? ?/sec
sized_commands_12_bytes/2000_commands                    1.00      7.1±0.05µs        ? ?/sec      1.03      7.3±0.02µs        ? ?/sec
sized_commands_12_bytes/4000_commands                    1.00     14.5±0.10µs        ? ?/sec      1.01     14.6±0.05µs        ? ?/sec
sized_commands_12_bytes/6000_commands                    1.00     21.9±0.18µs        ? ?/sec      1.01     22.1±0.09µs        ? ?/sec
sized_commands_12_bytes/8000_commands                    1.00     29.3±0.24µs        ? ?/sec      1.00     29.4±0.17µs        ? ?/sec
sized_commands_512_bytes/2000_commands                   1.00    103.0±3.57µs        ? ?/sec      1.08    110.7±3.50µs        ? ?/sec
sized_commands_512_bytes/4000_commands                   1.00   212.1±16.07µs        ? ?/sec      1.06   224.7±14.64µs        ? ?/sec
sized_commands_512_bytes/6000_commands                   1.00   321.8±35.43µs        ? ?/sec      1.07   345.2±38.65µs        ? ?/sec
sized_commands_512_bytes/8000_commands                   1.00   434.8±61.40µs        ? ?/sec      1.07   466.5±67.14µs        ? ?/sec
spawn_commands/2000_entities                             1.00    227.5±9.97µs        ? ?/sec      1.03    234.2±6.67µs        ? ?/sec
spawn_commands/4000_entities                             1.00   457.1±18.53µs        ? ?/sec      1.03   472.1±12.59µs        ? ?/sec
spawn_commands/6000_entities                             1.00   713.0±24.84µs        ? ?/sec      1.01   721.5±27.41µs        ? ?/sec
spawn_commands/8000_entities                             1.00   918.9±30.62µs        ? ?/sec      1.02   939.5±24.01µs        ? ?/sec
spawn_world/10000_entities                               1.00  1218.2±92.80µs        ? ?/sec      1.00  1215.1±78.94µs        ? ?/sec
spawn_world/1000_entities                                1.00    121.7±9.06µs        ? ?/sec      1.01    122.4±8.85µs        ? ?/sec
spawn_world/100_entities                                 1.00     12.3±0.89µs        ? ?/sec      1.00     12.3±0.91µs        ? ?/sec
spawn_world/10_entities                                  1.00  1211.1±96.87ns        ? ?/sec      1.01  1218.9±98.09ns        ? ?/sec
spawn_world/1_entities                                   1.01    123.1±9.41ns        ? ?/sec      1.00    122.1±8.85ns        ? ?/sec
world_entity/50000_entities                              1.00    424.7±1.31µs        ? ?/sec      1.00    424.2±0.90µs        ? ?/sec
world_get/50000_entities_sparse                          1.03    580.2±3.38µs        ? ?/sec      1.00    561.1±2.13µs        ? ?/sec
world_get/50000_entities_table                           1.02   938.6±13.42µs        ? ?/sec      1.00    916.6±6.70µs        ? ?/sec
world_query_for_each/50000_entities_sparse               1.01     84.6±0.64µs        ? ?/sec      1.00     83.6±0.23µs        ? ?/sec
world_query_for_each/50000_entities_table                1.01     27.3±0.11µs        ? ?/sec      1.00     27.1±0.05µs        ? ?/sec
world_query_get/50000_entities_sparse                    1.00    457.7±4.91µs        ? ?/sec      1.02    464.6±1.64µs        ? ?/sec
world_query_get/50000_entities_sparse_wide               1.00   1413.0±8.89µs        ? ?/sec      1.09  1536.3±69.11µs        ? ?/sec
world_query_get/50000_entities_table                     1.00    260.7±1.41µs        ? ?/sec      1.62    422.1±1.06µs        ? ?/sec
world_query_get/50000_entities_table_wide                1.00    806.9±3.47µs        ? ?/sec      1.05   844.3±22.03µs        ? ?/sec
world_query_iter/50000_entities_sparse                   1.02     97.5±0.60µs        ? ?/sec      1.00     95.6±1.67µs        ? ?/sec
world_query_iter/50000_entities_table                    1.01     27.4±0.09µs        ? ?/sec      1.00     27.1±0.07µs        ? ?/sec
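
As a reading aid for the table above: the ratio printed before each timing appears to be that side's mean time divided by the faster of the two means in the row, so the faster side always reads 1.00. The sketch below is an assumption about how to reproduce those ratios and is not taken from the benchmark tool's source; `relative_ratios` is a hypothetical helper name. The sample rows copy mean times from the table, and both values in each row are in µs.

```rust
/// Hypothetical helper: divide each mean time by the faster of the two,
/// so the faster side reads 1.00 and the slower side shows the slowdown.
/// Both inputs must be in the same unit.
fn relative_ratios(left: f64, right: f64) -> (f64, f64) {
    let fastest = left.min(right);
    (left / fastest, right / fastest)
}

fn main() {
    // (benchmark name, left mean, right mean), copied from the table above.
    let rows = [
        ("query_get/50000_entities_sparse", 717.5, 1440.5),
        ("query_get/50000_entities_table", 491.9, 741.8),
        ("world_query_get/50000_entities_table", 260.7, 422.1),
    ];

    for &(name, left, right) in rows.iter() {
        let (l, r) = relative_ratios(left, right);
        // Two decimal places, matching the precision used in the table.
        println!("{:<45} {:.2} {:.2}", name, l, r);
    }
}
```

The ± figures are ignored here, so a ratio close to 1.00 on a row with a wide spread (for example the `busy_systems` rows) may be within noise.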

james7132 avatar Oct 25 '22 01:10 james7132

My trait query benchmarks look promising when I update to this branch.

All<> - 1 match        66.371 µs     +5.2565%
All<> - 2 matches      99.637 µs     -6.4798%
All<> - 1-2 matches    85.095 µs     +4.4459%
One<>                  28.772 µs     -11.916%
One<> - filtering      15.342 µs     -10.749%

joseph-gio avatar Oct 25 '22 11:10 joseph-gio

bors r+

cart avatar Oct 28 '22 09:10 cart