bevy
Clean up Fetch code
Objective
Clean up code surrounding fetch by pulling out the common parts into the iteration code.
Solution
Merge Fetch::table_fetch and Fetch::archetype_fetch into a single API: Fetch::fetch(&mut self, entity: &Entity, table_row: &usize). This provides everything any fetch requires to internally decide which storage to read from and get the underlying data. All of these functions are marked as #[inline(always)] and the arguments are passed as references to attempt to optimize out the argument that isn't being used.
External to Fetch, Query iteration has been changed to keep track of the table row and entity outside of fetch, which moves a lot of the expensive bookkeeping Fetch structs had previously done internally into the outer loop.
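To make the shape of the merged API concrete, here is a minimal, self-contained sketch. It is not Bevy's actual code: `Entity`, `TableFetch`, `SparseFetch`, and `collect` are simplified, hypothetical stand-ins, and a `HashMap` stands in for a sparse set. It only illustrates the idea described above: the outer loop tracks the entity and the table row, and each fetch uses whichever of the two it needs.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an entity ID (not Bevy's real `Entity`).
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct Entity(u32);

/// Simplified version of the merged API: one entry point, both keys supplied.
trait Fetch {
    type Item;
    fn fetch(&mut self, entity: &Entity, table_row: &usize) -> Self::Item;
}

/// Reads from dense, row-indexed storage and ignores `entity`.
struct TableFetch<'a> {
    column: &'a [f32],
}

impl<'a> Fetch for TableFetch<'a> {
    type Item = f32;

    #[inline(always)]
    fn fetch(&mut self, _entity: &Entity, table_row: &usize) -> f32 {
        self.column[*table_row]
    }
}

/// Reads from entity-keyed storage and ignores `table_row`.
struct SparseFetch<'a> {
    values: &'a HashMap<Entity, f32>,
}

impl<'a> Fetch for SparseFetch<'a> {
    type Item = f32;

    #[inline(always)]
    fn fetch(&mut self, entity: &Entity, _table_row: &usize) -> f32 {
        self.values[entity]
    }
}

/// The outer loop owns the (entity, row) bookkeeping and hands both to `fetch`.
fn collect<F: Fetch>(fetch: &mut F, entities: &[Entity]) -> Vec<F::Item> {
    entities
        .iter()
        .enumerate()
        .map(|(row, entity)| fetch.fetch(entity, &row))
        .collect()
}

fn main() {
    let entities = [Entity(7), Entity(3)];
    let column = [1.0, 2.0];
    let mut sparse = HashMap::new();
    sparse.insert(Entity(7), 10.0);
    sparse.insert(Entity(3), 20.0);

    println!("{:?}", collect(&mut TableFetch { column: &column }, &entities));
    println!("{:?}", collect(&mut SparseFetch { values: &sparse }, &entities));
}
```

In this sketch each fetch leaves one of its two arguments unused, which is the situation the `#[inline(always)]` annotations are meant to let the compiler exploit.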
~~TODO: Benchmark, docs~~ Done.
Changelog
Changed: Fetch::table_fetch and Fetch::archetype_fetch have been merged into a single Fetch::fetch function.
Migration Guide
TODO
Did a quick round of benchmarks. Generally looks to be unchanged, though there are some regressions, particularly with Query::iter. I have an idea on how to address it. The main change to sparse iteration is the addition of two slice::get_unchecked calls versus the one before. It may be better to collocate the Entity and table indexes together to get better cache behavior when hitting this section.
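As a rough illustration of the colocation idea, the sketch below contrasts two parallel arrays with a single array of pairs. All type names here (`Entity`, `SplitArchetype`, `ArchetypeEntity`, `PackedArchetype`) are hypothetical and only for illustration, and it uses safe indexing rather than `slice::get_unchecked`.

```rust
/// Hypothetical stand-in for an entity ID.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Entity(u32);

/// Two parallel arrays: reading one element touches two separate allocations.
struct SplitArchetype {
    entities: Vec<Entity>,
    table_rows: Vec<usize>,
}

/// One element of the packed layout: the entity and its table row side by side.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ArchetypeEntity {
    entity: Entity,
    table_row: usize,
}

/// A single array of pairs: one lookup yields both values.
struct PackedArchetype {
    entities: Vec<ArchetypeEntity>,
}

impl SplitArchetype {
    /// Two slice accesses per element.
    fn get(&self, index: usize) -> (Entity, usize) {
        (self.entities[index], self.table_rows[index])
    }

    /// Build the packed layout from the two parallel arrays.
    fn pack(&self) -> PackedArchetype {
        let entities = self
            .entities
            .iter()
            .zip(&self.table_rows)
            .map(|(&entity, &table_row)| ArchetypeEntity { entity, table_row })
            .collect();
        PackedArchetype { entities }
    }
}

impl PackedArchetype {
    /// One slice access per element.
    fn get(&self, index: usize) -> (Entity, usize) {
        let pair = self.entities[index];
        (pair.entity, pair.table_row)
    }
}

fn main() {
    let split = SplitArchetype {
        entities: vec![Entity(5), Entity(9)],
        table_rows: vec![1, 0],
    };
    let packed = split.pack();
    println!("{:?} {:?}", split.get(1), packed.get(1));
}
```

Both layouts return the same data. The difference is only in how many separate memory locations a single lookup has to touch, which is where any cache benefit would come from.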
The other thing to point out is the giant speedup in Query::get performance, which makes sense since the FetchState::set_archetype calls for sparse components are effectively no-ops now. A similarly significant, though relatively smaller, speedup can be seen in the table benchmark for it as well. We should verify this with additional tests, as something like this should positively affect engine systems like transform propagation (assuming that's not dominated by memory bandwidth).
Bar the aforementioned regressions, assuming the benchmarks here are consistent, this seems like a workable change.
group fetch-cleanup main
----- ------------- ----
busy_systems/01x_entities_03_systems 1.09 36.3±1.46µs ? ?/sec 1.00 33.4±1.57µs ? ?/sec
busy_systems/01x_entities_06_systems 1.05 69.6±3.26µs ? ?/sec 1.00 66.5±3.70µs ? ?/sec
busy_systems/01x_entities_09_systems 1.03 101.5±6.33µs ? ?/sec 1.00 98.9±3.30µs ? ?/sec
busy_systems/01x_entities_12_systems 1.07 133.5±5.44µs ? ?/sec 1.00 124.7±6.52µs ? ?/sec
busy_systems/01x_entities_15_systems 1.04 161.6±7.41µs ? ?/sec 1.00 155.4±5.34µs ? ?/sec
busy_systems/02x_entities_03_systems 1.00 60.6±2.70µs ? ?/sec 1.05 63.4±2.43µs ? ?/sec
busy_systems/02x_entities_06_systems 1.00 117.1±7.05µs ? ?/sec 1.06 124.4±6.40µs ? ?/sec
busy_systems/02x_entities_09_systems 1.00 173.6±10.64µs ? ?/sec 1.04 180.7±7.01µs ? ?/sec
busy_systems/02x_entities_12_systems 1.00 220.4±10.73µs ? ?/sec 1.10 242.1±11.81µs ? ?/sec
busy_systems/02x_entities_15_systems 1.00 275.5±9.13µs ? ?/sec 1.06 292.4±10.19µs ? ?/sec
busy_systems/03x_entities_03_systems 1.06 91.4±5.00µs ? ?/sec 1.00 86.3±5.73µs ? ?/sec
busy_systems/03x_entities_06_systems 1.04 175.5±8.39µs ? ?/sec 1.00 169.0±8.28µs ? ?/sec
busy_systems/03x_entities_09_systems 1.04 250.4±8.50µs ? ?/sec 1.00 239.8±11.20µs ? ?/sec
busy_systems/03x_entities_12_systems 1.00 320.7±9.24µs ? ?/sec 1.03 329.2±16.04µs ? ?/sec
busy_systems/03x_entities_15_systems 1.00 407.7±11.33µs ? ?/sec 1.00 408.0±12.54µs ? ?/sec
busy_systems/04x_entities_03_systems 1.04 115.8±5.11µs ? ?/sec 1.00 111.7±5.41µs ? ?/sec
busy_systems/04x_entities_06_systems 1.00 211.5±7.87µs ? ?/sec 1.03 218.6±10.46µs ? ?/sec
busy_systems/04x_entities_09_systems 1.00 329.6±19.01µs ? ?/sec 1.00 330.4±16.70µs ? ?/sec
busy_systems/04x_entities_12_systems 1.03 428.7±11.81µs ? ?/sec 1.00 416.4±14.26µs ? ?/sec
busy_systems/04x_entities_15_systems 1.01 529.4±12.77µs ? ?/sec 1.00 525.4±16.45µs ? ?/sec
busy_systems/05x_entities_03_systems 1.00 136.4±5.72µs ? ?/sec 1.15 156.7±15.90µs ? ?/sec
busy_systems/05x_entities_06_systems 1.00 266.8±12.08µs ? ?/sec 1.07 286.1±11.79µs ? ?/sec
busy_systems/05x_entities_09_systems 1.00 402.7±13.92µs ? ?/sec 1.06 426.6±20.73µs ? ?/sec
busy_systems/05x_entities_12_systems 1.00 533.6±21.50µs ? ?/sec 1.05 560.0±17.33µs ? ?/sec
busy_systems/05x_entities_15_systems 1.00 674.8±27.99µs ? ?/sec 1.04 700.7±44.73µs ? ?/sec
contrived/01x_entities_03_systems 1.30 27.3±2.90µs ? ?/sec 1.00 21.1±1.24µs ? ?/sec
contrived/01x_entities_06_systems 1.03 42.8±3.04µs ? ?/sec 1.00 41.5±1.58µs ? ?/sec
contrived/01x_entities_09_systems 1.01 61.4±4.00µs ? ?/sec 1.00 60.9±3.89µs ? ?/sec
contrived/01x_entities_12_systems 1.01 81.2±4.80µs ? ?/sec 1.00 80.7±3.33µs ? ?/sec
contrived/01x_entities_15_systems 1.00 98.1±5.48µs ? ?/sec 1.02 99.9±5.96µs ? ?/sec
contrived/02x_entities_03_systems 1.08 33.9±2.63µs ? ?/sec 1.00 31.5±1.46µs ? ?/sec
contrived/02x_entities_06_systems 1.00 60.6±2.18µs ? ?/sec 1.05 63.4±2.95µs ? ?/sec
contrived/02x_entities_09_systems 1.00 92.0±5.89µs ? ?/sec 1.00 91.6±2.73µs ? ?/sec
contrived/02x_entities_12_systems 1.06 128.5±11.27µs ? ?/sec 1.00 121.3±3.12µs ? ?/sec
contrived/02x_entities_15_systems 1.00 151.2±7.89µs ? ?/sec 1.02 153.5±7.67µs ? ?/sec
contrived/03x_entities_03_systems 1.02 43.9±2.27µs ? ?/sec 1.00 43.2±1.46µs ? ?/sec
contrived/03x_entities_06_systems 1.05 86.7±5.83µs ? ?/sec 1.00 82.3±4.09µs ? ?/sec
contrived/03x_entities_09_systems 1.01 125.5±8.38µs ? ?/sec 1.00 124.7±5.31µs ? ?/sec
contrived/03x_entities_12_systems 1.00 160.4±3.97µs ? ?/sec 1.02 164.0±4.07µs ? ?/sec
contrived/03x_entities_15_systems 1.03 208.9±12.52µs ? ?/sec 1.00 202.8±8.56µs ? ?/sec
contrived/04x_entities_03_systems 1.02 54.5±2.76µs ? ?/sec 1.00 53.1±4.04µs ? ?/sec
contrived/04x_entities_06_systems 1.03 106.4±6.15µs ? ?/sec 1.00 103.6±5.81µs ? ?/sec
contrived/04x_entities_09_systems 1.03 160.1±10.74µs ? ?/sec 1.00 154.7±5.95µs ? ?/sec
contrived/04x_entities_12_systems 1.00 205.8±9.71µs ? ?/sec 1.00 205.2±8.46µs ? ?/sec
contrived/04x_entities_15_systems 1.01 251.5±9.87µs ? ?/sec 1.00 248.4±6.81µs ? ?/sec
contrived/05x_entities_03_systems 1.00 62.7±3.15µs ? ?/sec 1.01 63.4±3.52µs ? ?/sec
contrived/05x_entities_06_systems 1.00 126.9±5.49µs ? ?/sec 1.00 127.2±6.37µs ? ?/sec
contrived/05x_entities_09_systems 1.00 179.5±5.15µs ? ?/sec 1.05 187.5±6.80µs ? ?/sec
contrived/05x_entities_12_systems 1.00 244.4±7.21µs ? ?/sec 1.04 254.6±15.85µs ? ?/sec
contrived/05x_entities_15_systems 1.01 312.3±9.83µs ? ?/sec 1.00 310.5±10.72µs ? ?/sec
fragmented_iter/base 1.17 478.1±5.89ns ? ?/sec 1.00 410.1±18.16ns ? ?/sec
fragmented_iter/foreach 1.00 236.3±25.69ns ? ?/sec 1.02 241.6±29.92ns ? ?/sec
heavy_compute/base 1.00 355.8±4.53µs ? ?/sec 1.02 364.5±5.74µs ? ?/sec
query_get/50000_entities_sparse 1.00 589.0±31.37µs ? ?/sec 1.91 1127.2±55.93µs ? ?/sec
query_get/50000_entities_table 1.00 457.1±27.07µs ? ?/sec 1.32 601.6±12.68µs ? ?/sec
query_get_component/50000_entities_sparse 1.00 1244.8±50.69µs ? ?/sec 1.03 1287.5±52.23µs ? ?/sec
query_get_component/50000_entities_table 1.01 1247.2±108.41µs ? ?/sec 1.00 1236.8±21.18µs ? ?/sec
simple_iter/base 1.01 13.9±0.74µs ? ?/sec 1.00 13.7±0.17µs ? ?/sec
simple_iter/foreach 1.00 11.6±0.12µs ? ?/sec 1.00 11.6±0.18µs ? ?/sec
simple_iter/sparse 1.00 52.0±0.22µs ? ?/sec 1.18 61.3±0.32µs ? ?/sec
simple_iter/sparse_foreach 1.00 45.2±0.19µs ? ?/sec 1.12 50.4±0.76µs ? ?/sec
simple_iter/system 1.00 13.7±0.29µs ? ?/sec 1.01 13.8±0.49µs ? ?/sec
sparse_fragmented_iter/base 1.00 10.9±0.24ns ? ?/sec 1.18 12.8±0.86ns ? ?/sec
sparse_fragmented_iter/foreach 1.00 8.9±0.22ns ? ?/sec 1.00 8.9±0.14ns ? ?/sec
world_query_for_each/50000_entities_sparse 1.03 99.0±1.47µs ? ?/sec 1.00 95.8±0.91µs ? ?/sec
world_query_for_each/50000_entities_table 1.00 27.2±0.24µs ? ?/sec 1.00 27.2±0.10µs ? ?/sec
world_query_get/50000_entities_sparse 1.20 478.6±11.07µs ? ?/sec 1.00 398.2±10.82µs ? ?/sec
world_query_get/50000_entities_table 1.00 274.3±4.77µs ? ?/sec 1.00 273.4±4.26µs ? ?/sec
world_query_iter/50000_entities_sparse 1.12 114.9±0.65µs ? ?/sec 1.00 102.8±3.31µs ? ?/sec
world_query_iter/50000_entities_table 1.00 27.3±0.78µs ? ?/sec 1.00 27.2±0.26µs ? ?/sec
I'd consider taking those changes to the performance characteristics as is. Query::get is in the hot path for a lot of things too, and those are awesome improvements.
That said, I'm excited to see how your mitigation ideas work.
Attempted to merge the entities and rows into one Vec to make it easier for sparse iteration. It seems to address the sparse iteration issues.
group fetch-cleanup fetch-cleanup-with-archetype-entity main
----- ------------- ----------------------------------- ----
busy_systems/01x_entities_03_systems 1.09 36.3±1.46µs ? ?/sec 1.14 38.0±1.67µs ? ?/sec 1.00 33.4±1.57µs ? ?/sec
busy_systems/01x_entities_06_systems 1.05 69.6±3.26µs ? ?/sec 1.17 78.0±5.66µs ? ?/sec 1.00 66.5±3.70µs ? ?/sec
busy_systems/01x_entities_09_systems 1.03 101.5±6.33µs ? ?/sec 1.10 108.4±5.60µs ? ?/sec 1.00 98.9±3.30µs ? ?/sec
busy_systems/01x_entities_12_systems 1.07 133.5±5.44µs ? ?/sec 1.16 145.0±10.70µs ? ?/sec 1.00 124.7±6.52µs ? ?/sec
busy_systems/01x_entities_15_systems 1.04 161.6±7.41µs ? ?/sec 1.17 181.1±11.18µs ? ?/sec 1.00 155.4±5.34µs ? ?/sec
busy_systems/02x_entities_03_systems 1.03 60.6±2.70µs ? ?/sec 1.00 59.1±2.39µs ? ?/sec 1.07 63.4±2.43µs ? ?/sec
busy_systems/02x_entities_06_systems 1.00 117.1±7.05µs ? ?/sec 1.01 117.8±6.11µs ? ?/sec 1.06 124.4±6.40µs ? ?/sec
busy_systems/02x_entities_09_systems 1.00 173.6±10.64µs ? ?/sec 1.00 173.9±7.70µs ? ?/sec 1.04 180.7±7.01µs ? ?/sec
busy_systems/02x_entities_12_systems 1.00 220.4±10.73µs ? ?/sec 1.05 230.5±11.22µs ? ?/sec 1.10 242.1±11.81µs ? ?/sec
busy_systems/02x_entities_15_systems 1.00 275.5±9.13µs ? ?/sec 1.03 284.8±15.50µs ? ?/sec 1.06 292.4±10.19µs ? ?/sec
busy_systems/03x_entities_03_systems 1.09 91.4±5.00µs ? ?/sec 1.00 83.7±3.08µs ? ?/sec 1.03 86.3±5.73µs ? ?/sec
busy_systems/03x_entities_06_systems 1.08 175.5±8.39µs ? ?/sec 1.00 162.6±5.64µs ? ?/sec 1.04 169.0±8.28µs ? ?/sec
busy_systems/03x_entities_09_systems 1.04 250.4±8.50µs ? ?/sec 1.02 244.2±9.76µs ? ?/sec 1.00 239.8±11.20µs ? ?/sec
busy_systems/03x_entities_12_systems 1.00 320.7±9.24µs ? ?/sec 1.02 327.5±18.62µs ? ?/sec 1.03 329.2±16.04µs ? ?/sec
busy_systems/03x_entities_15_systems 1.01 407.7±11.33µs ? ?/sec 1.00 404.9±16.51µs ? ?/sec 1.01 408.0±12.54µs ? ?/sec
busy_systems/04x_entities_03_systems 1.04 115.8±5.11µs ? ?/sec 1.06 118.9±11.55µs ? ?/sec 1.00 111.7±5.41µs ? ?/sec
busy_systems/04x_entities_06_systems 1.00 211.5±7.87µs ? ?/sec 1.02 214.7±12.11µs ? ?/sec 1.03 218.6±10.46µs ? ?/sec
busy_systems/04x_entities_09_systems 1.04 329.6±19.01µs ? ?/sec 1.00 317.0±20.76µs ? ?/sec 1.04 330.4±16.70µs ? ?/sec
busy_systems/04x_entities_12_systems 1.03 428.7±11.81µs ? ?/sec 1.02 425.0±15.03µs ? ?/sec 1.00 416.4±14.26µs ? ?/sec
busy_systems/04x_entities_15_systems 1.01 529.4±12.77µs ? ?/sec 1.00 527.9±20.09µs ? ?/sec 1.00 525.4±16.45µs ? ?/sec
busy_systems/05x_entities_03_systems 1.02 136.4±5.72µs ? ?/sec 1.00 133.3±4.60µs ? ?/sec 1.18 156.7±15.90µs ? ?/sec
busy_systems/05x_entities_06_systems 1.00 266.8±12.08µs ? ?/sec 1.03 273.8±14.66µs ? ?/sec 1.07 286.1±11.79µs ? ?/sec
busy_systems/05x_entities_09_systems 1.03 402.7±13.92µs ? ?/sec 1.00 391.8±14.55µs ? ?/sec 1.09 426.6±20.73µs ? ?/sec
busy_systems/05x_entities_12_systems 1.01 533.6±21.50µs ? ?/sec 1.00 528.2±27.11µs ? ?/sec 1.06 560.0±17.33µs ? ?/sec
busy_systems/05x_entities_15_systems 1.02 674.8±27.99µs ? ?/sec 1.00 664.1±34.89µs ? ?/sec 1.06 700.7±44.73µs ? ?/sec
contrived/01x_entities_03_systems 1.30 27.3±2.90µs ? ?/sec 1.17 24.6±2.05µs ? ?/sec 1.00 21.1±1.24µs ? ?/sec
contrived/01x_entities_06_systems 1.03 42.8±3.04µs ? ?/sec 1.11 46.1±4.04µs ? ?/sec 1.00 41.5±1.58µs ? ?/sec
contrived/01x_entities_09_systems 1.01 61.4±4.00µs ? ?/sec 1.06 64.3±4.45µs ? ?/sec 1.00 60.9±3.89µs ? ?/sec
contrived/01x_entities_12_systems 1.01 81.2±4.80µs ? ?/sec 1.06 85.3±6.08µs ? ?/sec 1.00 80.7±3.33µs ? ?/sec
contrived/01x_entities_15_systems 1.00 98.1±5.48µs ? ?/sec 1.11 108.7±7.74µs ? ?/sec 1.02 99.9±5.96µs ? ?/sec
contrived/02x_entities_03_systems 1.08 33.9±2.63µs ? ?/sec 1.14 35.9±3.57µs ? ?/sec 1.00 31.5±1.46µs ? ?/sec
contrived/02x_entities_06_systems 1.00 60.6±2.18µs ? ?/sec 1.06 64.0±3.42µs ? ?/sec 1.05 63.4±2.95µs ? ?/sec
contrived/02x_entities_09_systems 1.00 92.0±5.89µs ? ?/sec 1.04 95.0±2.91µs ? ?/sec 1.00 91.6±2.73µs ? ?/sec
contrived/02x_entities_12_systems 1.06 128.5±11.27µs ? ?/sec 1.03 124.6±8.30µs ? ?/sec 1.00 121.3±3.12µs ? ?/sec
contrived/02x_entities_15_systems 1.00 151.2±7.89µs ? ?/sec 1.01 153.3±9.82µs ? ?/sec 1.02 153.5±7.67µs ? ?/sec
contrived/03x_entities_03_systems 1.04 43.9±2.27µs ? ?/sec 1.00 42.0±2.54µs ? ?/sec 1.03 43.2±1.46µs ? ?/sec
contrived/03x_entities_06_systems 1.05 86.7±5.83µs ? ?/sec 1.01 83.2±4.75µs ? ?/sec 1.00 82.3±4.09µs ? ?/sec
contrived/03x_entities_09_systems 1.01 125.5±8.38µs ? ?/sec 1.03 128.5±9.84µs ? ?/sec 1.00 124.7±5.31µs ? ?/sec
contrived/03x_entities_12_systems 1.00 160.4±3.97µs ? ?/sec 1.05 167.7±8.68µs ? ?/sec 1.02 164.0±4.07µs ? ?/sec
contrived/03x_entities_15_systems 1.03 208.9±12.52µs ? ?/sec 1.02 206.3±9.65µs ? ?/sec 1.00 202.8±8.56µs ? ?/sec
contrived/04x_entities_03_systems 1.02 54.5±2.76µs ? ?/sec 1.02 54.1±4.25µs ? ?/sec 1.00 53.1±4.04µs ? ?/sec
contrived/04x_entities_06_systems 1.05 106.4±6.15µs ? ?/sec 1.00 101.5±3.27µs ? ?/sec 1.02 103.6±5.81µs ? ?/sec
contrived/04x_entities_09_systems 1.04 160.1±10.74µs ? ?/sec 1.00 153.2±8.36µs ? ?/sec 1.01 154.7±5.95µs ? ?/sec
contrived/04x_entities_12_systems 1.00 205.8±9.71µs ? ?/sec 1.01 206.5±6.03µs ? ?/sec 1.00 205.2±8.46µs ? ?/sec
contrived/04x_entities_15_systems 1.01 251.5±9.87µs ? ?/sec 1.08 268.6±11.00µs ? ?/sec 1.00 248.4±6.81µs ? ?/sec
contrived/05x_entities_03_systems 1.00 62.7±3.15µs ? ?/sec 1.00 62.9±2.73µs ? ?/sec 1.01 63.4±3.52µs ? ?/sec
contrived/05x_entities_06_systems 1.00 126.9±5.49µs ? ?/sec 1.03 130.6±6.22µs ? ?/sec 1.00 127.2±6.37µs ? ?/sec
contrived/05x_entities_09_systems 1.00 179.5±5.15µs ? ?/sec 1.05 188.5±8.31µs ? ?/sec 1.05 187.5±6.80µs ? ?/sec
contrived/05x_entities_12_systems 1.00 244.4±7.21µs ? ?/sec 1.00 245.2±7.75µs ? ?/sec 1.04 254.6±15.85µs ? ?/sec
contrived/05x_entities_15_systems 1.01 312.3±9.83µs ? ?/sec 1.01 315.1±11.95µs ? ?/sec 1.00 310.5±10.72µs ? ?/sec
fragmented_iter/base 1.17 478.1±5.89ns ? ?/sec 1.02 416.4±18.28ns ? ?/sec 1.00 410.1±18.16ns ? ?/sec
fragmented_iter/foreach 1.00 236.3±25.69ns ? ?/sec 1.00 236.0±24.93ns ? ?/sec 1.02 241.6±29.92ns ? ?/sec
heavy_compute/base 1.00 355.8±4.53µs ? ?/sec 1.01 359.0±5.15µs ? ?/sec 1.02 364.5±5.74µs ? ?/sec
insert_commands/insert 1.02 783.8±34.26µs ? ?/sec 1.00 772.2±30.11µs ? ?/sec 1.00 774.8±33.14µs ? ?/sec
insert_commands/insert_batch 1.00 394.6±44.37µs ? ?/sec 1.03 406.7±39.73µs ? ?/sec 1.04 410.3±48.26µs ? ?/sec
query_get/50000_entities_sparse 1.11 589.0±31.37µs ? ?/sec 1.00 530.2±36.38µs ? ?/sec 2.13 1127.2±55.93µs ? ?/sec
query_get/50000_entities_table 1.00 457.1±27.07µs ? ?/sec 1.01 463.0±6.55µs ? ?/sec 1.32 601.6±12.68µs ? ?/sec
query_get_component/50000_entities_sparse 1.00 1244.8±50.69µs ? ?/sec 1.04 1289.7±74.86µs ? ?/sec 1.03 1287.5±52.23µs ? ?/sec
query_get_component/50000_entities_table 1.01 1247.2±108.41µs ? ?/sec 1.03 1273.3±90.25µs ? ?/sec 1.00 1236.8±21.18µs ? ?/sec
schedule/base 1.01 30.6±2.49µs ? ?/sec 1.03 31.2±2.25µs ? ?/sec 1.00 30.2±1.93µs ? ?/sec
simple_iter/base 1.01 13.9±0.74µs ? ?/sec 1.00 13.7±0.19µs ? ?/sec 1.00 13.7±0.17µs ? ?/sec
simple_iter/foreach 1.00 11.6±0.12µs ? ?/sec 1.00 11.6±0.15µs ? ?/sec 1.00 11.6±0.18µs ? ?/sec
simple_iter/sparse 1.00 52.0±0.22µs ? ?/sec 1.00 51.8±0.26µs ? ?/sec 1.18 61.3±0.32µs ? ?/sec
simple_iter/sparse_foreach 1.00 45.2±0.19µs ? ?/sec 1.04 46.9±0.41µs ? ?/sec 1.12 50.4±0.76µs ? ?/sec
simple_iter/system 1.00 13.7±0.29µs ? ?/sec 1.00 13.7±0.07µs ? ?/sec 1.01 13.8±0.49µs ? ?/sec
sparse_fragmented_iter/base 1.00 10.9±0.24ns ? ?/sec 1.22 13.3±0.62ns ? ?/sec 1.18 12.8±0.86ns ? ?/sec
sparse_fragmented_iter/foreach 1.00 8.9±0.22ns ? ?/sec 1.00 8.9±0.15ns ? ?/sec 1.00 8.9±0.14ns ? ?/sec
world_entity/50000_entities 1.01 426.6±0.70µs ? ?/sec 1.00 424.3±1.23µs ? ?/sec 1.00 424.3±1.15µs ? ?/sec
world_get/50000_entities_sparse 1.00 548.5±6.27µs ? ?/sec 1.04 570.1±12.14µs ? ?/sec 1.00 548.2±8.05µs ? ?/sec
world_get/50000_entities_table 1.00 916.9±13.20µs ? ?/sec 1.04 951.5±5.31µs ? ?/sec 1.01 930.5±7.69µs ? ?/sec
world_query_for_each/50000_entities_sparse 1.03 99.0±1.47µs ? ?/sec 1.03 99.0±1.27µs ? ?/sec 1.00 95.8±0.91µs ? ?/sec
world_query_for_each/50000_entities_table 1.00 27.2±0.24µs ? ?/sec 1.00 27.2±0.11µs ? ?/sec 1.00 27.2±0.10µs ? ?/sec
world_query_get/50000_entities_sparse 1.29 478.6±11.07µs ? ?/sec 1.00 372.1±6.47µs ? ?/sec 1.07 398.2±10.82µs ? ?/sec
world_query_get/50000_entities_table 1.06 274.3±4.77µs ? ?/sec 1.00 259.8±2.52µs ? ?/sec 1.05 273.4±4.26µs ? ?/sec
world_query_iter/50000_entities_sparse 1.16 114.9±0.65µs ? ?/sec 1.00 99.4±2.03µs ? ?/sec 1.03 102.8±3.31µs ? ?/sec
world_query_iter/50000_entities_table 1.00 27.3±0.78µs ? ?/sec 1.00 27.3±0.17µs ? ?/sec 1.00 27.2±0.26µs ? ?/sec
@james7132 are the Todo comments from the PR description addressed now?
Yep more or less ready now.
@bevyengine/ecs-team reviews please!
Can we do some iter/get/frag_iter benchmarks of larger / more complicated queries? This (potentially) adds a branch to each Fetch impl, instead of branching once for the entire query. These redundant branches might get optimized out, but I intentionally moved that branch out to remove the (logical) O(FETCHED_ITEMS) branches. It will be hard to compare that vs main though, given the other optimizations in this pr.
It branches on a constant, just like in set_table/set_archetype, so it should mark the unmatched branch unreachable and completely remove it at compile time. Even with just a singular fetched component type, this would have seen significant perf regression if those optimizations were not present. I'll see if I can extend the existing benchmarks to use more components/filters in the queries.
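A minimal sketch of what "branching on a constant" can look like. The names are hypothetical and the trait is deliberately simplified (the storage kind is an associated constant directly on a toy `Component` trait, and a `HashMap` stands in for a sparse set). Once the generic function is monomorphized for a concrete `T`, the `match` scrutinee is a compile-time constant, so an optimizing build can normally drop the arm that does not apply.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum StorageType {
    Table,
    SparseSet,
}

/// Simplified, hypothetical component trait: the storage kind is a constant.
trait Component: Copy {
    const STORAGE_TYPE: StorageType;
}

#[derive(Clone, Copy, Debug, PartialEq)]
struct Position(f32);
impl Component for Position {
    const STORAGE_TYPE: StorageType = StorageType::Table;
}

#[derive(Clone, Copy, Debug, PartialEq)]
struct Marker(u8);
impl Component for Marker {
    const STORAGE_TYPE: StorageType = StorageType::SparseSet;
}

/// Holds both kinds of storage; only one is actually read for a given `T`.
struct ComponentFetch<'a, T: Component> {
    table_column: &'a [T],
    sparse_set: &'a HashMap<u32, T>,
}

impl<'a, T: Component> ComponentFetch<'a, T> {
    #[inline(always)]
    fn fetch(&self, entity: &u32, table_row: &usize) -> T {
        // `T::STORAGE_TYPE` is known at compile time for each concrete `T`.
        match T::STORAGE_TYPE {
            StorageType::Table => self.table_column[*table_row],
            StorageType::SparseSet => self.sparse_set[entity],
        }
    }
}

fn main() {
    let positions = [Position(1.0), Position(2.0)];
    let no_positions: HashMap<u32, Position> = HashMap::new();
    let table_fetch = ComponentFetch { table_column: &positions, sparse_set: &no_positions };
    println!("{:?}", table_fetch.fetch(&42, &1));

    let no_markers: [Marker; 0] = [];
    let mut markers = HashMap::new();
    markers.insert(42u32, Marker(9));
    let sparse_fetch = ComponentFetch { table_column: &no_markers, sparse_set: &markers };
    println!("{:?}", sparse_fetch.fetch(&42, &0));
}
```

Whether the dead arm is actually removed in a given build is something to confirm by inspecting the generated code or benchmarking. The sketch only shows why it is possible.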
As an alternative, we could make the &T and &mut T Fetch types be reliant on an associated type on Component::Storage and completely remove the need for the internal branch. However, this might be reliant on the removal of FetchState first so that the backing state of all component fetches can be ComponentId, so we could try to do the following:
    pub trait ComponentStorage {
        type ReadFetch: for<'a> Fetch<'a, State=ComponentId>;
        type WriteFetch: for<'a> Fetch<'a, State=ComponentId>;
        type ReadOnlyWriteFetch: for<'a> Fetch<'a, State=ComponentId>;
    }

    impl<'a, T: Component> WorldQueryGats<'a> for &T {
        type ReadFetch = T::Storage::ReadFetch;
        type WriteFetch = T::Storage::ReadFetch;
        type ReadOnlyWriteFetch = T::Storage::ReadOnlyWriteFetch;
    }

    impl<'a, T> Fetch<'a> for TableReadFetch<'a, T> {
        ...
    }
Why were so many #[inline] attributes changed to #[inline(always)]? Did you benchmark this, and did it improve things?
The optimization strategy here strictly relies on having the fetch/filter_fetch calls inlined so that the compiler can discover that one or more of the parameters are not being used. I just didn't want to take chances there, particularly with some of these already having inlined sparse set and table accesses which could make the generated code larger and fall above the inlining threshold. I can test that if need be.
I have not run a microbenchmark with larger queries, but I know we have some really big ones in rendering and other parts of the engine, so as a sanity check I tested it against many_cubes, which has a mix of both normal iteration and heavy Query::get usage via the render phase. I also tested a variant of this PR that passes Entity and usize directly instead of references. Here are the stage timings for comparison:
| stage | main | this PR | this PR (copy over reference) |
|---|---|---|---|
| Full Frame | 20.52ms | 19.7ms | 19.55ms |
| First | 411.71us | 401.28us | 408.13us |
| LoadAssets | 193.32us | 186.95us | 189.87us |
| PreUpdate | 95.6us | 91.18us | 94.98us |
| Update | 53.52us | 54.41us | 54.47us |
| PostUpdate | 3.32ms | 3.05ms | 3.09ms |
| AssetEvents | 185.73us | 180us | 184.05us |
| Last | 27.11us | 26.05us | 27.29us |
| Extract | 3.7ms | 3.69ms | 3.31ms |
| Prepare | 2.64ms | 2.54ms | 2.56ms |
| Queue | 956.72us | 911.44us | 929.13us |
| Sort | 993.08us | 987.89us | 984.32us |
| Render | 7.49ms | 7.12ms | 7.28ms |
The biggest wins here are in PostUpdate, which has a heavy parallel iteration via check_visibility, and Render, where every visible entity has multiple Query::get calls made. Everything else is likely within the margin of error and generally doesn't show any significant regression in perf. For comparison, the primary query being run, the visible entities query, is defined as:
    mut visible_entity_query: Query<(
        Entity,
        &Visibility,
        &mut ComputedVisibility,
        Option<&RenderLayers>,
        Option<&Aabb>,
        Option<&NoFrustumCulling>,
        Option<&GlobalTransform>,
    )>,
This query is running in parallel, so task spawn overhead and contention notwithstanding, this query is ~10% faster with this change.
As for a more detailed explanation of why this seems to work, see #5064. In particular, this replaces a bunch of the unwrap_or_else(|| debug_checked_unreachable()) calls, which are otherwise unavoidable, with get_unchecked calls outside the fetch call. The unwrap_or_else calls seem to add quite a few more instructions, including two jumps that were otherwise supposed to be optimized out.
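For readers unfamiliar with the two patterns being compared, here is a small sketch. The `debug_checked_unreachable` function below is a hypothetical stand-in written for this example (panic in debug builds, `unreachable_unchecked` otherwise), and `read_with_unreachable` / `read_unchecked` are illustrative names, not functions from the PR.

```rust
/// Hypothetical stand-in for the helper named above: panics in debug builds,
/// and is undefined behavior to reach in release builds.
#[inline(always)]
unsafe fn debug_checked_unreachable() -> ! {
    #[cfg(debug_assertions)]
    unreachable!();
    #[cfg(not(debug_assertions))]
    std::hint::unreachable_unchecked();
}

/// Pattern 1: a checked `get`, then a promise that the `None` case cannot happen.
///
/// Safety: `row` must be less than `column.len()`.
unsafe fn read_with_unreachable(column: &[u32], row: usize) -> u32 {
    *column
        .get(row)
        .unwrap_or_else(|| unsafe { debug_checked_unreachable() })
}

/// Pattern 2: skip the bounds check entirely with `get_unchecked`.
///
/// Safety: `row` must be less than `column.len()`.
unsafe fn read_unchecked(column: &[u32], row: usize) -> u32 {
    debug_assert!(row < column.len());
    unsafe { *column.get_unchecked(row) }
}

fn main() {
    let column = [10, 20, 30];
    // SAFETY: index 1 is in bounds for a three-element slice.
    let (a, b) = unsafe { (read_with_unreachable(&column, 1), read_unchecked(&column, 1)) };
    assert_eq!(a, b);
    println!("{}", a);
}
```

Both functions return the same value for in-bounds rows. Any difference is in the code the compiler emits, which is what the linked issue discusses.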
@cart, this is tricky but well-motivated, reviewed and benchmarked. Do you want to do a review pass on this?
I want to redo those stage timing measurements. There have been quite a few optimizations merged in since that was last measured. I'm fairly sure this is still not a regression, but I'd like to double check before pulling the trigger on this.
Yeah, I'd like to do a pass. I'd also still like to see a microbenchmark of large queries with many fetch calls. The microbenchmarks in this PR seem to show that main is "slightly" faster in many cases. If that "slightly" scales with query size, we'll want to weigh the Query::get wins against that cost. We can't do that without "clean" numbers.
The "many cubes" benchmark also seems roughly compatible with the "this regresses fetch for iteration" interpretation. We see nice improvements in some areas, which apparently align with heavy Query::get calls. But then we tend to see small-ish regressions everywhere else.
For the "many cubes" numbers, it's hard to say how big the (potential) wins and losses are, because we're constantly interleaving query iteration (which might have regressed for large queries) and query gets (which have good evidence suggesting they got a perf boost).
Redid the many_cubes measurements. Looks to be a net gain across the board here. Included a few of the systems that are strictly iteration bound as well.
| stage/system | main | this PR |
|---|---|---|
| First | 355.95us | 333.55us |
| LoadAssets | 171.93us | 158.57us |
| PreUpdate | 92.43us | 84.61us |
| Update | 52.09us | 50.09us |
| PostUpdate | 2.06ms | 1.79ms |
| AssetEvents | 158.36us | 151.17us |
| Last | 27.79us | 25.95us |
| Extract | 3.54ms | 3.47ms |
| Prepare | 2.57ms | 2.33ms |
| Queue | 868.45us | 810.47us |
| Sort | 218.88us | 206.89us |
| Render | 7.56ms | 7.29ms |
| check_visibility | 1.3ms | 1.22ms |
| check_visibility par_for_each (1024 entities) | 14.53us | 14.51us |
| extract_meshes | 1.55ms | 1.44ms |
| extract_visible_components | 530.61us | 473.62us |
| prepare_uniform_components | 1.13ms | 1.05ms |
| full frame | 18.12ms | 17.1ms |
Redid the microbenchmarks, including the ones in #5123. The results are odd. They do indeed show that even the wider queries benefit from this change. However, both the busy_systems and contrived benchmarks consistently regressed further. I'm not sure if this is due to the parallel scheduler in the mix or some other influence, because all of the other iteration/get benchmarks show either no regression or substantially better results.
Updated Benchmarks
group cleanup-fetch main
----- ------------- ----
add_remove_component/sparse_set 1.02 1322.0±76.98µs ? ?/sec 1.00 1301.0±82.37µs ? ?/sec
add_remove_component/table 1.03 1682.7±49.97µs ? ?/sec 1.00 1629.8±35.87µs ? ?/sec
add_remove_component_big/sparse_set 1.00 1435.7±299.23µs ? ?/sec 1.03 1476.2±296.28µs ? ?/sec
add_remove_component_big/table 1.01 2.9±0.05ms ? ?/sec 1.00 2.9±0.23ms ? ?/sec
added_archetypes/archetype_count/100 1.00 186.2±10.35µs ? ?/sec 1.00 185.6±9.29µs ? ?/sec
added_archetypes/archetype_count/1000 1.00 688.2±20.12µs ? ?/sec 1.06 728.0±50.14µs ? ?/sec
added_archetypes/archetype_count/10000 1.00 14.2±1.39ms ? ?/sec 1.03 14.6±2.00ms ? ?/sec
added_archetypes/archetype_count/200 1.03 234.2±10.67µs ? ?/sec 1.00 226.6±12.16µs ? ?/sec
added_archetypes/archetype_count/2000 1.00 1355.2±33.30µs ? ?/sec 1.08 1465.4±125.18µs ? ?/sec
added_archetypes/archetype_count/500 1.00 403.4±48.12µs ? ?/sec 1.02 413.3±29.36µs ? ?/sec
added_archetypes/archetype_count/5000 1.00 4.8±0.56ms ? ?/sec 1.15 5.5±0.83ms ? ?/sec
busy_systems/01x_entities_03_systems 1.16 40.4±2.18µs ? ?/sec 1.00 34.9±1.19µs ? ?/sec
busy_systems/01x_entities_06_systems 1.17 78.1±2.53µs ? ?/sec 1.00 66.6±2.78µs ? ?/sec
busy_systems/01x_entities_09_systems 1.26 118.6±4.26µs ? ?/sec 1.00 94.0±2.86µs ? ?/sec
busy_systems/01x_entities_12_systems 1.20 147.7±4.45µs ? ?/sec 1.00 123.1±5.81µs ? ?/sec
busy_systems/01x_entities_15_systems 1.16 178.7±4.28µs ? ?/sec 1.00 153.6±4.88µs ? ?/sec
busy_systems/02x_entities_03_systems 1.26 75.5±4.10µs ? ?/sec 1.00 59.9±2.85µs ? ?/sec
busy_systems/02x_entities_06_systems 1.15 137.2±3.85µs ? ?/sec 1.00 119.0±7.39µs ? ?/sec
busy_systems/02x_entities_09_systems 1.27 217.6±6.59µs ? ?/sec 1.00 171.9±4.08µs ? ?/sec
busy_systems/02x_entities_12_systems 1.17 270.3±6.97µs ? ?/sec 1.00 230.6±8.73µs ? ?/sec
busy_systems/02x_entities_15_systems 1.20 334.5±11.35µs ? ?/sec 1.00 278.8±6.34µs ? ?/sec
busy_systems/03x_entities_03_systems 1.02 102.5±4.59µs ? ?/sec 1.00 100.3±4.70µs ? ?/sec
busy_systems/03x_entities_06_systems 1.15 201.1±8.64µs ? ?/sec 1.00 174.8±5.38µs ? ?/sec
busy_systems/03x_entities_09_systems 1.30 323.3±13.82µs ? ?/sec 1.00 247.9±6.90µs ? ?/sec
busy_systems/03x_entities_12_systems 1.21 389.5±14.12µs ? ?/sec 1.00 320.9±9.25µs ? ?/sec
busy_systems/03x_entities_15_systems 1.18 482.4±12.17µs ? ?/sec 1.00 407.1±8.99µs ? ?/sec
busy_systems/04x_entities_03_systems 1.25 138.9±7.21µs ? ?/sec 1.00 111.1±4.86µs ? ?/sec
busy_systems/04x_entities_06_systems 1.22 273.1±12.83µs ? ?/sec 1.00 223.3±6.85µs ? ?/sec
busy_systems/04x_entities_09_systems 1.24 416.1±12.86µs ? ?/sec 1.00 336.6±12.67µs ? ?/sec
busy_systems/04x_entities_12_systems 1.22 513.7±16.52µs ? ?/sec 1.00 421.3±12.56µs ? ?/sec
busy_systems/04x_entities_15_systems 1.16 615.1±26.75µs ? ?/sec 1.00 532.2±15.00µs ? ?/sec
busy_systems/05x_entities_03_systems 1.32 176.5±10.36µs ? ?/sec 1.00 133.6±4.76µs ? ?/sec
busy_systems/05x_entities_06_systems 1.40 366.8±14.17µs ? ?/sec 1.00 262.7±8.66µs ? ?/sec
busy_systems/05x_entities_09_systems 1.25 514.7±20.64µs ? ?/sec 1.00 410.9±12.45µs ? ?/sec
busy_systems/05x_entities_12_systems 1.20 659.0±22.17µs ? ?/sec 1.00 547.0±13.86µs ? ?/sec
busy_systems/05x_entities_15_systems 1.27 838.4±30.31µs ? ?/sec 1.00 659.7±18.72µs ? ?/sec
contrived/01x_entities_03_systems 1.21 27.2±0.52µs ? ?/sec 1.00 22.5±1.62µs ? ?/sec
contrived/01x_entities_06_systems 1.25 52.8±1.32µs ? ?/sec 1.00 42.4±2.02µs ? ?/sec
contrived/01x_entities_09_systems 1.21 75.1±2.46µs ? ?/sec 1.00 62.0±2.42µs ? ?/sec
contrived/01x_entities_12_systems 1.21 98.9±1.88µs ? ?/sec 1.00 81.6±4.21µs ? ?/sec
contrived/01x_entities_15_systems 1.25 125.8±3.28µs ? ?/sec 1.00 100.6±6.48µs ? ?/sec
contrived/02x_entities_03_systems 1.37 45.4±1.72µs ? ?/sec 1.00 33.1±2.56µs ? ?/sec
contrived/02x_entities_06_systems 1.21 79.0±1.47µs ? ?/sec 1.00 65.2±4.73µs ? ?/sec
contrived/02x_entities_09_systems 1.22 116.3±2.60µs ? ?/sec 1.00 95.1±4.00µs ? ?/sec
contrived/02x_entities_12_systems 1.22 152.8±4.98µs ? ?/sec 1.00 125.2±3.68µs ? ?/sec
contrived/02x_entities_15_systems 1.23 186.0±3.24µs ? ?/sec 1.00 150.7±7.38µs ? ?/sec
contrived/03x_entities_03_systems 1.31 54.8±1.61µs ? ?/sec 1.00 41.9±2.10µs ? ?/sec
contrived/03x_entities_06_systems 1.28 105.3±2.34µs ? ?/sec 1.00 82.0±2.44µs ? ?/sec
contrived/03x_entities_09_systems 1.25 157.9±3.23µs ? ?/sec 1.00 126.7±5.21µs ? ?/sec
contrived/03x_entities_12_systems 1.17 196.6±5.02µs ? ?/sec 1.00 167.6±5.79µs ? ?/sec
contrived/03x_entities_15_systems 1.13 238.8±5.80µs ? ?/sec 1.00 212.2±10.71µs ? ?/sec
contrived/04x_entities_03_systems 1.39 73.1±2.13µs ? ?/sec 1.00 52.7±2.80µs ? ?/sec
contrived/04x_entities_06_systems 1.20 133.8±3.18µs ? ?/sec 1.00 111.8±8.56µs ? ?/sec
contrived/04x_entities_09_systems 1.23 189.2±5.33µs ? ?/sec 1.00 154.0±4.34µs ? ?/sec
contrived/04x_entities_12_systems 1.17 241.6±5.61µs ? ?/sec 1.00 206.2±8.28µs ? ?/sec
contrived/04x_entities_15_systems 1.12 295.4±7.79µs ? ?/sec 1.00 262.9±9.39µs ? ?/sec
contrived/05x_entities_03_systems 1.38 84.0±2.42µs ? ?/sec 1.00 60.9±1.98µs ? ?/sec
contrived/05x_entities_06_systems 1.32 159.1±3.13µs ? ?/sec 1.00 120.1±2.42µs ? ?/sec
contrived/05x_entities_09_systems 1.32 241.3±5.63µs ? ?/sec 1.00 182.5±5.08µs ? ?/sec
contrived/05x_entities_12_systems 1.20 299.3±12.07µs ? ?/sec 1.00 248.8±10.79µs ? ?/sec
contrived/05x_entities_15_systems 1.18 362.3±8.39µs ? ?/sec 1.00 306.5±16.78µs ? ?/sec
empty_commands/0_entities 1.00 5.2±0.27ns ? ?/sec 1.00 5.2±0.27ns ? ?/sec
fake_commands/2000_commands 1.06 7.2±0.23µs ? ?/sec 1.00 6.9±0.15µs ? ?/sec
fake_commands/4000_commands 1.13 15.2±0.62µs ? ?/sec 1.00 13.4±0.12µs ? ?/sec
fake_commands/6000_commands 1.13 22.6±0.80µs ? ?/sec 1.00 20.0±0.15µs ? ?/sec
fake_commands/8000_commands 1.07 28.8±0.29µs ? ?/sec 1.00 26.8±0.19µs ? ?/sec
fragmented_iter/base 1.00 352.2±10.92ns ? ?/sec 1.34 470.4±31.26ns ? ?/sec
fragmented_iter/foreach 1.01 245.8±26.37ns ? ?/sec 1.00 242.8±23.09ns ? ?/sec
fragmented_iter/foreach_wide 1.00 4.0±0.23µs ? ?/sec 1.02 4.1±0.54µs ? ?/sec
fragmented_iter/wide 1.00 4.5±0.19µs ? ?/sec 1.34 6.1±0.21µs ? ?/sec
get_component/base 1.00 1051.9±13.21µs ? ?/sec 1.06 1112.2±43.40µs ? ?/sec
get_component/system 1.00 763.4±33.24µs ? ?/sec 1.06 806.3±22.00µs ? ?/sec
get_or_spawn/batched 1.00 419.0±53.91µs ? ?/sec 1.01 421.2±47.27µs ? ?/sec
get_or_spawn/individual 1.01 948.7±77.17µs ? ?/sec 1.00 937.8±78.09µs ? ?/sec
heavy_compute/base 1.01 361.4±3.93µs ? ?/sec 1.00 357.9±3.21µs ? ?/sec
insert_commands/insert 1.00 800.2±31.11µs ? ?/sec 1.04 830.3±99.12µs ? ?/sec
insert_commands/insert_batch 1.00 403.5±40.45µs ? ?/sec 1.02 410.4±38.46µs ? ?/sec
query_get/50000_entities_sparse 1.00 643.1±45.04µs ? ?/sec 1.99 1280.2±34.58µs ? ?/sec
query_get/50000_entities_table 1.00 577.9±10.12µs ? ?/sec 1.28 737.7±12.53µs ? ?/sec
query_get_component/50000_entities_sparse 1.01 1225.8±95.93µs ? ?/sec 1.00 1209.1±65.14µs ? ?/sec
query_get_component/50000_entities_table 1.04 1244.6±112.45µs ? ?/sec 1.00 1192.1±17.73µs ? ?/sec
simple_insert/base 1.08 619.4±94.59µs ? ?/sec 1.00 576.0±19.23µs ? ?/sec
simple_insert/unbatched 1.00 1407.8±60.24µs ? ?/sec 1.01 1419.9±33.00µs ? ?/sec
simple_iter/base 1.00 11.0±0.06µs ? ?/sec 1.25 13.7±0.10µs ? ?/sec
simple_iter/foreach 1.01 10.9±0.06µs ? ?/sec 1.00 10.8±0.10µs ? ?/sec
simple_iter/foreach_wide 1.00 44.2±0.66µs ? ?/sec 1.10 48.4±4.06µs ? ?/sec
simple_iter/sparse 1.00 47.8±0.44µs ? ?/sec 1.15 54.8±0.58µs ? ?/sec
simple_iter/sparse_foreach 1.00 43.7±2.89µs ? ?/sec 1.12 49.1±0.33µs ? ?/sec
simple_iter/sparse_foreach_wide 1.00 241.9±3.14µs ? ?/sec 1.08 262.1±4.61µs ? ?/sec
simple_iter/sparse_wide 1.00 252.6±2.16µs ? ?/sec 1.11 281.6±4.89µs ? ?/sec
simple_iter/system 1.00 11.0±0.13µs ? ?/sec 1.25 13.7±0.15µs ? ?/sec
simple_iter/wide 1.00 55.5±0.22µs ? ?/sec 1.16 64.3±0.52µs ? ?/sec
sized_commands_0_bytes/2000_commands 1.13 5.1±0.08µs ? ?/sec 1.00 4.5±0.03µs ? ?/sec
sized_commands_0_bytes/4000_commands 1.11 10.2±0.09µs ? ?/sec 1.00 9.1±0.06µs ? ?/sec
sized_commands_0_bytes/6000_commands 1.13 15.4±0.20µs ? ?/sec 1.00 13.7±0.11µs ? ?/sec
sized_commands_0_bytes/8000_commands 1.12 20.4±0.28µs ? ?/sec 1.00 18.3±0.22µs ? ?/sec
sized_commands_12_bytes/2000_commands 1.00 7.1±0.06µs ? ?/sec 1.00 7.1±0.04µs ? ?/sec
sized_commands_12_bytes/4000_commands 1.00 14.4±0.07µs ? ?/sec 1.01 14.6±0.23µs ? ?/sec
sized_commands_12_bytes/6000_commands 1.01 22.1±0.33µs ? ?/sec 1.00 21.8±0.31µs ? ?/sec
sized_commands_12_bytes/8000_commands 1.00 28.9±0.40µs ? ?/sec 1.00 28.9±0.44µs ? ?/sec
sized_commands_512_bytes/2000_commands 1.00 108.9±2.50µs ? ?/sec 1.02 110.8±2.61µs ? ?/sec
sized_commands_512_bytes/4000_commands 1.00 224.9±23.23µs ? ?/sec 1.01 226.1±12.77µs ? ?/sec
sized_commands_512_bytes/6000_commands 1.00 344.3±44.37µs ? ?/sec 1.01 347.7±45.95µs ? ?/sec
sized_commands_512_bytes/8000_commands 1.01 471.0±74.12µs ? ?/sec 1.00 466.3±58.84µs ? ?/sec
sparse_fragmented_iter/base 1.02 11.7±1.03ns ? ?/sec 1.00 11.6±0.57ns ? ?/sec
sparse_fragmented_iter/foreach 1.00 9.0±0.32ns ? ?/sec 1.03 9.2±0.24ns ? ?/sec
sparse_fragmented_iter/foreach_wide 1.00 42.1±1.16ns ? ?/sec 1.06 44.5±5.25ns ? ?/sec
sparse_fragmented_iter/wide 1.00 72.8±5.68ns ? ?/sec 1.18 85.9±5.85ns ? ?/sec
spawn_commands/2000_entities 1.06 275.0±38.92µs ? ?/sec 1.00 259.1±22.15µs ? ?/sec
spawn_commands/4000_entities 1.00 518.1±42.46µs ? ?/sec 1.00 518.6±30.19µs ? ?/sec
spawn_commands/6000_entities 1.03 777.7±83.67µs ? ?/sec 1.00 753.6±54.53µs ? ?/sec
spawn_commands/8000_entities 1.01 1003.0±97.98µs ? ?/sec 1.00 996.2±96.49µs ? ?/sec
world_entity/50000_entities 1.00 425.2±2.51µs ? ?/sec 1.01 427.5±0.97µs ? ?/sec
world_get/50000_entities_sparse 1.00 562.1±9.75µs ? ?/sec 1.03 578.7±5.37µs ? ?/sec
world_get/50000_entities_table 1.00 911.2±15.13µs ? ?/sec 1.03 943.0±8.83µs ? ?/sec
world_query_for_each/50000_entities_sparse 1.00 84.5±0.66µs ? ?/sec 1.00 84.1±2.73µs ? ?/sec
world_query_for_each/50000_entities_table 1.00 27.2±0.14µs ? ?/sec 1.00 27.2±0.13µs ? ?/sec
world_query_get/50000_entities_sparse 1.00 464.8±8.55µs ? ?/sec 1.04 483.1±4.63µs ? ?/sec
world_query_get/50000_entities_sparse_wide 1.00 1420.2±53.48µs ? ?/sec 1.07 1518.6±28.91µs ? ?/sec
world_query_get/50000_entities_table 1.00 402.2±3.20µs ? ?/sec 1.09 437.5±4.37µs ? ?/sec
world_query_get/50000_entities_table_wide 1.01 811.5±6.02µs ? ?/sec 1.00 804.2±8.17µs ? ?/sec
world_query_iter/50000_entities_sparse 1.01 97.9±8.14µs ? ?/sec 1.00 96.6±1.41µs ? ?/sec
world_query_iter/50000_entities_table 1.00 27.2±0.45µs ? ?/sec 1.01 27.5±0.42µs ? ?/sec
Cross-checking this against the results from before and after switching to values over references (4d27afc), the change here does seem to line up with it, and it's the only other substantive change since the last microbenchmark. I'm rerunning the microbenchmarks with that change reverted.
Completed both benchmarks and a timing test against many_cubes and it does seem like it's coming from that change. The stage timings seem to suggest that Query::get is indeed faster with copied values over references, which is making Render faster (it's where the 0.3ms difference is likely coming from), but the microbenchmark seems to show the aforementioned regression. I think we should stick with the current pass-by-value results given the more practical stage timings, but I'll leave the decision up to @cart as to which one we should trust.
`many_cubes` stage timings
| stage/system | main | this PR | this PR (reference) |
|---|---|---|---|
| First | 355.95us | 344.28us | 342.84us |
| LoadAssets | 171.93us | 163.25us | 163.09us |
| PreUpdate | 92.43us | 85.53us | 84.4us |
| Update | 52.09us | 50.1us | 48.45us |
| PostUpdate | 2.06ms | 1.83ms | 1.83ms |
| AssetEvents | 158.36us | 153.82us | 153.35us |
| Last | 27.79us | 27.83us | 25.3us |
| Extract | 3.54ms | 3.4ms | 3.43ms |
| Prepare | 2.57ms | 2.34ms | 2.31ms |
| Queue | 868.45us | 825.1us | 827.37us |
| Sort | 218.88us | 209.41us | 206.23us |
| Render | 7.56ms | 7.27ms | 7.59ms |
| check_visibility | 1.3ms | 1.24ms | 1.25ms |
| check_visibility par_for_each (1024 entities) | 14.53us | 14.59us | 14.31us |
| extract_meshes | 1.55ms | 1.48ms | 1.42ms |
| extract_visible_components | 530.61us | 478.62us | 469.55us |
| prepare_uniform_components | 1.13ms | 1.03ms | 1.04ms |
| full frame | 18.12ms | 17.08ms | 17.42ms |
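For reference, the two variants compared in the table above differ only in how the arguments reach the fetch call. The following is a minimal sketch of that difference, not the PR's actual code: `Entity`, `Column`, `fetch_by_ref`, and `fetch_by_value` are simplified, hypothetical stand-ins, and only the argument shapes (`&Entity`/`&usize` versus `Entity`/`usize`) come from the discussion.

```rust
// Hypothetical, simplified stand-ins; these are not Bevy's real definitions.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Entity(u32);

struct Column {
    entities: Vec<Entity>,
    data: Vec<f32>,
}

impl Column {
    // Reference-passing shape: both arguments arrive as references.
    #[inline(always)]
    fn fetch_by_ref(&self, _entity: &Entity, table_row: &usize) -> f32 {
        self.data[*table_row]
    }

    // Value-passing shape: both arguments are small `Copy` values.
    #[inline(always)]
    fn fetch_by_value(&self, _entity: Entity, table_row: usize) -> f32 {
        self.data[table_row]
    }
}

// The outer loop owns the row/entity bookkeeping and hands both to the fetch.
fn sum_by_ref(column: &Column) -> f32 {
    let mut total = 0.0;
    for (row, entity) in column.entities.iter().enumerate() {
        total += column.fetch_by_ref(entity, &row);
    }
    total
}

fn sum_by_value(column: &Column) -> f32 {
    let mut total = 0.0;
    for (row, entity) in column.entities.iter().copied().enumerate() {
        total += column.fetch_by_value(entity, row);
    }
    total
}
```

Both loops compute the same result; whether one shape optimizes better than the other is exactly what the timings above are trying to measure, and this sketch makes no claim either way.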
Microbenchmark Results
group cleanup-fetch cleanup-fetch-reference main
----- ------------- ----------------------- ----
add_remove_component/sparse_set 1.02 1322.0±76.98µs ? ?/sec 1.01 1309.9±74.91µs ? ?/sec 1.00 1301.0±82.37µs ? ?/sec
add_remove_component/table 1.03 1682.7±49.97µs ? ?/sec 1.05 1711.4±52.17µs ? ?/sec 1.00 1629.8±35.87µs ? ?/sec
add_remove_component_big/sparse_set 1.00 1435.7±299.23µs ? ?/sec 1.01 1453.0±256.37µs ? ?/sec 1.03 1476.2±296.28µs ? ?/sec
add_remove_component_big/table 1.01 2.9±0.05ms ? ?/sec 1.01 2.9±0.19ms ? ?/sec 1.00 2.9±0.23ms ? ?/sec
added_archetypes/archetype_count/100 1.10 186.2±10.35µs ? ?/sec 1.00 169.6±12.51µs ? ?/sec 1.09 185.6±9.29µs ? ?/sec
added_archetypes/archetype_count/1000 1.00 688.2±20.12µs ? ?/sec 1.04 714.3±15.19µs ? ?/sec 1.06 728.0±50.14µs ? ?/sec
added_archetypes/archetype_count/10000 1.05 14.2±1.39ms ? ?/sec 1.00 13.6±1.15ms ? ?/sec 1.07 14.6±2.00ms ? ?/sec
added_archetypes/archetype_count/200 1.03 234.2±10.67µs ? ?/sec 1.05 236.9±10.99µs ? ?/sec 1.00 226.6±12.16µs ? ?/sec
added_archetypes/archetype_count/2000 1.00 1355.2±33.30µs ? ?/sec 1.03 1391.9±51.10µs ? ?/sec 1.08 1465.4±125.18µs ? ?/sec
added_archetypes/archetype_count/500 1.00 403.4±48.12µs ? ?/sec 1.03 414.0±10.68µs ? ?/sec 1.02 413.3±29.36µs ? ?/sec
added_archetypes/archetype_count/5000 1.01 4.8±0.56ms ? ?/sec 1.00 4.8±0.37ms ? ?/sec 1.16 5.5±0.83ms ? ?/sec
busy_systems/01x_entities_03_systems 1.27 40.4±2.18µs ? ?/sec 1.00 31.9±1.04µs ? ?/sec 1.09 34.9±1.19µs ? ?/sec
busy_systems/01x_entities_06_systems 1.19 78.1±2.53µs ? ?/sec 1.00 65.7±2.16µs ? ?/sec 1.01 66.6±2.78µs ? ?/sec
busy_systems/01x_entities_09_systems 1.26 118.6±4.26µs ? ?/sec 1.06 99.3±2.69µs ? ?/sec 1.00 94.0±2.86µs ? ?/sec
busy_systems/01x_entities_12_systems 1.20 147.7±4.45µs ? ?/sec 1.03 127.0±6.14µs ? ?/sec 1.00 123.1±5.81µs ? ?/sec
busy_systems/01x_entities_15_systems 1.16 178.7±4.28µs ? ?/sec 1.03 158.3±5.73µs ? ?/sec 1.00 153.6±4.88µs ? ?/sec
busy_systems/02x_entities_03_systems 1.26 75.5±4.10µs ? ?/sec 1.04 62.2±3.28µs ? ?/sec 1.00 59.9±2.85µs ? ?/sec
busy_systems/02x_entities_06_systems 1.21 137.2±3.85µs ? ?/sec 1.00 113.2±4.10µs ? ?/sec 1.05 119.0±7.39µs ? ?/sec
busy_systems/02x_entities_09_systems 1.30 217.6±6.59µs ? ?/sec 1.00 167.0±4.14µs ? ?/sec 1.03 171.9±4.08µs ? ?/sec
busy_systems/02x_entities_12_systems 1.18 270.3±6.97µs ? ?/sec 1.00 228.4±10.32µs ? ?/sec 1.01 230.6±8.73µs ? ?/sec
busy_systems/02x_entities_15_systems 1.20 334.5±11.35µs ? ?/sec 1.02 283.8±7.82µs ? ?/sec 1.00 278.8±6.34µs ? ?/sec
busy_systems/03x_entities_03_systems 1.17 102.5±4.59µs ? ?/sec 1.00 88.0±5.52µs ? ?/sec 1.14 100.3±4.70µs ? ?/sec
busy_systems/03x_entities_06_systems 1.21 201.1±8.64µs ? ?/sec 1.00 166.1±5.74µs ? ?/sec 1.05 174.8±5.38µs ? ?/sec
busy_systems/03x_entities_09_systems 1.30 323.3±13.82µs ? ?/sec 1.01 250.8±7.15µs ? ?/sec 1.00 247.9±6.90µs ? ?/sec
busy_systems/03x_entities_12_systems 1.21 389.5±14.12µs ? ?/sec 1.05 336.7±11.48µs ? ?/sec 1.00 320.9±9.25µs ? ?/sec
busy_systems/03x_entities_15_systems 1.18 482.4±12.17µs ? ?/sec 1.00 408.0±9.53µs ? ?/sec 1.00 407.1±8.99µs ? ?/sec
busy_systems/04x_entities_03_systems 1.25 138.9±7.21µs ? ?/sec 1.03 114.1±3.90µs ? ?/sec 1.00 111.1±4.86µs ? ?/sec
busy_systems/04x_entities_06_systems 1.22 273.1±12.83µs ? ?/sec 1.01 225.6±12.09µs ? ?/sec 1.00 223.3±6.85µs ? ?/sec
busy_systems/04x_entities_09_systems 1.27 416.1±12.86µs ? ?/sec 1.00 327.7±11.90µs ? ?/sec 1.03 336.6±12.67µs ? ?/sec
busy_systems/04x_entities_12_systems 1.22 513.7±16.52µs ? ?/sec 1.04 439.9±15.52µs ? ?/sec 1.00 421.3±12.56µs ? ?/sec
busy_systems/04x_entities_15_systems 1.16 615.1±26.75µs ? ?/sec 1.04 551.2±23.65µs ? ?/sec 1.00 532.2±15.00µs ? ?/sec
busy_systems/05x_entities_03_systems 1.32 176.5±10.36µs ? ?/sec 1.17 156.4±8.72µs ? ?/sec 1.00 133.6±4.76µs ? ?/sec
busy_systems/05x_entities_06_systems 1.40 366.8±14.17µs ? ?/sec 1.06 279.3±14.23µs ? ?/sec 1.00 262.7±8.66µs ? ?/sec
busy_systems/05x_entities_09_systems 1.25 514.7±20.64µs ? ?/sec 1.03 422.6±29.00µs ? ?/sec 1.00 410.9±12.45µs ? ?/sec
busy_systems/05x_entities_12_systems 1.20 659.0±22.17µs ? ?/sec 1.02 556.9±22.44µs ? ?/sec 1.00 547.0±13.86µs ? ?/sec
busy_systems/05x_entities_15_systems 1.27 838.4±30.31µs ? ?/sec 1.02 675.9±21.57µs ? ?/sec 1.00 659.7±18.72µs ? ?/sec
contrived/01x_entities_03_systems 1.21 27.2±0.52µs ? ?/sec 1.04 23.5±1.38µs ? ?/sec 1.00 22.5±1.62µs ? ?/sec
contrived/01x_entities_06_systems 1.25 52.8±1.32µs ? ?/sec 1.09 46.4±3.15µs ? ?/sec 1.00 42.4±2.02µs ? ?/sec
contrived/01x_entities_09_systems 1.21 75.1±2.46µs ? ?/sec 1.07 66.4±2.19µs ? ?/sec 1.00 62.0±2.42µs ? ?/sec
contrived/01x_entities_12_systems 1.21 98.9±1.88µs ? ?/sec 1.06 86.7±5.42µs ? ?/sec 1.00 81.6±4.21µs ? ?/sec
contrived/01x_entities_15_systems 1.25 125.8±3.28µs ? ?/sec 1.02 103.0±5.06µs ? ?/sec 1.00 100.6±6.48µs ? ?/sec
contrived/02x_entities_03_systems 1.37 45.4±1.72µs ? ?/sec 1.09 36.1±1.23µs ? ?/sec 1.00 33.1±2.56µs ? ?/sec
contrived/02x_entities_06_systems 1.21 79.0±1.47µs ? ?/sec 1.05 68.2±3.08µs ? ?/sec 1.00 65.2±4.73µs ? ?/sec
contrived/02x_entities_09_systems 1.22 116.3±2.60µs ? ?/sec 1.06 101.3±4.39µs ? ?/sec 1.00 95.1±4.00µs ? ?/sec
contrived/02x_entities_12_systems 1.22 152.8±4.98µs ? ?/sec 1.03 128.7±9.98µs ? ?/sec 1.00 125.2±3.68µs ? ?/sec
contrived/02x_entities_15_systems 1.23 186.0±3.24µs ? ?/sec 1.05 157.7±9.87µs ? ?/sec 1.00 150.7±7.38µs ? ?/sec
contrived/03x_entities_03_systems 1.31 54.8±1.61µs ? ?/sec 1.03 43.0±3.19µs ? ?/sec 1.00 41.9±2.10µs ? ?/sec
contrived/03x_entities_06_systems 1.28 105.3±2.34µs ? ?/sec 1.09 89.5±4.53µs ? ?/sec 1.00 82.0±2.44µs ? ?/sec
contrived/03x_entities_09_systems 1.25 157.9±3.23µs ? ?/sec 1.12 142.2±2.78µs ? ?/sec 1.00 126.7±5.21µs ? ?/sec
contrived/03x_entities_12_systems 1.17 196.6±5.02µs ? ?/sec 1.09 182.7±7.39µs ? ?/sec 1.00 167.6±5.79µs ? ?/sec
contrived/03x_entities_15_systems 1.13 238.8±5.80µs ? ?/sec 1.00 211.0±4.92µs ? ?/sec 1.01 212.2±10.71µs ? ?/sec
contrived/04x_entities_03_systems 1.39 73.1±2.13µs ? ?/sec 1.10 58.1±2.11µs ? ?/sec 1.00 52.7±2.80µs ? ?/sec
contrived/04x_entities_06_systems 1.23 133.8±3.18µs ? ?/sec 1.00 109.1±8.09µs ? ?/sec 1.03 111.8±8.56µs ? ?/sec
contrived/04x_entities_09_systems 1.23 189.2±5.33µs ? ?/sec 1.04 160.0±5.09µs ? ?/sec 1.00 154.0±4.34µs ? ?/sec
contrived/04x_entities_12_systems 1.17 241.6±5.61µs ? ?/sec 1.04 214.5±6.91µs ? ?/sec 1.00 206.2±8.28µs ? ?/sec
contrived/04x_entities_15_systems 1.12 295.4±7.79µs ? ?/sec 1.02 268.6±5.91µs ? ?/sec 1.00 262.9±9.39µs ? ?/sec
contrived/05x_entities_03_systems 1.38 84.0±2.42µs ? ?/sec 1.07 65.1±3.21µs ? ?/sec 1.00 60.9±1.98µs ? ?/sec
contrived/05x_entities_06_systems 1.32 159.1±3.13µs ? ?/sec 1.07 128.8±5.30µs ? ?/sec 1.00 120.1±2.42µs ? ?/sec
contrived/05x_entities_09_systems 1.32 241.3±5.63µs ? ?/sec 1.04 190.3±6.25µs ? ?/sec 1.00 182.5±5.08µs ? ?/sec
contrived/05x_entities_12_systems 1.20 299.3±12.07µs ? ?/sec 1.01 251.3±6.89µs ? ?/sec 1.00 248.8±10.79µs ? ?/sec
contrived/05x_entities_15_systems 1.18 362.3±8.39µs ? ?/sec 1.00 306.2±9.39µs ? ?/sec 1.00 306.5±16.78µs ? ?/sec
fake_commands/2000_commands 1.06 7.2±0.23µs ? ?/sec 1.10 7.6±0.10µs ? ?/sec 1.00 6.9±0.15µs ? ?/sec
fake_commands/4000_commands 1.13 15.2±0.62µs ? ?/sec 1.08 14.4±0.15µs ? ?/sec 1.00 13.4±0.12µs ? ?/sec
fake_commands/6000_commands 1.13 22.6±0.80µs ? ?/sec 1.09 21.7±0.27µs ? ?/sec 1.00 20.0±0.15µs ? ?/sec
fake_commands/8000_commands 1.07 28.8±0.29µs ? ?/sec 1.08 28.9±0.48µs ? ?/sec 1.00 26.8±0.19µs ? ?/sec
fragmented_iter/base 1.00 352.2±10.92ns ? ?/sec 1.00 350.7±11.30ns ? ?/sec 1.34 470.4±31.26ns ? ?/sec
fragmented_iter/foreach 1.02 245.8±26.37ns ? ?/sec 1.00 241.9±23.40ns ? ?/sec 1.00 242.8±23.09ns ? ?/sec
fragmented_iter/foreach_wide 1.01 4.0±0.23µs ? ?/sec 1.00 4.0±0.13µs ? ?/sec 1.03 4.1±0.54µs ? ?/sec
fragmented_iter/wide 1.02 4.5±0.19µs ? ?/sec 1.00 4.5±0.10µs ? ?/sec 1.37 6.1±0.21µs ? ?/sec
get_component/base 1.00 1051.9±13.21µs ? ?/sec 1.02 1070.2±32.42µs ? ?/sec 1.06 1112.2±43.40µs ? ?/sec
get_component/system 1.01 763.4±33.24µs ? ?/sec 1.00 753.0±7.72µs ? ?/sec 1.07 806.3±22.00µs ? ?/sec
get_or_spawn/batched 1.01 419.0±53.91µs ? ?/sec 1.00 414.8±37.75µs ? ?/sec 1.02 421.2±47.27µs ? ?/sec
get_or_spawn/individual 1.03 948.7±77.17µs ? ?/sec 1.00 922.5±69.75µs ? ?/sec 1.02 937.8±78.09µs ? ?/sec
heavy_compute/base 1.01 361.4±3.93µs ? ?/sec 1.00 356.5±2.61µs ? ?/sec 1.00 357.9±3.21µs ? ?/sec
insert_commands/insert 1.03 800.2±31.11µs ? ?/sec 1.00 775.3±34.02µs ? ?/sec 1.07 830.3±99.12µs ? ?/sec
insert_commands/insert_batch 1.00 403.5±40.45µs ? ?/sec 1.00 401.7±30.43µs ? ?/sec 1.02 410.4±38.46µs ? ?/sec
query_get/50000_entities_sparse 1.00 643.1±45.04µs ? ?/sec 1.01 647.3±56.03µs ? ?/sec 1.99 1280.2±34.58µs ? ?/sec
query_get/50000_entities_table 1.03 577.9±10.12µs ? ?/sec 1.00 558.9±14.07µs ? ?/sec 1.32 737.7±12.53µs ? ?/sec
query_get_component/50000_entities_sparse 1.02 1225.8±95.93µs ? ?/sec 1.00 1205.9±41.28µs ? ?/sec 1.00 1209.1±65.14µs ? ?/sec
query_get_component/50000_entities_table 1.04 1244.6±112.45µs ? ?/sec 1.01 1205.1±21.94µs ? ?/sec 1.00 1192.1±17.73µs ? ?/sec
simple_iter/base 1.00 11.0±0.06µs ? ?/sec 1.03 11.3±0.20µs ? ?/sec 1.25 13.7±0.10µs ? ?/sec
simple_iter/foreach 1.01 10.9±0.06µs ? ?/sec 1.01 10.9±0.11µs ? ?/sec 1.00 10.8±0.10µs ? ?/sec
simple_iter/foreach_wide 1.00 44.2±0.66µs ? ?/sec 1.00 44.1±0.15µs ? ?/sec 1.10 48.4±4.06µs ? ?/sec
simple_iter/sparse 1.00 47.8±0.44µs ? ?/sec 1.00 47.6±0.55µs ? ?/sec 1.15 54.8±0.58µs ? ?/sec
simple_iter/sparse_foreach 1.01 43.7±2.89µs ? ?/sec 1.00 43.0±0.51µs ? ?/sec 1.14 49.1±0.33µs ? ?/sec
simple_iter/sparse_foreach_wide 1.00 241.9±3.14µs ? ?/sec 1.00 241.9±1.55µs ? ?/sec 1.08 262.1±4.61µs ? ?/sec
simple_iter/sparse_wide 1.00 252.6±2.16µs ? ?/sec 1.04 263.6±59.57µs ? ?/sec 1.11 281.6±4.89µs ? ?/sec
simple_iter/system 1.00 11.0±0.13µs ? ?/sec 1.00 11.0±0.14µs ? ?/sec 1.25 13.7±0.15µs ? ?/sec
simple_iter/wide 1.01 55.5±0.22µs ? ?/sec 1.00 54.8±0.34µs ? ?/sec 1.17 64.3±0.52µs ? ?/sec
sized_commands_0_bytes/2000_commands 1.13 5.1±0.08µs ? ?/sec 1.11 5.1±0.04µs ? ?/sec 1.00 4.5±0.03µs ? ?/sec
sized_commands_0_bytes/4000_commands 1.11 10.2±0.09µs ? ?/sec 1.11 10.1±0.25µs ? ?/sec 1.00 9.1±0.06µs ? ?/sec
sized_commands_0_bytes/6000_commands 1.13 15.4±0.20µs ? ?/sec 1.11 15.2±0.05µs ? ?/sec 1.00 13.7±0.11µs ? ?/sec
sized_commands_0_bytes/8000_commands 1.12 20.4±0.28µs ? ?/sec 1.11 20.3±0.08µs ? ?/sec 1.00 18.3±0.22µs ? ?/sec
sized_commands_12_bytes/2000_commands 1.00 7.1±0.06µs ? ?/sec 1.00 7.1±0.03µs ? ?/sec 1.00 7.1±0.04µs ? ?/sec
sized_commands_12_bytes/4000_commands 1.00 14.4±0.07µs ? ?/sec 1.00 14.5±0.06µs ? ?/sec 1.01 14.6±0.23µs ? ?/sec
sized_commands_12_bytes/6000_commands 1.02 22.1±0.33µs ? ?/sec 1.00 21.7±0.09µs ? ?/sec 1.00 21.8±0.31µs ? ?/sec
sized_commands_12_bytes/8000_commands 1.01 28.9±0.40µs ? ?/sec 1.00 28.8±0.11µs ? ?/sec 1.00 28.9±0.44µs ? ?/sec
sized_commands_512_bytes/2000_commands 1.07 108.9±2.50µs ? ?/sec 1.00 101.8±3.16µs ? ?/sec 1.09 110.8±2.61µs ? ?/sec
sized_commands_512_bytes/4000_commands 1.08 224.9±23.23µs ? ?/sec 1.00 208.8±17.81µs ? ?/sec 1.08 226.1±12.77µs ? ?/sec
sized_commands_512_bytes/6000_commands 1.08 344.3±44.37µs ? ?/sec 1.00 318.5±36.79µs ? ?/sec 1.09 347.7±45.95µs ? ?/sec
sized_commands_512_bytes/8000_commands 1.09 471.0±74.12µs ? ?/sec 1.00 431.1±60.04µs ? ?/sec 1.08 466.3±58.84µs ? ?/sec
sparse_fragmented_iter/base 1.08 11.7±1.03ns ? ?/sec 1.00 10.8±0.60ns ? ?/sec 1.07 11.6±0.57ns ? ?/sec
sparse_fragmented_iter/foreach 1.00 9.0±0.32ns ? ?/sec 1.01 9.1±0.61ns ? ?/sec 1.03 9.2±0.24ns ? ?/sec
sparse_fragmented_iter/foreach_wide 1.00 42.1±1.16ns ? ?/sec 1.01 42.5±5.03ns ? ?/sec 1.06 44.5±5.25ns ? ?/sec
sparse_fragmented_iter/wide 1.00 72.8±5.68ns ? ?/sec 1.00 72.7±3.58ns ? ?/sec 1.18 85.9±5.85ns ? ?/sec
spawn_commands/2000_entities 1.09 275.0±38.92µs ? ?/sec 1.00 253.0±17.34µs ? ?/sec 1.02 259.1±22.15µs ? ?/sec
spawn_commands/4000_entities 1.02 518.1±42.46µs ? ?/sec 1.00 507.5±33.03µs ? ?/sec 1.02 518.6±30.19µs ? ?/sec
spawn_commands/6000_entities 1.03 777.7±83.67µs ? ?/sec 1.03 774.4±80.24µs ? ?/sec 1.00 753.6±54.53µs ? ?/sec
spawn_commands/8000_entities 1.02 1003.0±97.98µs ? ?/sec 1.00 983.7±74.44µs ? ?/sec 1.01 996.2±96.49µs ? ?/sec
world_entity/50000_entities 1.00 425.2±2.51µs ? ?/sec 1.00 424.8±0.65µs ? ?/sec 1.01 427.5±0.97µs ? ?/sec
world_get/50000_entities_sparse 1.00 562.1±9.75µs ? ?/sec 1.03 581.6±19.74µs ? ?/sec 1.03 578.7±5.37µs ? ?/sec
world_get/50000_entities_table 1.00 911.2±15.13µs ? ?/sec 1.00 912.7±7.54µs ? ?/sec 1.03 943.0±8.83µs ? ?/sec
world_query_for_each/50000_entities_sparse 1.00 84.5±0.66µs ? ?/sec 1.01 84.6±1.09µs ? ?/sec 1.00 84.1±2.73µs ? ?/sec
world_query_for_each/50000_entities_table 1.00 27.2±0.14µs ? ?/sec 1.00 27.2±0.13µs ? ?/sec 1.00 27.2±0.13µs ? ?/sec
world_query_get/50000_entities_sparse 1.00 464.8±8.55µs ? ?/sec 1.00 464.5±3.51µs ? ?/sec 1.04 483.1±4.63µs ? ?/sec
world_query_get/50000_entities_sparse_wide 1.00 1420.2±53.48µs ? ?/sec 1.01 1437.3±37.08µs ? ?/sec 1.07 1518.6±28.91µs ? ?/sec
world_query_get/50000_entities_table 1.00 402.2±3.20µs ? ?/sec 1.00 403.1±9.38µs ? ?/sec 1.09 437.5±4.37µs ? ?/sec
world_query_get/50000_entities_table_wide 1.01 811.5±6.02µs ? ?/sec 1.01 812.6±5.39µs ? ?/sec 1.00 804.2±8.17µs ? ?/sec
world_query_iter/50000_entities_sparse 1.02 97.9±8.14µs ? ?/sec 1.00 96.1±1.48µs ? ?/sec 1.00 96.6±1.41µs ? ?/sec
world_query_iter/50000_entities_table 1.00 27.2±0.45µs ? ?/sec 1.00 27.2±0.26µs ? ?/sec 1.01 27.5±0.42µs ? ?/sec
One more thing I noticed is that this PR, regardless of which version (reference or value), substantially closes the gap between Query::iter and Query::for_each on the fragmented iteration cases. This does seem to suggest that there are just a few extra blockers before the two are effectively equivalent, which would make #4060 viable to merge.
Seeing how eagerly fetching entities/rows speeds up iteration, I noticed the same pattern when using set_archetype to fetch the table: we're passing in a &Tables reference, and each Fetch individually requeries it for the same table. I changed set_archetype to take a Table instead of Tables, and predictably, both Query::get and fragmented iteration saw another jump in perf.
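To illustrate the idea, here is a rough sketch of hoisting the table lookup out of each fetch. All types and signatures are simplified, hypothetical stand-ins (`Table`, `Tables`, `Archetype`, `FetchBefore`, `FetchAfter`); the real `set_archetype` takes more arguments and works on Bevy's own storage types.

```rust
// Hypothetical, simplified stand-ins; these are not Bevy's real types.
struct Table {
    values: Vec<u32>,
}

struct Tables {
    tables: Vec<Table>,
}

struct Archetype {
    table_id: usize,
}

// Before: every fetch receives the whole collection and repeats the lookup.
struct FetchBefore<'w> {
    column: &'w [u32],
}

impl<'w> FetchBefore<'w> {
    fn set_archetype(&mut self, archetype: &Archetype, tables: &'w Tables) {
        self.column = tables.tables[archetype.table_id].values.as_slice();
    }
}

// After: the caller resolves the table once and hands it to each fetch.
struct FetchAfter<'w> {
    column: &'w [u32],
}

impl<'w> FetchAfter<'w> {
    fn set_archetype(&mut self, _archetype: &Archetype, table: &'w Table) {
        self.column = table.values.as_slice();
    }
}

// Two fetches over the same archetype: two table lookups.
fn sum_before(archetype: &Archetype, tables: &Tables) -> u32 {
    let mut a = FetchBefore { column: &[] };
    let mut b = FetchBefore { column: &[] };
    a.set_archetype(archetype, tables);
    b.set_archetype(archetype, tables);
    a.column.iter().sum::<u32>() + b.column.iter().sum::<u32>()
}

// Same two fetches, but the lookup happens once in the outer code.
fn sum_after(archetype: &Archetype, tables: &Tables) -> u32 {
    let table = &tables.tables[archetype.table_id];
    let mut a = FetchAfter { column: &[] };
    let mut b = FetchAfter { column: &[] };
    a.set_archetype(archetype, table);
    b.set_archetype(archetype, table);
    a.column.iter().sum::<u32>() + b.column.iter().sum::<u32>()
}
```

The results are identical; the second form just does the indexing once per archetype instead of once per fetch, which matters more as the number of fetches in a query grows.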
Microbenchmark Results
group cleanup-fetch main more-cleanup
----- ------------- ---- ------------
add_remove_component/sparse_set 1.02 1322.0±76.98µs ? ?/sec 1.00 1301.0±82.37µs ? ?/sec 1.01 1311.7±76.75µs ? ?/sec
add_remove_component/table 1.03 1682.7±49.97µs ? ?/sec 1.00 1629.8±35.87µs ? ?/sec 1.04 1688.4±48.99µs ? ?/sec
add_remove_component_big/sparse_set 1.00 1435.7±299.23µs ? ?/sec 1.03 1476.2±296.28µs ? ?/sec 1.03 1472.5±317.71µs ? ?/sec
add_remove_component_big/table 1.01 2.9±0.05ms ? ?/sec 1.00 2.9±0.23ms ? ?/sec 1.00 2.9±0.19ms ? ?/sec
added_archetypes/archetype_count/100 1.00 186.2±10.35µs ? ?/sec 1.00 185.6±9.29µs ? ?/sec 1.00 185.7±9.51µs ? ?/sec
added_archetypes/archetype_count/1000 1.00 688.2±20.12µs ? ?/sec 1.06 728.0±50.14µs ? ?/sec 1.01 692.8±10.88µs ? ?/sec
added_archetypes/archetype_count/10000 1.01 14.2±1.39ms ? ?/sec 1.04 14.6±2.00ms ? ?/sec 1.00 14.0±1.53ms ? ?/sec
added_archetypes/archetype_count/200 1.03 234.2±10.67µs ? ?/sec 1.00 226.6±12.16µs ? ?/sec 1.01 228.8±11.26µs ? ?/sec
added_archetypes/archetype_count/2000 1.01 1355.2±33.30µs ? ?/sec 1.09 1465.4±125.18µs ? ?/sec 1.00 1344.3±31.27µs ? ?/sec
added_archetypes/archetype_count/500 1.01 403.4±48.12µs ? ?/sec 1.03 413.3±29.36µs ? ?/sec 1.00 400.8±26.20µs ? ?/sec
added_archetypes/archetype_count/5000 1.09 4.8±0.56ms ? ?/sec 1.25 5.5±0.83ms ? ?/sec 1.00 4.4±0.50ms ? ?/sec
busy_systems/01x_entities_03_systems 1.16 40.4±2.18µs ? ?/sec 1.00 34.9±1.19µs ? ?/sec 1.01 35.2±3.01µs ? ?/sec
busy_systems/01x_entities_06_systems 1.22 78.1±2.53µs ? ?/sec 1.04 66.6±2.78µs ? ?/sec 1.00 64.1±2.56µs ? ?/sec
busy_systems/01x_entities_09_systems 1.26 118.6±4.26µs ? ?/sec 1.00 94.0±2.86µs ? ?/sec 1.04 97.8±2.52µs ? ?/sec
busy_systems/01x_entities_12_systems 1.20 147.7±4.45µs ? ?/sec 1.00 123.1±5.81µs ? ?/sec 1.02 125.9±5.54µs ? ?/sec
busy_systems/01x_entities_15_systems 1.16 178.7±4.28µs ? ?/sec 1.00 153.6±4.88µs ? ?/sec 1.03 158.3±7.83µs ? ?/sec
busy_systems/02x_entities_03_systems 1.26 75.5±4.10µs ? ?/sec 1.00 59.9±2.85µs ? ?/sec 1.02 61.3±3.69µs ? ?/sec
busy_systems/02x_entities_06_systems 1.17 137.2±3.85µs ? ?/sec 1.02 119.0±7.39µs ? ?/sec 1.00 116.8±5.85µs ? ?/sec
busy_systems/02x_entities_09_systems 1.27 217.6±6.59µs ? ?/sec 1.00 171.9±4.08µs ? ?/sec 1.03 176.5±4.26µs ? ?/sec
busy_systems/02x_entities_12_systems 1.17 270.3±6.97µs ? ?/sec 1.00 230.6±8.73µs ? ?/sec 1.00 230.4±7.77µs ? ?/sec
busy_systems/02x_entities_15_systems 1.20 334.5±11.35µs ? ?/sec 1.00 278.8±6.34µs ? ?/sec 1.03 287.7±7.33µs ? ?/sec
busy_systems/03x_entities_03_systems 1.13 102.5±4.59µs ? ?/sec 1.10 100.3±4.70µs ? ?/sec 1.00 90.9±5.51µs ? ?/sec
busy_systems/03x_entities_06_systems 1.18 201.1±8.64µs ? ?/sec 1.02 174.8±5.38µs ? ?/sec 1.00 170.7±6.05µs ? ?/sec
busy_systems/03x_entities_09_systems 1.32 323.3±13.82µs ? ?/sec 1.01 247.9±6.90µs ? ?/sec 1.00 245.4±9.12µs ? ?/sec
busy_systems/03x_entities_12_systems 1.21 389.5±14.12µs ? ?/sec 1.00 320.9±9.25µs ? ?/sec 1.01 324.5±11.89µs ? ?/sec
busy_systems/03x_entities_15_systems 1.18 482.4±12.17µs ? ?/sec 1.00 407.1±8.99µs ? ?/sec 1.01 410.3±16.19µs ? ?/sec
busy_systems/04x_entities_03_systems 1.25 138.9±7.21µs ? ?/sec 1.00 111.1±4.86µs ? ?/sec 1.02 112.8±8.70µs ? ?/sec
busy_systems/04x_entities_06_systems 1.22 273.1±12.83µs ? ?/sec 1.00 223.3±6.85µs ? ?/sec 1.01 226.2±10.57µs ? ?/sec
busy_systems/04x_entities_09_systems 1.28 416.1±12.86µs ? ?/sec 1.03 336.6±12.67µs ? ?/sec 1.00 326.1±21.51µs ? ?/sec
busy_systems/04x_entities_12_systems 1.24 513.7±16.52µs ? ?/sec 1.02 421.3±12.56µs ? ?/sec 1.00 414.8±10.25µs ? ?/sec
busy_systems/04x_entities_15_systems 1.17 615.1±26.75µs ? ?/sec 1.01 532.2±15.00µs ? ?/sec 1.00 525.6±15.36µs ? ?/sec
busy_systems/05x_entities_03_systems 1.32 176.5±10.36µs ? ?/sec 1.00 133.6±4.76µs ? ?/sec 1.04 138.4±5.93µs ? ?/sec
busy_systems/05x_entities_06_systems 1.40 366.8±14.17µs ? ?/sec 1.00 262.7±8.66µs ? ?/sec 1.05 275.5±14.39µs ? ?/sec
busy_systems/05x_entities_09_systems 1.27 514.7±20.64µs ? ?/sec 1.01 410.9±12.45µs ? ?/sec 1.00 406.5±12.22µs ? ?/sec
busy_systems/05x_entities_12_systems 1.20 659.0±22.17µs ? ?/sec 1.00 547.0±13.86µs ? ?/sec 1.01 550.9±17.99µs ? ?/sec
busy_systems/05x_entities_15_systems 1.27 838.4±30.31µs ? ?/sec 1.00 659.7±18.72µs ? ?/sec 1.05 693.9±49.48µs ? ?/sec
contrived/01x_entities_03_systems 1.21 27.2±0.52µs ? ?/sec 1.00 22.5±1.62µs ? ?/sec 1.00 22.5±1.44µs ? ?/sec
contrived/01x_entities_06_systems 1.25 52.8±1.32µs ? ?/sec 1.00 42.4±2.02µs ? ?/sec 1.03 43.7±1.75µs ? ?/sec
contrived/01x_entities_09_systems 1.21 75.1±2.46µs ? ?/sec 1.00 62.0±2.42µs ? ?/sec 1.01 62.4±2.97µs ? ?/sec
contrived/01x_entities_12_systems 1.21 98.9±1.88µs ? ?/sec 1.00 81.6±4.21µs ? ?/sec 1.01 82.5±5.39µs ? ?/sec
contrived/01x_entities_15_systems 1.25 125.8±3.28µs ? ?/sec 1.00 100.6±6.48µs ? ?/sec 1.05 106.0±6.78µs ? ?/sec
contrived/02x_entities_03_systems 1.37 45.4±1.72µs ? ?/sec 1.00 33.1±2.56µs ? ?/sec 1.09 36.1±1.85µs ? ?/sec
contrived/02x_entities_06_systems 1.21 79.0±1.47µs ? ?/sec 1.00 65.2±4.73µs ? ?/sec 1.05 68.2±2.51µs ? ?/sec
contrived/02x_entities_09_systems 1.22 116.3±2.60µs ? ?/sec 1.00 95.1±4.00µs ? ?/sec 1.02 97.4±4.48µs ? ?/sec
contrived/02x_entities_12_systems 1.22 152.8±4.98µs ? ?/sec 1.00 125.2±3.68µs ? ?/sec 1.02 128.2±3.54µs ? ?/sec
contrived/02x_entities_15_systems 1.25 186.0±3.24µs ? ?/sec 1.01 150.7±7.38µs ? ?/sec 1.00 149.1±6.81µs ? ?/sec
contrived/03x_entities_03_systems 1.31 54.8±1.61µs ? ?/sec 1.00 41.9±2.10µs ? ?/sec 1.05 44.2±2.75µs ? ?/sec
contrived/03x_entities_06_systems 1.28 105.3±2.34µs ? ?/sec 1.00 82.0±2.44µs ? ?/sec 1.00 82.3±3.33µs ? ?/sec
contrived/03x_entities_09_systems 1.25 157.9±3.23µs ? ?/sec 1.00 126.7±5.21µs ? ?/sec 1.01 128.4±6.95µs ? ?/sec
contrived/03x_entities_12_systems 1.17 196.6±5.02µs ? ?/sec 1.00 167.6±5.79µs ? ?/sec 1.02 171.2±6.28µs ? ?/sec
contrived/03x_entities_15_systems 1.13 238.8±5.80µs ? ?/sec 1.00 212.2±10.71µs ? ?/sec 1.03 219.6±6.77µs ? ?/sec
contrived/04x_entities_03_systems 1.39 73.1±2.13µs ? ?/sec 1.00 52.7±2.80µs ? ?/sec 1.24 65.3±3.78µs ? ?/sec
contrived/04x_entities_06_systems 1.31 133.8±3.18µs ? ?/sec 1.10 111.8±8.56µs ? ?/sec 1.00 101.9±3.49µs ? ?/sec
contrived/04x_entities_09_systems 1.23 189.2±5.33µs ? ?/sec 1.00 154.0±4.34µs ? ?/sec 1.01 154.8±5.08µs ? ?/sec
contrived/04x_entities_12_systems 1.18 241.6±5.61µs ? ?/sec 1.01 206.2±8.28µs ? ?/sec 1.00 204.4±6.92µs ? ?/sec
contrived/04x_entities_15_systems 1.12 295.4±7.79µs ? ?/sec 1.00 262.9±9.39µs ? ?/sec 1.04 272.1±9.51µs ? ?/sec
contrived/05x_entities_03_systems 1.38 84.0±2.42µs ? ?/sec 1.00 60.9±1.98µs ? ?/sec 1.18 71.8±4.78µs ? ?/sec
contrived/05x_entities_06_systems 1.32 159.1±3.13µs ? ?/sec 1.00 120.1±2.42µs ? ?/sec 1.11 132.9±6.89µs ? ?/sec
contrived/05x_entities_09_systems 1.32 241.3±5.63µs ? ?/sec 1.00 182.5±5.08µs ? ?/sec 1.12 205.1±11.71µs ? ?/sec
contrived/05x_entities_12_systems 1.20 299.3±12.07µs ? ?/sec 1.00 248.8±10.79µs ? ?/sec 1.06 263.8±6.77µs ? ?/sec
contrived/05x_entities_15_systems 1.18 362.3±8.39µs ? ?/sec 1.00 306.5±16.78µs ? ?/sec 1.06 324.8±13.37µs ? ?/sec
fragmented_iter/base 1.00 352.2±10.92ns ? ?/sec 1.34 470.4±31.26ns ? ?/sec 1.00 350.6±12.09ns ? ?/sec
fragmented_iter/foreach 1.08 245.8±26.37ns ? ?/sec 1.07 242.8±23.09ns ? ?/sec 1.00 226.7±18.01ns ? ?/sec
fragmented_iter/foreach_wide 1.01 4.0±0.23µs ? ?/sec 1.03 4.1±0.54µs ? ?/sec 1.00 4.0±0.23µs ? ?/sec
fragmented_iter/wide 1.00 4.5±0.19µs ? ?/sec 1.34 6.1±0.21µs ? ?/sec 1.18 5.3±0.15µs ? ?/sec
query_get/50000_entities_sparse 1.00 643.1±45.04µs ? ?/sec 1.99 1280.2±34.58µs ? ?/sec 1.04 670.6±11.62µs ? ?/sec
query_get/50000_entities_table 1.26 577.9±10.12µs ? ?/sec 1.60 737.7±12.53µs ? ?/sec 1.00 460.1±34.59µs ? ?/sec
query_get_component/50000_entities_sparse 1.06 1225.8±95.93µs ? ?/sec 1.04 1209.1±65.14µs ? ?/sec 1.00 1159.8±29.57µs ? ?/sec
query_get_component/50000_entities_table 1.04 1244.6±112.45µs ? ?/sec 1.00 1192.1±17.73µs ? ?/sec 1.00 1195.1±34.28µs ? ?/sec
simple_iter/base 1.00 11.0±0.06µs ? ?/sec 1.25 13.7±0.10µs ? ?/sec 1.00 10.9±0.07µs ? ?/sec
simple_iter/foreach 1.01 10.9±0.06µs ? ?/sec 1.00 10.8±0.10µs ? ?/sec 1.01 10.9±0.04µs ? ?/sec
simple_iter/foreach_wide 1.00 44.2±0.66µs ? ?/sec 1.10 48.4±4.06µs ? ?/sec 1.00 44.1±0.38µs ? ?/sec
simple_iter/sparse 1.00 47.8±0.44µs ? ?/sec 1.15 54.8±0.58µs ? ?/sec 1.00 47.6±1.33µs ? ?/sec
simple_iter/sparse_foreach 1.03 43.7±2.89µs ? ?/sec 1.16 49.1±0.33µs ? ?/sec 1.00 42.3±0.45µs ? ?/sec
simple_iter/sparse_foreach_wide 1.06 241.9±3.14µs ? ?/sec 1.15 262.1±4.61µs ? ?/sec 1.00 228.9±2.71µs ? ?/sec
simple_iter/sparse_wide 1.04 252.6±2.16µs ? ?/sec 1.16 281.6±4.89µs ? ?/sec 1.00 243.6±2.95µs ? ?/sec
simple_iter/system 1.00 11.0±0.13µs ? ?/sec 1.25 13.7±0.15µs ? ?/sec 1.00 11.0±0.06µs ? ?/sec
simple_iter/wide 1.09 55.5±0.22µs ? ?/sec 1.26 64.3±0.52µs ? ?/sec 1.00 51.0±1.06µs ? ?/sec
sized_commands_0_bytes/2000_commands 1.13 5.1±0.08µs ? ?/sec 1.00 4.5±0.03µs ? ?/sec 1.24 5.6±0.09µs ? ?/sec
sized_commands_0_bytes/4000_commands 1.11 10.2±0.09µs ? ?/sec 1.00 9.1±0.06µs ? ?/sec 1.23 11.2±0.07µs ? ?/sec
sized_commands_0_bytes/6000_commands 1.13 15.4±0.20µs ? ?/sec 1.00 13.7±0.11µs ? ?/sec 1.24 16.9±0.63µs ? ?/sec
sized_commands_0_bytes/8000_commands 1.12 20.4±0.28µs ? ?/sec 1.00 18.3±0.22µs ? ?/sec 1.23 22.5±0.10µs ? ?/sec
sized_commands_12_bytes/2000_commands 1.00 7.1±0.06µs ? ?/sec 1.00 7.1±0.04µs ? ?/sec 1.01 7.2±0.02µs ? ?/sec
sized_commands_12_bytes/4000_commands 1.01 14.4±0.07µs ? ?/sec 1.02 14.6±0.23µs ? ?/sec 1.00 14.3±0.09µs ? ?/sec
sized_commands_12_bytes/6000_commands 1.01 22.1±0.33µs ? ?/sec 1.00 21.8±0.31µs ? ?/sec 1.02 22.2±0.27µs ? ?/sec
sized_commands_12_bytes/8000_commands 1.00 28.9±0.40µs ? ?/sec 1.00 28.9±0.44µs ? ?/sec 1.00 28.9±0.16µs ? ?/sec
sized_commands_512_bytes/2000_commands 1.06 108.9±2.50µs ? ?/sec 1.08 110.8±2.61µs ? ?/sec 1.00 102.7±2.75µs ? ?/sec
sized_commands_512_bytes/4000_commands 1.07 224.9±23.23µs ? ?/sec 1.07 226.1±12.77µs ? ?/sec 1.00 210.4±15.70µs ? ?/sec
sized_commands_512_bytes/6000_commands 1.07 344.3±44.37µs ? ?/sec 1.08 347.7±45.95µs ? ?/sec 1.00 322.5±36.53µs ? ?/sec
sized_commands_512_bytes/8000_commands 1.08 471.0±74.12µs ? ?/sec 1.07 466.3±58.84µs ? ?/sec 1.00 436.6±70.12µs ? ?/sec
sparse_fragmented_iter/base 1.16 11.7±1.03ns ? ?/sec 1.14 11.6±0.57ns ? ?/sec 1.00 10.2±0.52ns ? ?/sec
sparse_fragmented_iter/foreach 1.03 9.0±0.32ns ? ?/sec 1.05 9.2±0.24ns ? ?/sec 1.00 8.7±0.32ns ? ?/sec
sparse_fragmented_iter/foreach_wide 1.00 42.1±1.16ns ? ?/sec 1.06 44.5±5.25ns ? ?/sec 1.06 44.7±17.05ns ? ?/sec
sparse_fragmented_iter/wide 1.12 72.8±5.68ns ? ?/sec 1.32 85.9±5.85ns ? ?/sec 1.00 65.1±4.36ns ? ?/sec
world_entity/50000_entities 1.00 425.2±2.51µs ? ?/sec 1.01 427.5±0.97µs ? ?/sec 1.00 424.3±0.68µs ? ?/sec
world_get/50000_entities_sparse 1.02 562.1±9.75µs ? ?/sec 1.05 578.7±5.37µs ? ?/sec 1.00 550.8±18.57µs ? ?/sec
world_get/50000_entities_table 1.05 911.2±15.13µs ? ?/sec 1.08 943.0±8.83µs ? ?/sec 1.00 869.4±2.27µs ? ?/sec
world_query_for_each/50000_entities_sparse 1.00 84.5±0.66µs ? ?/sec 1.00 84.1±2.73µs ? ?/sec 1.14 96.1±2.58µs ? ?/sec
world_query_for_each/50000_entities_table 1.00 27.2±0.14µs ? ?/sec 1.00 27.2±0.13µs ? ?/sec 1.00 27.2±0.12µs ? ?/sec
world_query_get/50000_entities_sparse 1.00 464.8±8.55µs ? ?/sec 1.04 483.1±4.63µs ? ?/sec 1.01 468.4±23.38µs ? ?/sec
world_query_get/50000_entities_sparse_wide 1.06 1420.2±53.48µs ? ?/sec 1.14 1518.6±28.91µs ? ?/sec 1.00 1336.0±13.60µs ? ?/sec
world_query_get/50000_entities_table 1.54 402.2±3.20µs ? ?/sec 1.68 437.5±4.37µs ? ?/sec 1.00 260.9±7.64µs ? ?/sec
world_query_get/50000_entities_table_wide 1.09 811.5±6.02µs ? ?/sec 1.08 804.2±8.17µs ? ?/sec 1.00 745.8±33.71µs ? ?/sec
world_query_iter/50000_entities_sparse 1.02 97.9±8.14µs ? ?/sec 1.01 96.6±1.41µs ? ?/sec 1.00 95.8±0.83µs ? ?/sec
world_query_iter/50000_entities_table 1.00 27.2±0.45µs ? ?/sec 1.01 27.5±0.42µs ? ?/sec 1.00 27.3±0.81µs ? ?/sec
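A note on reading these tables: the leading number in each group's column appears to be that group's mean time divided by the fastest mean in the same row, so 1.00 marks the fastest group. That interpretation is an assumption, though it is consistent with the rows above. The helper below (`relative_ratios` is a hypothetical name) is a small sketch that reproduces the ratios from the means of one row.

```rust
/// Divide each mean by the fastest mean in the row and format to two decimals.
/// Assumes a non-empty slice of positive, finite means in the same unit.
fn relative_ratios(means: &[f64]) -> Vec<String> {
    let fastest = means.iter().copied().fold(f64::INFINITY, f64::min);
    means
        .iter()
        .map(|mean| format!("{:.2}", mean / fastest))
        .collect()
}
```

Feeding in the `world_query_get/50000_entities_table` means from the last table (402.2, 437.5, 260.9, all in µs) gives 1.54, 1.68, and 1.00, matching the printed row.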
I've given these changes a pretty thorough review (and I'm on board). I just merged #5205, so if you adapt to the flatter format and resolve the conflicts, I'll merge this in short order.
Another sanity check microbenchmark to ensure nothing since the rebase has affected the perf changes seen earlier:
group fetch-cleanup main
----- ------------- ----
add_remove/sparse_set 1.08 1268.9±87.23µs ? ?/sec 1.00 1176.5±66.08µs ? ?/sec
add_remove/table 1.06 1579.0±11.37µs ? ?/sec 1.00 1488.3±9.01µs ? ?/sec
add_remove_big/sparse_set 1.00 1454.4±304.11µs ? ?/sec 1.00 1454.3±254.56µs ? ?/sec
add_remove_big/table 1.00 2.8±0.05ms ? ?/sec 1.02 2.8±0.03ms ? ?/sec
added_archetypes/archetype_count/100 1.00 133.0±5.77µs ? ?/sec 1.08 144.1±6.04µs ? ?/sec
added_archetypes/archetype_count/1000 1.00 685.6±30.33µs ? ?/sec 1.07 737.0±54.32µs ? ?/sec
added_archetypes/archetype_count/10000 1.00 11.1±0.99ms ? ?/sec 1.09 12.0±1.47ms ? ?/sec
added_archetypes/archetype_count/200 1.00 202.5±8.23µs ? ?/sec 1.04 210.4±8.46µs ? ?/sec
added_archetypes/archetype_count/2000 1.00 1301.8±13.39µs ? ?/sec 1.04 1353.8±38.92µs ? ?/sec
added_archetypes/archetype_count/500 1.00 414.2±19.10µs ? ?/sec 1.08 446.7±46.44µs ? ?/sec
added_archetypes/archetype_count/5000 1.00 3.6±0.23ms ? ?/sec 1.03 3.7±0.29ms ? ?/sec
busy_systems/01x_entities_03_systems 1.00 32.3±1.66µs ? ?/sec 1.03 33.3±1.11µs ? ?/sec
busy_systems/01x_entities_06_systems 1.00 63.9±1.63µs ? ?/sec 1.20 76.9±5.06µs ? ?/sec
busy_systems/01x_entities_09_systems 1.00 94.7±2.91µs ? ?/sec 1.02 96.8±2.30µs ? ?/sec
busy_systems/01x_entities_12_systems 1.00 128.1±6.51µs ? ?/sec 1.00 128.2±4.45µs ? ?/sec
busy_systems/01x_entities_15_systems 1.00 165.3±11.14µs ? ?/sec 1.02 168.6±5.11µs ? ?/sec
busy_systems/02x_entities_03_systems 1.01 62.6±2.75µs ? ?/sec 1.00 61.9±3.22µs ? ?/sec
busy_systems/02x_entities_06_systems 1.00 110.7±2.89µs ? ?/sec 1.08 119.2±5.08µs ? ?/sec
busy_systems/02x_entities_09_systems 1.00 172.9±8.28µs ? ?/sec 1.13 196.1±13.30µs ? ?/sec
busy_systems/02x_entities_12_systems 1.00 233.6±13.24µs ? ?/sec 1.04 241.8±6.45µs ? ?/sec
busy_systems/02x_entities_15_systems 1.00 292.1±7.97µs ? ?/sec 1.14 333.2±28.27µs ? ?/sec
busy_systems/03x_entities_03_systems 1.00 85.3±6.08µs ? ?/sec 1.10 93.5±5.50µs ? ?/sec
busy_systems/03x_entities_06_systems 1.08 188.8±20.73µs ? ?/sec 1.00 175.1±8.34µs ? ?/sec
busy_systems/03x_entities_09_systems 1.00 244.2±8.30µs ? ?/sec 1.04 255.1±6.97µs ? ?/sec
busy_systems/03x_entities_12_systems 1.00 316.7±6.50µs ? ?/sec 1.14 361.2±23.45µs ? ?/sec
busy_systems/03x_entities_15_systems 1.00 390.8±9.45µs ? ?/sec 1.09 427.5±24.81µs ? ?/sec
busy_systems/04x_entities_03_systems 1.00 103.6±2.70µs ? ?/sec 1.07 111.0±6.01µs ? ?/sec
busy_systems/04x_entities_06_systems 1.05 217.9±8.40µs ? ?/sec 1.00 206.7±3.61µs ? ?/sec
busy_systems/04x_entities_09_systems 1.00 322.9±8.09µs ? ?/sec 1.02 328.4±13.50µs ? ?/sec
busy_systems/04x_entities_12_systems 1.00 421.1±10.97µs ? ?/sec 1.04 437.5±16.77µs ? ?/sec
busy_systems/04x_entities_15_systems 1.00 538.2±26.41µs ? ?/sec 1.03 554.1±17.64µs ? ?/sec
busy_systems/05x_entities_03_systems 1.00 152.0±5.04µs ? ?/sec 1.01 153.1±13.03µs ? ?/sec
busy_systems/05x_entities_06_systems 1.06 313.4±13.09µs ? ?/sec 1.00 296.0±14.11µs ? ?/sec
busy_systems/05x_entities_09_systems 1.16 501.8±26.77µs ? ?/sec 1.00 430.8±15.88µs ? ?/sec
busy_systems/05x_entities_12_systems 1.15 638.7±34.35µs ? ?/sec 1.00 554.3±14.60µs ? ?/sec
busy_systems/05x_entities_15_systems 1.08 751.6±19.56µs ? ?/sec 1.00 696.6±30.10µs ? ?/sec
contrived/01x_entities_03_systems 1.12 20.9±0.76µs ? ?/sec 1.00 18.6±1.04µs ? ?/sec
contrived/01x_entities_06_systems 1.00 36.9±1.71µs ? ?/sec 1.12 41.1±3.16µs ? ?/sec
contrived/01x_entities_09_systems 1.00 57.7±2.12µs ? ?/sec 1.07 61.8±4.89µs ? ?/sec
contrived/01x_entities_12_systems 1.00 77.7±3.41µs ? ?/sec 1.00 77.9±4.83µs ? ?/sec
contrived/01x_entities_15_systems 1.03 95.8±4.90µs ? ?/sec 1.00 93.1±6.13µs ? ?/sec
contrived/02x_entities_03_systems 1.01 37.6±2.77µs ? ?/sec 1.00 37.4±3.59µs ? ?/sec
contrived/02x_entities_06_systems 1.01 65.4±3.02µs ? ?/sec 1.00 64.9±4.52µs ? ?/sec
contrived/02x_entities_09_systems 1.00 90.4±2.87µs ? ?/sec 1.05 94.8±5.57µs ? ?/sec
contrived/02x_entities_12_systems 1.07 120.9±7.75µs ? ?/sec 1.00 112.8±1.95µs ? ?/sec
contrived/02x_entities_15_systems 1.00 149.8±9.17µs ? ?/sec 1.03 153.8±9.61µs ? ?/sec
contrived/03x_entities_03_systems 1.00 41.6±2.99µs ? ?/sec 1.12 46.6±4.03µs ? ?/sec
contrived/03x_entities_06_systems 1.06 93.9±8.80µs ? ?/sec 1.00 88.5±5.52µs ? ?/sec
contrived/03x_entities_09_systems 1.00 119.9±5.72µs ? ?/sec 1.06 127.5±5.72µs ? ?/sec
contrived/03x_entities_12_systems 1.02 175.7±9.57µs ? ?/sec 1.00 172.0±8.75µs ? ?/sec
contrived/03x_entities_15_systems 1.00 194.6±7.28µs ? ?/sec 1.10 213.2±11.33µs ? ?/sec
contrived/04x_entities_03_systems 1.00 49.3±1.82µs ? ?/sec 1.11 54.8±3.03µs ? ?/sec
contrived/04x_entities_06_systems 1.00 106.0±9.04µs ? ?/sec 1.02 107.8±5.70µs ? ?/sec
contrived/04x_entities_09_systems 1.00 154.2±9.32µs ? ?/sec 1.06 162.9±10.30µs ? ?/sec
contrived/04x_entities_12_systems 1.00 201.2±9.26µs ? ?/sec 1.05 212.1±10.46µs ? ?/sec
contrived/04x_entities_15_systems 1.00 249.2±9.01µs ? ?/sec 1.07 266.0±13.07µs ? ?/sec
contrived/05x_entities_03_systems 1.00 59.9±1.77µs ? ?/sec 1.11 66.8±5.35µs ? ?/sec
contrived/05x_entities_06_systems 1.00 130.6±11.41µs ? ?/sec 1.08 141.2±12.92µs ? ?/sec
contrived/05x_entities_09_systems 1.00 194.0±14.56µs ? ?/sec 1.08 208.7±19.26µs ? ?/sec
contrived/05x_entities_12_systems 1.00 252.7±16.08µs ? ?/sec 1.01 256.3±9.52µs ? ?/sec
contrived/05x_entities_15_systems 1.00 306.5±14.65µs ? ?/sec 1.08 330.4±16.47µs ? ?/sec
get_or_spawn/batched 1.00 411.2±20.19µs ? ?/sec 1.01 415.6±19.61µs ? ?/sec
get_or_spawn/individual 1.01 925.7±72.21µs ? ?/sec 1.00 915.3±89.30µs ? ?/sec
heavy_compute/base 1.03 360.7±3.97µs ? ?/sec 1.00 351.3±1.89µs ? ?/sec
insert_commands/insert 1.00 802.9±34.72µs ? ?/sec 1.04 832.4±76.71µs ? ?/sec
insert_commands/insert_batch 1.02 416.6±46.70µs ? ?/sec 1.00 407.8±18.01µs ? ?/sec
insert_simple/base 1.00 554.1±1.94µs ? ?/sec 1.03 570.7±3.33µs ? ?/sec
insert_simple/unbatched 1.00 1207.9±31.83µs ? ?/sec 1.03 1242.6±16.31µs ? ?/sec
iter_fragmented/base 1.00 344.4±5.64ns ? ?/sec 1.39 477.6±25.28ns ? ?/sec
iter_fragmented/foreach 1.01 246.2±26.31ns ? ?/sec 1.00 243.4±23.85ns ? ?/sec
iter_fragmented/foreach_wide 1.01 3.9±0.24µs ? ?/sec 1.00 3.9±0.10µs ? ?/sec
iter_fragmented/wide 1.00 4.5±0.22µs ? ?/sec 1.16 5.3±0.15µs ? ?/sec
iter_fragmented_sparse/base 1.00 10.6±0.60ns ? ?/sec 1.09 11.6±0.98ns ? ?/sec
iter_fragmented_sparse/foreach 1.00 9.0±0.23ns ? ?/sec 1.15 10.3±0.68ns ? ?/sec
iter_fragmented_sparse/foreach_wide 1.00 43.1±7.11ns ? ?/sec 1.04 45.1±10.53ns ? ?/sec
iter_fragmented_sparse/wide 1.00 55.3±16.08ns ? ?/sec 1.21 66.9±0.51ns ? ?/sec
iter_simple/base 1.00 11.0±0.05µs ? ?/sec 1.25 13.7±0.11µs ? ?/sec
iter_simple/foreach 1.01 10.9±0.03µs ? ?/sec 1.00 10.8±0.04µs ? ?/sec
iter_simple/foreach_sparse_set 1.00 42.4±0.24µs ? ?/sec 1.13 47.9±0.21µs ? ?/sec
iter_simple/foreach_wide 1.00 45.7±1.21µs ? ?/sec 1.09 50.0±2.51µs ? ?/sec
iter_simple/foreach_wide_sparse_set 1.00 230.9±1.44µs ? ?/sec 1.14 264.0±1.42µs ? ?/sec
iter_simple/sparse_set 1.00 49.4±0.16µs ? ?/sec 1.12 55.1±0.25µs ? ?/sec
iter_simple/system 1.00 11.0±0.02µs ? ?/sec 1.24 13.6±0.04µs ? ?/sec
iter_simple/wide 1.00 59.8±0.67µs ? ?/sec 1.12 66.9±0.50µs ? ?/sec
iter_simple/wide_sparse_set 1.00 232.9±0.87µs ? ?/sec 1.19 277.0±1.00µs ? ?/sec
query_get/50000_entities_sparse 1.00 717.5±19.04µs ? ?/sec 2.01 1440.5±31.43µs ? ?/sec
query_get/50000_entities_table 1.00 491.9±4.48µs ? ?/sec 1.51 741.8±25.19µs ? ?/sec
query_get_component/50000_entities_sparse 1.00 1163.5±24.56µs ? ?/sec 1.01 1174.2±32.50µs ? ?/sec
query_get_component/50000_entities_table 1.02 1087.6±11.38µs ? ?/sec 1.00 1069.5±24.82µs ? ?/sec
query_get_component_simple/system 1.00 752.6±4.44µs ? ?/sec 1.04 786.1±6.48µs ? ?/sec
query_get_component_simple/unchecked 1.00 974.7±9.31µs ? ?/sec 1.03 1000.7±51.16µs ? ?/sec
run_criteria/no/001_systems 1.00 93.4±0.45ns ? ?/sec 1.02 95.0±0.25ns ? ?/sec
run_criteria/no/006_systems 1.03 175.1±1.06ns ? ?/sec 1.00 169.6±0.82ns ? ?/sec
run_criteria/no/011_systems 1.02 259.5±0.67ns ? ?/sec 1.00 254.8±1.08ns ? ?/sec
run_criteria/no/016_systems 1.02 337.8±0.84ns ? ?/sec 1.00 331.6±1.39ns ? ?/sec
run_criteria/no/021_systems 1.05 429.1±2.19ns ? ?/sec 1.00 410.2±1.76ns ? ?/sec
run_criteria/no/026_systems 1.03 506.0±2.04ns ? ?/sec 1.00 488.9±2.74ns ? ?/sec
run_criteria/no/031_systems 1.06 605.8±3.28ns ? ?/sec 1.00 573.4±2.03ns ? ?/sec
run_criteria/no/036_systems 1.03 708.6±2.56ns ? ?/sec 1.00 685.7±1.56ns ? ?/sec
run_criteria/no/041_systems 1.03 785.3±4.47ns ? ?/sec 1.00 764.5±1.32ns ? ?/sec
run_criteria/no/046_systems 1.07 927.1±10.20ns ? ?/sec 1.00 868.2±2.36ns ? ?/sec
run_criteria/no/051_systems 1.06 1018.8±10.58ns ? ?/sec 1.00 962.7±1.84ns ? ?/sec
run_criteria/no/056_systems 1.10 1141.4±6.27ns ? ?/sec 1.00 1040.1±5.40ns ? ?/sec
run_criteria/no/061_systems 1.10 1259.0±8.06ns ? ?/sec 1.00 1144.3±1.38ns ? ?/sec
run_criteria/no/066_systems 1.08 1337.4±11.57ns ? ?/sec 1.00 1237.5±3.75ns ? ?/sec
run_criteria/no/071_systems 1.00 1412.3±11.45ns ? ?/sec 1.00 1409.2±3.30ns ? ?/sec
run_criteria/no/076_systems 1.04 1490.5±15.69ns ? ?/sec 1.00 1426.7±2.75ns ? ?/sec
run_criteria/no/081_systems 1.03 1572.9±11.18ns ? ?/sec 1.00 1524.3±15.76ns ? ?/sec
run_criteria/no/086_systems 1.02 1645.0±15.40ns ? ?/sec 1.00 1608.3±2.82ns ? ?/sec
run_criteria/no/091_systems 1.05 1766.6±26.71ns ? ?/sec 1.00 1690.2±2.33ns ? ?/sec
run_criteria/no/096_systems 1.03 1824.9±24.77ns ? ?/sec 1.00 1773.4±3.61ns ? ?/sec
run_criteria/no/101_systems 1.06 1958.2±18.92ns ? ?/sec 1.00 1847.7±6.74ns ? ?/sec
run_criteria/no_with_labels/001_systems 1.00 91.0±0.49ns ? ?/sec 1.00 91.1±0.19ns ? ?/sec
run_criteria/no_with_labels/006_systems 1.09 161.3±1.04ns ? ?/sec 1.00 148.6±1.28ns ? ?/sec
run_criteria/no_with_labels/011_systems 1.09 227.6±2.47ns ? ?/sec 1.00 208.2±1.64ns ? ?/sec
run_criteria/no_with_labels/016_systems 1.07 282.4±1.93ns ? ?/sec 1.00 264.5±0.97ns ? ?/sec
run_criteria/no_with_labels/021_systems 1.09 344.6±1.79ns ? ?/sec 1.00 317.4±1.06ns ? ?/sec
run_criteria/no_with_labels/026_systems 1.09 410.8±3.93ns ? ?/sec 1.00 378.5±2.32ns ? ?/sec
run_criteria/no_with_labels/031_systems 1.07 475.1±4.42ns ? ?/sec 1.00 444.0±3.90ns ? ?/sec
run_criteria/no_with_labels/036_systems 1.10 558.2±2.91ns ? ?/sec 1.00 506.1±1.93ns ? ?/sec
run_criteria/no_with_labels/041_systems 1.07 600.8±2.73ns ? ?/sec 1.00 561.3±1.96ns ? ?/sec
run_criteria/no_with_labels/046_systems 1.08 667.6±9.71ns ? ?/sec 1.00 619.7±6.54ns ? ?/sec
run_criteria/no_with_labels/051_systems 1.08 728.5±6.24ns ? ?/sec 1.00 671.9±6.71ns ? ?/sec
run_criteria/no_with_labels/056_systems 1.11 804.1±7.67ns ? ?/sec 1.00 727.5±3.71ns ? ?/sec
run_criteria/no_with_labels/061_systems 1.11 871.1±8.84ns ? ?/sec 1.00 786.9±2.64ns ? ?/sec
run_criteria/no_with_labels/066_systems 1.08 925.2±5.77ns ? ?/sec 1.00 860.1±2.39ns ? ?/sec
run_criteria/no_with_labels/071_systems 1.07 995.8±12.16ns ? ?/sec 1.00 930.2±6.50ns ? ?/sec
run_criteria/no_with_labels/076_systems 1.07 1057.9±6.51ns ? ?/sec 1.00 986.2±9.76ns ? ?/sec
run_criteria/no_with_labels/081_systems 1.11 1156.6±7.87ns ? ?/sec 1.00 1046.6±23.81ns ? ?/sec
run_criteria/no_with_labels/086_systems 1.11 1219.4±8.10ns ? ?/sec 1.00 1103.3±6.99ns ? ?/sec
run_criteria/no_with_labels/091_systems 1.11 1279.2±4.54ns ? ?/sec 1.00 1148.5±5.92ns ? ?/sec
run_criteria/no_with_labels/096_systems 1.12 1344.0±30.91ns ? ?/sec 1.00 1198.0±4.55ns ? ?/sec
run_criteria/no_with_labels/101_systems 1.09 1391.9±65.04ns ? ?/sec 1.00 1273.2±4.81ns ? ?/sec
run_criteria/yes/001_systems 1.00 4.8±0.11µs ? ?/sec 1.08 5.2±0.04µs ? ?/sec
run_criteria/yes/006_systems 1.00 9.1±0.10µs ? ?/sec 1.12 10.2±0.14µs ? ?/sec
run_criteria/yes/011_systems 1.00 13.5±1.22µs ? ?/sec 1.07 14.4±0.86µs ? ?/sec
run_criteria/yes/016_systems 1.00 17.8±0.90µs ? ?/sec 1.05 18.7±1.22µs ? ?/sec
run_criteria/yes/021_systems 1.00 20.8±1.32µs ? ?/sec 1.10 23.0±1.46µs ? ?/sec
run_criteria/yes/026_systems 1.00 24.1±1.78µs ? ?/sec 1.07 25.9±1.27µs ? ?/sec
run_criteria/yes/031_systems 1.00 26.7±1.39µs ? ?/sec 1.11 29.6±1.57µs ? ?/sec
run_criteria/yes/036_systems 1.00 30.1±1.85µs ? ?/sec 1.04 31.4±1.79µs ? ?/sec
run_criteria/yes/041_systems 1.00 34.3±1.31µs ? ?/sec 1.04 35.8±2.16µs ? ?/sec
run_criteria/yes/046_systems 1.00 36.9±1.45µs ? ?/sec 1.07 39.5±2.97µs ? ?/sec
run_criteria/yes/051_systems 1.00 40.4±2.12µs ? ?/sec 1.08 43.7±1.72µs ? ?/sec
run_criteria/yes/056_systems 1.00 43.3±1.46µs ? ?/sec 1.07 46.2±1.67µs ? ?/sec
run_criteria/yes/061_systems 1.00 46.5±2.16µs ? ?/sec 1.02 47.5±2.39µs ? ?/sec
run_criteria/yes/066_systems 1.00 48.4±2.68µs ? ?/sec 1.08 52.3±2.00µs ? ?/sec
run_criteria/yes/071_systems 1.00 53.6±4.02µs ? ?/sec 1.04 56.0±1.99µs ? ?/sec
run_criteria/yes/076_systems 1.00 56.1±2.53µs ? ?/sec 1.06 59.4±2.35µs ? ?/sec
run_criteria/yes/081_systems 1.00 60.1±3.47µs ? ?/sec 1.03 62.0±2.33µs ? ?/sec
run_criteria/yes/086_systems 1.00 63.1±3.02µs ? ?/sec 1.07 67.8±2.47µs ? ?/sec
run_criteria/yes/091_systems 1.00 67.4±2.30µs ? ?/sec 1.09 73.3±2.98µs ? ?/sec
run_criteria/yes/096_systems 1.00 74.3±3.02µs ? ?/sec 1.06 79.0±2.76µs ? ?/sec
run_criteria/yes/101_systems 1.00 79.2±3.62µs ? ?/sec 1.08 85.2±3.33µs ? ?/sec
run_criteria/yes_using_query/001_systems 1.00 4.7±0.12µs ? ?/sec 1.03 4.8±0.27µs ? ?/sec
run_criteria/yes_using_query/006_systems 1.00 8.9±0.15µs ? ?/sec 1.13 10.1±0.15µs ? ?/sec
run_criteria/yes_using_query/011_systems 1.00 13.6±0.56µs ? ?/sec 1.11 15.1±0.44µs ? ?/sec
run_criteria/yes_using_query/016_systems 1.00 17.8±0.65µs ? ?/sec 1.10 19.5±1.13µs ? ?/sec
run_criteria/yes_using_query/021_systems 1.00 21.6±1.19µs ? ?/sec 1.10 23.7±1.54µs ? ?/sec
run_criteria/yes_using_query/026_systems 1.00 24.8±1.14µs ? ?/sec 1.07 26.5±1.95µs ? ?/sec
run_criteria/yes_using_query/031_systems 1.00 28.2±1.16µs ? ?/sec 1.05 29.5±1.74µs ? ?/sec
run_criteria/yes_using_query/036_systems 1.00 31.6±1.48µs ? ?/sec 1.06 33.5±2.08µs ? ?/sec
run_criteria/yes_using_query/041_systems 1.00 34.1±1.46µs ? ?/sec 1.03 35.3±2.08µs ? ?/sec
run_criteria/yes_using_query/046_systems 1.00 38.2±1.81µs ? ?/sec 1.05 40.2±2.76µs ? ?/sec
run_criteria/yes_using_query/051_systems 1.00 41.0±1.54µs ? ?/sec 1.00 41.1±2.83µs ? ?/sec
run_criteria/yes_using_query/056_systems 1.00 44.1±1.77µs ? ?/sec 1.02 45.1±2.23µs ? ?/sec
run_criteria/yes_using_query/061_systems 1.00 47.3±2.22µs ? ?/sec 1.02 48.1±2.82µs ? ?/sec
run_criteria/yes_using_query/066_systems 1.00 49.6±2.23µs ? ?/sec 1.03 51.1±2.02µs ? ?/sec
run_criteria/yes_using_query/071_systems 1.04 55.9±2.61µs ? ?/sec 1.00 53.6±2.70µs ? ?/sec
run_criteria/yes_using_query/076_systems 1.00 58.1±2.40µs ? ?/sec 1.00 58.4±3.93µs ? ?/sec
run_criteria/yes_using_query/081_systems 1.00 64.6±2.78µs ? ?/sec 1.03 66.3±2.45µs ? ?/sec
run_criteria/yes_using_query/086_systems 1.00 67.7±3.38µs ? ?/sec 1.04 70.4±2.84µs ? ?/sec
run_criteria/yes_using_query/091_systems 1.00 70.8±2.75µs ? ?/sec 1.07 75.6±4.41µs ? ?/sec
run_criteria/yes_using_query/096_systems 1.00 76.5±3.67µs ? ?/sec 1.05 80.0±4.03µs ? ?/sec
run_criteria/yes_using_query/101_systems 1.00 82.8±2.71µs ? ?/sec 1.06 87.4±3.20µs ? ?/sec
run_criteria/yes_using_resource/001_systems 1.00 4.7±0.12µs ? ?/sec 1.01 4.7±0.24µs ? ?/sec
run_criteria/yes_using_resource/006_systems 1.00 9.0±0.27µs ? ?/sec 1.09 9.8±0.19µs ? ?/sec
run_criteria/yes_using_resource/011_systems 1.00 14.0±0.50µs ? ?/sec 1.07 15.0±0.76µs ? ?/sec
run_criteria/yes_using_resource/016_systems 1.00 18.7±1.30µs ? ?/sec 1.02 19.0±1.09µs ? ?/sec
run_criteria/yes_using_resource/021_systems 1.00 21.7±1.23µs ? ?/sec 1.05 22.7±1.29µs ? ?/sec
run_criteria/yes_using_resource/026_systems 1.00 26.3±1.81µs ? ?/sec 1.01 26.7±1.00µs ? ?/sec
run_criteria/yes_using_resource/031_systems 1.00 27.8±1.44µs ? ?/sec 1.07 29.8±1.15µs ? ?/sec
run_criteria/yes_using_resource/036_systems 1.00 31.2±2.00µs ? ?/sec 1.06 33.1±1.41µs ? ?/sec
run_criteria/yes_using_resource/041_systems 1.00 34.6±2.49µs ? ?/sec 1.09 37.5±1.72µs ? ?/sec
run_criteria/yes_using_resource/046_systems 1.00 37.1±1.88µs ? ?/sec 1.10 40.8±1.43µs ? ?/sec
run_criteria/yes_using_resource/051_systems 1.00 41.3±2.98µs ? ?/sec 1.10 45.4±2.14µs ? ?/sec
run_criteria/yes_using_resource/056_systems 1.00 44.6±3.48µs ? ?/sec 1.11 49.6±2.84µs ? ?/sec
run_criteria/yes_using_resource/061_systems 1.00 47.4±3.13µs ? ?/sec 1.09 51.8±2.21µs ? ?/sec
run_criteria/yes_using_resource/066_systems 1.00 50.5±2.81µs ? ?/sec 1.11 56.3±3.21µs ? ?/sec
run_criteria/yes_using_resource/071_systems 1.00 53.2±2.89µs ? ?/sec 1.12 59.6±2.23µs ? ?/sec
run_criteria/yes_using_resource/076_systems 1.00 55.7±2.77µs ? ?/sec 1.11 61.6±2.58µs ? ?/sec
run_criteria/yes_using_resource/081_systems 1.00 57.0±2.42µs ? ?/sec 1.11 63.6±2.15µs ? ?/sec
run_criteria/yes_using_resource/086_systems 1.00 66.9±3.37µs ? ?/sec 1.02 68.3±3.61µs ? ?/sec
run_criteria/yes_using_resource/091_systems 1.00 70.2±3.03µs ? ?/sec 1.06 74.1±3.17µs ? ?/sec
run_criteria/yes_using_resource/096_systems 1.00 75.6±4.54µs ? ?/sec 1.09 82.4±4.85µs ? ?/sec
run_criteria/yes_using_resource/101_systems 1.00 79.0±3.44µs ? ?/sec 1.14 90.2±3.63µs ? ?/sec
sized_commands_0_bytes/2000_commands 1.00 4.5±0.03µs ? ?/sec 1.12 5.1±0.01µs ? ?/sec
sized_commands_0_bytes/4000_commands 1.00 9.1±0.04µs ? ?/sec 1.11 10.2±0.04µs ? ?/sec
sized_commands_0_bytes/6000_commands 1.00 13.7±0.07µs ? ?/sec 1.12 15.3±0.05µs ? ?/sec
sized_commands_0_bytes/8000_commands 1.00 18.3±0.08µs ? ?/sec 1.11 20.3±0.06µs ? ?/sec
sized_commands_12_bytes/2000_commands 1.00 7.1±0.05µs ? ?/sec 1.03 7.3±0.02µs ? ?/sec
sized_commands_12_bytes/4000_commands 1.00 14.5±0.10µs ? ?/sec 1.01 14.6±0.05µs ? ?/sec
sized_commands_12_bytes/6000_commands 1.00 21.9±0.18µs ? ?/sec 1.01 22.1±0.09µs ? ?/sec
sized_commands_12_bytes/8000_commands 1.00 29.3±0.24µs ? ?/sec 1.00 29.4±0.17µs ? ?/sec
sized_commands_512_bytes/2000_commands 1.00 103.0±3.57µs ? ?/sec 1.08 110.7±3.50µs ? ?/sec
sized_commands_512_bytes/4000_commands 1.00 212.1±16.07µs ? ?/sec 1.06 224.7±14.64µs ? ?/sec
sized_commands_512_bytes/6000_commands 1.00 321.8±35.43µs ? ?/sec 1.07 345.2±38.65µs ? ?/sec
sized_commands_512_bytes/8000_commands 1.00 434.8±61.40µs ? ?/sec 1.07 466.5±67.14µs ? ?/sec
spawn_commands/2000_entities 1.00 227.5±9.97µs ? ?/sec 1.03 234.2±6.67µs ? ?/sec
spawn_commands/4000_entities 1.00 457.1±18.53µs ? ?/sec 1.03 472.1±12.59µs ? ?/sec
spawn_commands/6000_entities 1.00 713.0±24.84µs ? ?/sec 1.01 721.5±27.41µs ? ?/sec
spawn_commands/8000_entities 1.00 918.9±30.62µs ? ?/sec 1.02 939.5±24.01µs ? ?/sec
spawn_world/10000_entities 1.00 1218.2±92.80µs ? ?/sec 1.00 1215.1±78.94µs ? ?/sec
spawn_world/1000_entities 1.00 121.7±9.06µs ? ?/sec 1.01 122.4±8.85µs ? ?/sec
spawn_world/100_entities 1.00 12.3±0.89µs ? ?/sec 1.00 12.3±0.91µs ? ?/sec
spawn_world/10_entities 1.00 1211.1±96.87ns ? ?/sec 1.01 1218.9±98.09ns ? ?/sec
spawn_world/1_entities 1.01 123.1±9.41ns ? ?/sec 1.00 122.1±8.85ns ? ?/sec
world_entity/50000_entities 1.00 424.7±1.31µs ? ?/sec 1.00 424.2±0.90µs ? ?/sec
world_get/50000_entities_sparse 1.03 580.2±3.38µs ? ?/sec 1.00 561.1±2.13µs ? ?/sec
world_get/50000_entities_table 1.02 938.6±13.42µs ? ?/sec 1.00 916.6±6.70µs ? ?/sec
world_query_for_each/50000_entities_sparse 1.01 84.6±0.64µs ? ?/sec 1.00 83.6±0.23µs ? ?/sec
world_query_for_each/50000_entities_table 1.01 27.3±0.11µs ? ?/sec 1.00 27.1±0.05µs ? ?/sec
world_query_get/50000_entities_sparse 1.00 457.7±4.91µs ? ?/sec 1.02 464.6±1.64µs ? ?/sec
world_query_get/50000_entities_sparse_wide 1.00 1413.0±8.89µs ? ?/sec 1.09 1536.3±69.11µs ? ?/sec
world_query_get/50000_entities_table 1.00 260.7±1.41µs ? ?/sec 1.62 422.1±1.06µs ? ?/sec
world_query_get/50000_entities_table_wide 1.00 806.9±3.47µs ? ?/sec 1.05 844.3±22.03µs ? ?/sec
world_query_iter/50000_entities_sparse 1.02 97.5±0.60µs ? ?/sec 1.00 95.6±1.67µs ? ?/sec
world_query_iter/50000_entities_table 1.01 27.4±0.09µs ? ?/sec 1.00 27.1±0.07µs ? ?/sec
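For anyone reading the table above: the factor printed before each timing appears to be that column's mean time divided by the fastest mean in the same row, so 1.00 marks the faster side. A minimal sketch of that arithmetic, assuming this interpretation holds, is below. The helper name `relative_factors` is hypothetical and the timings are copied from three of the `query_get` rows.

```rust
/// Hypothetical helper: factor of each timing relative to the fastest one
/// in the row, rounded to two decimals to match the table's formatting.
fn relative_factors(times: &[f64]) -> Vec<f64> {
    // Smallest mean time in the row; this entry gets a factor of 1.00.
    let fastest = times.iter().cloned().fold(f64::INFINITY, f64::min);
    times
        .iter()
        .map(|t| ((t / fastest) * 100.0).round() / 100.0)
        .collect()
}

fn main() {
    // Mean times in µs copied from the table: (benchmark, fetch-cleanup, main).
    let rows = [
        ("query_get/50000_entities_sparse", 717.5, 1440.5),
        ("query_get/50000_entities_table", 491.9, 741.8),
        ("world_query_get/50000_entities_table", 260.7, 422.1),
    ];
    for (name, branch, main_time) in rows {
        let f = relative_factors(&[branch, main_time]);
        println!("{}: fetch-cleanup {:.2}, main {:.2}", name, f[0], f[1]);
    }
}
```

Read this way, the three rows are where the `Query::get` speedup shows up most clearly. Most other rows differ by only a few percent, often less than the reported ± spread.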
My trait query benchmarks look promising when I update to this branch.
| Benchmark | Time | Change |
|---|---|---|
| All<> - 1 match | 66.371 µs | +5.2565% |
| All<> - 2 matches | 99.637 µs | -6.4798% |
| All<> - 1-2 matches | 85.095 µs | +4.4459% |
| One<> | 28.772 µs | -11.916% |
| One<> - filtering | 15.342 µs | -10.749% |
bors r+
Pull request successfully merged into main.
Build succeeded:
- build-and-install-on-iOS
- build-android
- build (macos-latest)
- build (ubuntu-latest)
- build-wasm
- build (windows-latest)
- build-without-default-features (bevy)
- build-without-default-features (bevy_ecs)
- build-without-default-features (bevy_reflect)
- check-compiles
- check-doc
- check-missing-examples-in-docs
- ci
- markdownlint
- run-examples
- run-examples-on-wasm
- run-examples-on-windows-dx12