Introduce Overflow & Displacement tracking.

Open matthieu-m opened this issue 1 year ago • 1 comments

Changes:

  • Introduce Overflow Trackers, with features to select the desired variant.
  • Introduce Displacements, conditional on the Overflow Tracker variant tracking removals.
  • Adjust insertion/removal of items in RawTable to properly track overflow and displacement.
  • Adjust find in RawTable to short-circuit the probe sequence when overflow tracking shows there is no need to probe further.
  • OF NOTE: enforce group alignment.

Motivation:

Overflow tracking allows cutting a probing sequence short, which may be beneficial.

The use of a multitude of variants makes it easier to test and benchmark all variants, thus making it easier to pick the right one... or not pick any.

The groups are now forcibly aligned because overflow tracking is performed on a group basis, and does not work with "floating" groups.
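
To illustrate the difference between floating and aligned groups, here is a minimal sketch. It assumes a group width of 16 and a power-of-two bucket count; `floating_start`, `aligned_start`, and `tracker_index` are hypothetical names chosen for illustration, not functions from this PR.

```rust
/// Assumed group width for this sketch (the real width depends on the backend).
const GROUP_WIDTH: usize = 16;

/// Floating start: any bucket may begin a group, so groups can overlap.
fn floating_start(hash: u64, bucket_mask: usize) -> usize {
    (hash as usize) & bucket_mask
}

/// Aligned start: round the position down to a multiple of GROUP_WIDTH,
/// so every bucket belongs to exactly one group.
fn aligned_start(hash: u64, bucket_mask: usize) -> usize {
    floating_start(hash, bucket_mask) & !(GROUP_WIDTH - 1)
}

/// With aligned groups, one tracker per group can be indexed directly.
fn tracker_index(aligned_pos: usize) -> usize {
    aligned_pos / GROUP_WIDTH
}
```

With a floating start, a bucket can fall inside many overlapping groups, so there is no single group whose tracker describes it; with an aligned start, positions 32 through 47 all map to tracker 2.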

Design:

Overflow trackers and displacements are tacked onto the end of the allocation, and accesses to them are kept to a minimum to limit their performance impact.

In particular:

  1. An element which does not overflow on insertion need not trigger a write to any overflow tracker, nor to its displacement.
  2. Only if removals are tracked is the displacement read on removal.
  3. Only if removals are tracked and the displacement is non-0 are overflow trackers written to on removal.

This follows the philosophy of "You Don't Pay For What You Don't Use", and keeps the impact as small as possible.
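
The three rules above can be sketched with a toy table. This is not the PR's code: it assumes a counter-style tracker that tracks removals, one counter per aligned group, a triangular probe over groups, and a "hash" that is just the key's low bits. `ToyTable` and all of its methods are hypothetical names.

```rust
/// Slots per group in this toy (hypothetical; not hashbrown's actual group width).
const GROUP_WIDTH: usize = 2;

/// Toy set of u64 keys; every name here is illustrative, not taken from the PR.
pub struct ToyTable {
    slots: Vec<Option<u64>>,
    /// Per slot: number of groups the stored key overflowed before landing here.
    displacement: Vec<u32>,
    /// Per group: number of stored keys that overflowed past this group.
    overflow: Vec<u32>,
    num_groups: usize,
}

impl ToyTable {
    /// `num_groups` must be a power of two so the triangular probe visits every group.
    pub fn new(num_groups: usize) -> Self {
        assert!(num_groups.is_power_of_two());
        ToyTable {
            slots: vec![None; num_groups * GROUP_WIDTH],
            displacement: vec![0; num_groups * GROUP_WIDTH],
            overflow: vec![0; num_groups],
            num_groups,
        }
    }

    /// Toy "hash": the low bits of the key select the home group.
    fn home_group(&self, key: u64) -> usize {
        key as usize & (self.num_groups - 1)
    }

    pub fn overflow_count(&self, group: usize) -> u32 {
        self.overflow[group]
    }

    /// Returns the slot holding `key` (if any) and the number of groups examined.
    pub fn find(&self, key: u64) -> (Option<usize>, usize) {
        let mut group = self.home_group(key);
        for step in 0..self.num_groups {
            let base = group * GROUP_WIDTH;
            for i in base..base + GROUP_WIDTH {
                if self.slots[i] == Some(key) {
                    return (Some(i), step + 1);
                }
            }
            // Short-circuit: no stored key overflowed past this group.
            if self.overflow[group] == 0 {
                return (None, step + 1);
            }
            group = (group + step + 1) & (self.num_groups - 1);
        }
        (None, self.num_groups)
    }

    /// Inserts `key`, assumed absent. Returns false if the table is full.
    pub fn insert(&mut self, key: u64) -> bool {
        let mask = self.num_groups - 1;
        let home = self.home_group(key);
        let mut group = home;
        for step in 0..self.num_groups {
            let base = group * GROUP_WIDTH;
            if let Some(i) = (base..base + GROUP_WIDTH).find(|&i| self.slots[i].is_none()) {
                self.slots[i] = Some(key);
                if step != 0 {
                    // Rule 1: only an overflowing key writes to the side tables.
                    self.displacement[i] = step as u32;
                    let mut g = home;
                    for s in 0..step {
                        self.overflow[g] += 1;
                        g = (g + s + 1) & mask;
                    }
                }
                return true;
            }
            group = (group + step + 1) & mask;
        }
        false
    }

    /// Removes `key`. Returns false if it was not present.
    pub fn remove(&mut self, key: u64) -> bool {
        let Some(i) = self.find(key).0 else { return false };
        self.slots[i] = None;
        // Rule 2: the displacement is read on removal.
        let d = self.displacement[i] as usize;
        if d != 0 {
            // Rule 3: trackers are written only for a displaced key. The probe is
            // walked backwards from the slot's group, so no rehash is needed.
            self.displacement[i] = 0;
            let mut g = i / GROUP_WIDTH;
            for s in (1..=d).rev() {
                g = g.wrapping_sub(s) & (self.num_groups - 1);
                self.overflow[g] -= 1;
            }
        }
        true
    }
}
```

With four groups of two slots, inserting keys 0, 4 and 8 (all with home group 0) pushes 8 into the next group. A lookup for the absent key 12 then examines two groups, and only one once 8 has been removed. A narrower counter type than the `u32` used here would additionally need a saturation policy, which this sketch skips.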

Benchmarks:

Methodology: each variant was benchmarked 3 times, and for each benchmark the best result was picked. All results were then normalized against current master for ease of comparison.
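
The normalization can be sketched as follows. The description does not state the unit or the sign convention, so this sketch assumes timings where lower is better and a percentage computed as the variant's best run relative to master's best run; `best_of` and `relative_to_master` are hypothetical helper names.

```rust
/// Picks the best of several runs, ASSUMING lower is better (e.g. ns/iter).
fn best_of(runs: &[f64]) -> f64 {
    runs.iter().copied().fold(f64::INFINITY, f64::min)
}

/// Relative difference of a variant against master, in percent.
/// Under the lower-is-better assumption, a positive value means the variant took longer.
fn relative_to_master(master_best: f64, variant_best: f64) -> f64 {
    (variant_best / master_best - 1.0) * 100.0
}
```

For example, if master's three runs are 105, 100 and 110 and a variant's are 103, 102 and 104, the bests are 100 and 102, and the variant is reported at roughly +2%.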

| Benchmark | master | none | bloom-1-u8 | bloom-1-u16 | counter-u8 | hybrid |
|---|---|---|---|---|---|---|
| clone_from_large | 100% (+/-19.77%) | +0.00% (+/-0.20%) | +0.17% (+/-0.10%) | +0.00% (+/-0.20%) | -0.94% (+/-0.00%) | +1.18% (+/-0.18%) |
| clone_from_small | 100% (+/-6.82%) | +0.00% (+/-0.07%) | +2.27% (+/-0.20%) | +2.27% (+/-0.04%) | +0.00% (+/-0.25%) | +0.00% (+/-0.05%) |
| clone_large | 100% (+/-8.86%) | +0.00% (+/-0.09%) | +1.24% (+/-0.14%) | -0.66% (+/-0.07%) | -0.86% (+/-0.09%) | -1.04% (+/-0.07%) |
| clone_small | 100% (+/-9.09%) | +0.00% (+/-0.09%) | +3.64% (+/-0.05%) | +1.82% (+/-0.07%) | +0.00% (+/-0.07%) | +1.82% (+/-0.04%) |
| grow_insert_ahash_highbits | 100% (+/-4.54%) | +0.00% (+/-0.05%) | +0.24% (+/-0.03%) | -0.65% (+/-0.00%) | -0.51% (+/-0.05%) | +2.29% (+/-0.00%) |
| grow_insert_ahash_random | 100% (+/-0.02%) | +0.00% (+/-0.00%) | +2.83% (+/-0.00%) | +0.88% (+/-0.00%) | +0.53% (+/-0.00%) | +1.58% (+/-0.00%) |
| grow_insert_ahash_serial | 100% (+/-0.01%) | +0.00% (+/-0.00%) | +0.85% (+/-0.05%) | +0.22% (+/-0.00%) | +1.46% (+/-0.00%) | +4.13% (+/-0.00%) |
| grow_insert_std_highbits | 100% (+/-0.00%) | +0.00% (+/-0.00%) | +0.81% (+/-0.00%) | +1.54% (+/-0.00%) | +0.14% (+/-0.00%) | +0.93% (+/-0.00%) |
| grow_insert_std_random | 100% (+/-1.61%) | +0.00% (+/-0.02%) | +4.05% (+/-0.00%) | +2.37% (+/-0.00%) | +3.96% (+/-0.00%) | +3.10% (+/-0.00%) |
| grow_insert_std_serial | 100% (+/-0.00%) | +0.00% (+/-0.00%) | +4.50% (+/-0.00%) | +3.71% (+/-0.00%) | +1.83% (+/-0.00%) | +5.21% (+/-0.00%) |
| insert_ahash_highbits | 100% (+/-0.01%) | +0.00% (+/-0.00%) | +2.64% (+/-0.00%) | +1.21% (+/-0.00%) | +2.07% (+/-0.00%) | +1.45% (+/-0.00%) |
| insert_ahash_random | 100% (+/-0.01%) | +0.00% (+/-0.00%) | +6.36% (+/-0.00%) | +0.48% (+/-0.00%) | +0.62% (+/-0.00%) | +0.38% (+/-0.00%) |
| insert_ahash_serial | 100% (+/-3.56%) | +0.00% (+/-0.04%) | +5.62% (+/-0.00%) | +5.34% (+/-0.00%) | -0.12% (+/-0.00%) | +0.20% (+/-0.00%) |
| insert_erase_ahash_highbits | 100% (+/-4.64%) | +0.00% (+/-0.05%) | +2.98% (+/-0.05%) | +3.52% (+/-0.00%) | +3.19% (+/-0.04%) | +7.18% (+/-0.00%) |
| insert_erase_ahash_random | 100% (+/-0.01%) | +0.00% (+/-0.00%) | +2.59% (+/-0.00%) | +3.44% (+/-0.00%) | +2.80% (+/-0.00%) | +4.72% (+/-0.03%) |
| insert_erase_ahash_serial | 100% (+/-0.01%) | +0.00% (+/-0.00%) | +0.50% (+/-0.06%) | +0.83% (+/-0.00%) | +5.17% (+/-0.00%) | +3.54% (+/-0.02%) |
| insert_erase_std_highbits | 100% (+/-0.01%) | +0.00% (+/-0.00%) | +2.06% (+/-0.00%) | +2.07% (+/-0.00%) | +0.14% (+/-0.00%) | +0.40% (+/-0.03%) |
| insert_erase_std_random | 100% (+/-0.01%) | +0.00% (+/-0.00%) | -0.06% (+/-0.00%) | +0.84% (+/-0.00%) | -1.83% (+/-0.00%) | +0.95% (+/-0.00%) |
| insert_erase_std_serial | 100% (+/-1.97%) | +0.00% (+/-0.02%) | +4.26% (+/-0.00%) | +4.75% (+/-0.00%) | -0.75% (+/-0.00%) | +2.14% (+/-0.00%) |
| insert_std_highbits | 100% (+/-0.00%) | +0.00% (+/-0.00%) | +0.35% (+/-0.00%) | -0.69% (+/-0.00%) | -1.61% (+/-0.04%) | -1.21% (+/-0.00%) |
| insert_std_random | 100% (+/-0.00%) | +0.00% (+/-0.00%) | -2.34% (+/-0.00%) | -0.57% (+/-0.00%) | -0.69% (+/-0.00%) | +0.45% (+/-0.00%) |
| insert_std_serial | 100% (+/-2.18%) | +0.00% (+/-0.02%) | -2.24% (+/-0.00%) | -2.86% (+/-0.05%) | +0.69% (+/-0.00%) | +1.62% (+/-0.00%) |
| iter_ahash_highbits | 100% (+/-10.23%) | +0.00% (+/-0.10%) | +3.41% (+/-0.12%) | -1.46% (+/-0.07%) | -0.32% (+/-0.11%) | -0.97% (+/-0.06%) |
| iter_ahash_random | 100% (+/-3.57%) | +0.00% (+/-0.04%) | +1.95% (+/-0.08%) | -0.97% (+/-0.06%) | -0.65% (+/-0.07%) | -0.81% (+/-0.05%) |
| iter_ahash_serial | 100% (+/-8.93%) | +0.00% (+/-0.09%) | +2.60% (+/-0.09%) | -0.97% (+/-0.06%) | -0.81% (+/-0.04%) | -0.49% (+/-0.05%) |
| iter_std_highbits | 100% (+/-4.52%) | +0.00% (+/-0.05%) | +2.42% (+/-0.09%) | -0.48% (+/-0.06%) | +0.65% (+/-0.13%) | -0.16% (+/-0.06%) |
| iter_std_random | 100% (+/-5.47%) | +0.00% (+/-0.05%) | -0.16% (+/-0.12%) | -0.80% (+/-0.07%) | +0.64% (+/-0.08%) | +0.32% (+/-0.06%) |
| iter_std_serial | 100% (+/-6.44%) | +0.00% (+/-0.06%) | +1.77% (+/-0.07%) | +0.64% (+/-0.08%) | +1.93% (+/-0.02%) | +0.16% (+/-0.05%) |
| lookup_ahash_highbits | 100% (+/-4.26%) | +0.00% (+/-0.04%) | +4.47% (+/-0.12%) | +1.63% (+/-0.10%) | -1.20% (+/-0.07%) | +1.02% (+/-0.07%) |
| lookup_ahash_random | 100% (+/-5.24%) | +0.00% (+/-0.05%) | +8.50% (+/-0.08%) | +7.26% (+/-0.09%) | -0.50% (+/-0.05%) | +7.41% (+/-0.13%) |
| lookup_ahash_serial | 100% (+/-4.51%) | +0.00% (+/-0.05%) | +8.28% (+/-0.05%) | +6.62% (+/-0.07%) | +0.25% (+/-0.14%) | +8.25% (+/-0.13%) |
| lookup_fail_ahash_highbits | 100% (+/-7.58%) | +0.00% (+/-0.08%) | +10.95% (+/-0.18%) | +7.62% (+/-0.03%) | +1.89% (+/-0.05%) | +9.13% (+/-0.06%) |
| lookup_fail_ahash_random | 100% (+/-7.33%) | +0.00% (+/-0.07%) | +13.83% (+/-0.16%) | +9.87% (+/-0.08%) | -0.34% (+/-0.05%) | +12.93% (+/-0.12%) |
| lookup_fail_ahash_serial | 100% (+/-6.37%) | +0.00% (+/-0.06%) | +7.33% (+/-0.05%) | +11.93% (+/-0.20%) | +1.36% (+/-0.06%) | +10.31% (+/-0.05%) |
| lookup_fail_std_highbits | 100% (+/-7.78%) | +0.00% (+/-0.08%) | +3.68% (+/-0.06%) | +5.35% (+/-0.03%) | +0.60% (+/-0.05%) | +4.09% (+/-0.05%) |
| lookup_fail_std_random | 100% (+/-5.59%) | +0.00% (+/-0.06%) | +5.37% (+/-0.11%) | +6.13% (+/-0.04%) | +1.06% (+/-0.00%) | +5.11% (+/-0.08%) |
| lookup_fail_std_serial | 100% (+/-4.02%) | +0.00% (+/-0.04%) | +1.58% (+/-0.06%) | +4.38% (+/-0.11%) | +0.55% (+/-0.00%) | +3.10% (+/-0.05%) |
| lookup_std_highbits | 100% (+/-3.36%) | +0.00% (+/-0.03%) | +5.24% (+/-0.00%) | +7.26% (+/-0.00%) | +1.65% (+/-0.00%) | +4.80% (+/-0.09%) |
| lookup_std_random | 100% (+/-2.47%) | +0.00% (+/-0.02%) | +3.76% (+/-0.03%) | +3.32% (+/-0.06%) | +3.57% (+/-0.11%) | +3.22% (+/-0.06%) |
| lookup_std_serial | 100% (+/-9.09%) | +0.00% (+/-0.09%) | +8.38% (+/-0.04%) | +7.50% (+/-0.08%) | +7.86% (+/-0.09%) | +8.46% (+/-0.09%) |
| rehash_in_place | 100% (+/-0.01%) | +0.00% (+/-0.00%) | +2.49% (+/-0.00%) | -1.66% (+/-0.00%) | +1.48% (+/-0.00%) | +5.18% (+/-0.00%) |
| insert | 100% (+/-0.01%) | +0.00% (+/-0.00%) | +0.25% (+/-0.11%) | -1.51% (+/-0.07%) | +4.53% (+/-0.13%) | +2.96% (+/-0.00%) |
| insert_unique_unchecked | 100% (+/-6.95%) | +0.00% (+/-0.07%) | -5.59% (+/-0.08%) | -10.45% (+/-0.06%) | -0.36% (+/-0.16%) | -4.54% (+/-0.05%) |

Remarks:

  • The none variant is completely neutral, which means that enforcing group alignment did not affect performance.
  • The other variants show some promise, but the results vary quite a bit depending on micro-optimizations. Aggressive (always) inlining of key methods seemed to help, for example, but I am not sure whether may_have_overflowed should be inlined, since it is expected to be called rarely.
  • Whether the benchmarks "suffer" from high probe counts is unknown to me. Overflow tracking is only helpful for cutting probing sequences short, and is thus pure overhead when there is no quadratic probing to cut short.

In any case, with the scaffolding in place it should at least be possible to experiment further if there is any appetite for it.

matthieu-m, Apr 01 '24 13:04

☔ The latest upstream changes (presumably #525) made this pull request unmergeable. Please resolve the merge conflicts.

bors, Jun 07 '24 12:06