hashbrown
hashbrown copied to clipboard
Skip `match_empty` after failed to find a insert slot in group
trafficstars
Running benchmark on my local macbook pro(M1 Pro):
Before:
Before:
running 41 tests
test clone_from_large ... bench: 7,630.54 ns/iter (+/- 105.82)
test clone_from_small ... bench: 69.27 ns/iter (+/- 4.44)
test clone_large ... bench: 7,650.92 ns/iter (+/- 138.79)
test clone_small ... bench: 110.63 ns/iter (+/- 4.04)
test grow_insert_foldhash_highbits ... bench: 21,677.08 ns/iter (+/- 1,188.93)
test grow_insert_foldhash_random ... bench: 24,624.66 ns/iter (+/- 673.04)
test grow_insert_foldhash_serial ... bench: 22,303.47 ns/iter (+/- 1,107.11)
test grow_insert_std_highbits ... bench: 37,461.35 ns/iter (+/- 1,110.84)
test grow_insert_std_random ... bench: 37,744.47 ns/iter (+/- 1,511.93)
test grow_insert_std_serial ... bench: 37,440.62 ns/iter (+/- 618.89)
test insert_erase_foldhash_highbits ... bench: 15,511.28 ns/iter (+/- 367.55)
test insert_erase_foldhash_random ... bench: 17,365.08 ns/iter (+/- 1,125.38)
test insert_erase_foldhash_serial ... bench: 15,461.06 ns/iter (+/- 738.03)
test insert_erase_std_highbits ... bench: 31,643.75 ns/iter (+/- 1,053.08)
test insert_erase_std_random ... bench: 33,238.85 ns/iter (+/- 1,722.66)
test insert_erase_std_serial ... bench: 31,921.90 ns/iter (+/- 1,340.85)
test insert_foldhash_highbits ... bench: 14,680.70 ns/iter (+/- 545.13)
test insert_foldhash_random ... bench: 14,706.28 ns/iter (+/- 290.97)
test insert_foldhash_serial ... bench: 14,810.04 ns/iter (+/- 362.32)
test insert_std_highbits ... bench: 19,833.44 ns/iter (+/- 786.51)
test insert_std_random ... bench: 19,728.32 ns/iter (+/- 277.17)
test insert_std_serial ... bench: 19,674.22 ns/iter (+/- 636.33)
test iter_foldhash_highbits ... bench: 902.91 ns/iter (+/- 66.87)
test iter_foldhash_random ... bench: 784.51 ns/iter (+/- 29.84)
test iter_foldhash_serial ... bench: 777.75 ns/iter (+/- 49.07)
test iter_std_highbits ... bench: 762.85 ns/iter (+/- 14.56)
test iter_std_random ... bench: 774.44 ns/iter (+/- 24.61)
test iter_std_serial ... bench: 762.94 ns/iter (+/- 12.79)
test lookup_fail_foldhash_highbits ... bench: 1,899.07 ns/iter (+/- 67.99)
test lookup_fail_foldhash_random ... bench: 2,249.00 ns/iter (+/- 129.37)
test lookup_fail_foldhash_serial ... bench: 2,602.64 ns/iter (+/- 56.42)
test lookup_fail_std_highbits ... bench: 8,734.81 ns/iter (+/- 273.01)
test lookup_fail_std_random ... bench: 8,594.13 ns/iter (+/- 507.16)
test lookup_fail_std_serial ... bench: 8,625.80 ns/iter (+/- 145.82)
test lookup_foldhash_highbits ... bench: 2,098.07 ns/iter (+/- 49.03)
test lookup_foldhash_random ... bench: 2,455.68 ns/iter (+/- 87.79)
test lookup_foldhash_serial ... bench: 2,089.30 ns/iter (+/- 111.22)
test lookup_std_highbits ... bench: 8,784.86 ns/iter (+/- 293.86)
test lookup_std_random ... bench: 8,740.02 ns/iter (+/- 119.12)
test lookup_std_serial ... bench: 8,604.05 ns/iter (+/- 472.75)
test rehash_in_place ... bench: 209,355.86 ns/iter (+/- 6,211.48)
test result: ok. 0 passed; 0 failed; 0 ignored; 41 measured; 0 filtered out; finished in 55.76s
Running benches/insert_unique_unchecked.rs (target/release/deps/insert_unique_unchecked-70b56273fe907210)
running 2 tests
test insert ... bench: 9,978.20 ns/iter (+/- 218.31)
test insert_unique_unchecked ... bench: 3,453.39 ns/iter (+/- 53.06)
test result: ok. 0 passed; 0 failed; 0 ignored; 2 measured; 0 filtered out; finished in 2.10s
Running benches/set_ops.rs (target/release/deps/set_ops-41e9dbeede3fedd5)
running 10 tests
test set_ops_bit_and ... bench: 10,482.35 ns/iter (+/- 387.65)
test set_ops_bit_and_assign ... bench: 6,106.08 ns/iter (+/- 247.88)
test set_ops_bit_or ... bench: 86,940.28 ns/iter (+/- 2,392.71)
test set_ops_bit_or_assign ... bench: 53,571.67 ns/iter (+/- 1,050.82)
test set_ops_bit_xor ... bench: 88,900.81 ns/iter (+/- 3,737.91)
test set_ops_bit_xor_assign ... bench: 55,736.53 ns/iter (+/- 1,031.37)
test set_ops_sub_assign_large_small ... bench: 55,513.89 ns/iter (+/- 1,965.35)
test set_ops_sub_assign_small_large ... bench: 6,100.57 ns/iter (+/- 255.82)
test set_ops_sub_large_small ... bench: 87,790.68 ns/iter (+/- 2,284.42)
test set_ops_sub_small_large ... bench: 647.65 ns/iter (+/- 5.28)
test result: ok. 0 passed; 0 failed; 0 ignored; 10 measured; 0 filtered out; finished in 19.65s
After:
running 41 tests
test clone_from_large ... bench: 7,708.37 ns/iter (+/- 317.98)
test clone_from_small ... bench: 69.86 ns/iter (+/- 1.66)
test clone_large ... bench: 7,662.61 ns/iter (+/- 153.91)
test clone_small ... bench: 110.04 ns/iter (+/- 3.01)
test grow_insert_foldhash_highbits ... bench: 20,067.66 ns/iter (+/- 813.59)
test grow_insert_foldhash_random ... bench: 24,074.16 ns/iter (+/- 1,134.52)
test grow_insert_foldhash_serial ... bench: 21,096.95 ns/iter (+/- 366.22)
test grow_insert_std_highbits ... bench: 36,934.55 ns/iter (+/- 894.28)
test grow_insert_std_random ... bench: 36,898.65 ns/iter (+/- 1,328.19)
test grow_insert_std_serial ... bench: 36,811.75 ns/iter (+/- 1,086.38)
test insert_erase_foldhash_highbits ... bench: 15,791.87 ns/iter (+/- 473.97)
test insert_erase_foldhash_random ... bench: 16,973.07 ns/iter (+/- 934.27)
test insert_erase_foldhash_serial ... bench: 17,217.81 ns/iter (+/- 1,168.29)
test insert_erase_std_highbits ... bench: 33,779.54 ns/iter (+/- 1,145.91)
test insert_erase_std_random ... bench: 34,539.58 ns/iter (+/- 743.19)
test insert_erase_std_serial ... bench: 33,910.30 ns/iter (+/- 5,716.08)
test insert_foldhash_highbits ... bench: 14,719.39 ns/iter (+/- 467.13)
test insert_foldhash_random ... bench: 15,091.88 ns/iter (+/- 12,584.39)
test insert_foldhash_serial ... bench: 14,592.95 ns/iter (+/- 652.87)
test insert_std_highbits ... bench: 19,681.62 ns/iter (+/- 728.00)
test insert_std_random ... bench: 19,588.85 ns/iter (+/- 618.96)
test insert_std_serial ... bench: 19,611.61 ns/iter (+/- 792.86)
test iter_foldhash_highbits ... bench: 828.32 ns/iter (+/- 27.00)
test iter_foldhash_random ... bench: 768.69 ns/iter (+/- 55.53)
test iter_foldhash_serial ... bench: 766.07 ns/iter (+/- 9.61)
test iter_std_highbits ... bench: 765.31 ns/iter (+/- 18.42)
test iter_std_random ... bench: 765.99 ns/iter (+/- 45.18)
test iter_std_serial ... bench: 763.18 ns/iter (+/- 15.77)
test lookup_fail_foldhash_highbits ... bench: 1,936.33 ns/iter (+/- 47.30)
test lookup_fail_foldhash_random ... bench: 2,425.87 ns/iter (+/- 99.09)
test lookup_fail_foldhash_serial ... bench: 2,127.12 ns/iter (+/- 67.47)
test lookup_fail_std_highbits ... bench: 8,581.58 ns/iter (+/- 279.54)
test lookup_fail_std_random ... bench: 8,333.45 ns/iter (+/- 114.58)
test lookup_fail_std_serial ... bench: 8,097.86 ns/iter (+/- 185.79)
test lookup_foldhash_highbits ... bench: 2,172.82 ns/iter (+/- 43.31)
test lookup_foldhash_random ... bench: 2,370.67 ns/iter (+/- 43.23)
test lookup_foldhash_serial ... bench: 2,066.38 ns/iter (+/- 113.00)
test lookup_std_highbits ... bench: 8,777.33 ns/iter (+/- 266.92)
test lookup_std_random ... bench: 8,688.57 ns/iter (+/- 277.06)
test lookup_std_serial ... bench: 8,676.47 ns/iter (+/- 196.41)
test rehash_in_place ... bench: 191,902.86 ns/iter (+/- 3,842.01)
test result: ok. 0 passed; 0 failed; 0 ignored; 41 measured; 0 filtered out; finished in 50.84s
Running benches/insert_unique_unchecked.rs (target/release/deps/insert_unique_unchecked-70b56273fe907210)
running 2 tests
test insert ... bench: 4,811.63 ns/iter (+/- 36.77)
test insert_unique_unchecked ... bench: 3,464.63 ns/iter (+/- 109.34)
test result: ok. 0 passed; 0 failed; 0 ignored; 2 measured; 0 filtered out; finished in 1.11s
Running benches/set_ops.rs (target/release/deps/set_ops-41e9dbeede3fedd5)
running 10 tests
test set_ops_bit_and ... bench: 8,704.29 ns/iter (+/- 204.07)
test set_ops_bit_and_assign ... bench: 6,075.31 ns/iter (+/- 137.78)
test set_ops_bit_or ... bench: 68,336.41 ns/iter (+/- 2,494.32)
test set_ops_bit_or_assign ... bench: 53,319.49 ns/iter (+/- 1,172.32)
test set_ops_bit_xor ... bench: 73,222.35 ns/iter (+/- 3,848.50)
test set_ops_bit_xor_assign ... bench: 55,755.47 ns/iter (+/- 2,032.17)
test set_ops_sub_assign_large_small ... bench: 55,242.62 ns/iter (+/- 1,864.93)
test set_ops_sub_assign_small_large ... bench: 6,130.91 ns/iter (+/- 130.80)
test set_ops_sub_large_small ... bench: 73,938.60 ns/iter (+/- 2,211.95)
test set_ops_sub_small_large ... bench: 626.02 ns/iter (+/- 19.25)
Highlights:
- test insert ... bench: 9,978.20 ns/iter (+/- 218.31)
+ test insert ... bench: 4,811.63 ns/iter (+/- 36.77)
- test set_ops_bit_and ... bench: 10,482.35 ns/iter (+/- 387.65)
+ test set_ops_bit_and ... bench: 8,704.29 ns/iter (+/- 204.07)
- test set_ops_sub_large_small ... bench: 87,790.68 ns/iter (+/- 2,284.42)
+ test set_ops_sub_large_small ... bench: 73,938.60 ns/iter (+/- 2,211.95)
Previously, each group would be matched twice(first by find_insert_slot_in_group, which finds empty or deleted slot in the group, and then again by match_empty, which finds empty slot in the group) until a deleted slot was found in the table. The second match is clearly redundant.
By removing the redundant check, we significantly improve the insertion speed and query speed in a newly created HashMap/HashSet.