kphp
kphp copied to clipboard
[compiler] use perfect hashes for switches with small strings
Previously, we compiled a string switch as follow:
1. compute a tag (condition) string hash
2. switch over precomputed hashes of cases
3. inside a switch case, do `equals()` for a string
So, we compile a switch as a 2-level comparison: hash matching
plus equals()
call on a string.
For shorter strings we can avoid the third step by using a hash that will not result in collisions and identifies a string completely.
Since we're using uint64_t for switch hashes, we have 8 bits for information.
We can store up to 8 chars inside that without any losses.
Our hash function will be more or less memcpy(&h, s, s_len)
.
Hash of a single char becomes a value in 0 <= x <= 255
range, which
is great for C++ switch density: it's likely to get that switch compiled
as a jump table (useful for lexers, etc).
Another common case is a shared prefix of all case values.
case 'OPERATION_DELETE'
case 'OPERATION_INSERT'
case 'OPERATION_UPDATE'
All cases above have a length of 16 that is too much for our new hashing.
But if we remove the common OPERATION_
prefix, strings fit into 6 bytes.
When we have such switch with common prefixes and varying suffixes that
fit into 8 bytes, we compare the prefix before hashing, if it does match,
variable cases tail is hashed (DELETE
, INSERT
, UPDATE
from above).
Since optimized form doesn't require extra if statement inside every case, the binary size gets smaller as well (but not dramatically).
In average, we compile 60% of string switches using the optimized scheme. (~50% for short strings plus around ~10% for prefix form)
Benchmark results:
name old time/op new time/op delta
SwitchOnlyDefault 70.0ns ± 0% 29.0ns ± 0% -58.57%
Lexer 119ns ± 1% 78ns ± 2% -35.04%
LexerMiss 87.0ns ± 0% 77.0ns ± 0% -11.49%
LexerComplex 115ns ± 1% 93ns ± 2% -19.43%
LexerComplexMiss 87.0ns ± 0% 74.6ns ± 1% -14.25%
Switch8oneChar 158ns ± 0% 95ns ± 1% -39.81%
Switch8oneCharMiss 108ns ± 1% 102ns ± 1% -5.48%
Switch8oneCharNoDefault 164ns ± 1% 87ns ± 1% -46.67%
Switch8oneCharNoDefaultMiss 117ns ± 0% 97ns ± 1% -17.35%
Switch6perfectHash 258ns ± 1% 204ns ± 1% -21.19%
Switch6perfectHashMiss 130ns ± 0% 119ns ± 1% -8.21%
Switch6perfectHashNoDefault 271ns ± 1% 193ns ± 1% -28.72%
Switch6perfectHashNoDefaultMiss 138ns ± 1% 115ns ± 1% -16.58%
Switch12perfectHash 278ns ± 0% 194ns ± 4% -30.19%
Switch12perfectHashMiss 139ns ± 1% 109ns ± 1% -21.62%
Switch12perfectHashNoDefault 275ns ± 1% 190ns ± 0% -30.74%
Switch12perfectHashNoDefaultMiss 142ns ± 1% 108ns ± 0% -24.05%
Switch6perfectHashWithPrefix 286ns ± 0% 219ns ± 3% -23.40%
Switch6perfectHashWithPrefixMiss 190ns ± 1% 143ns ± 2% -24.67%
Switch6perfectHashWithPrefixNoDefault 293ns ± 0% 230ns ± 1% -21.58%
Switch6perfectHashWithPrefixNoDefaultMiss 194ns ± 1% 153ns ± 2% -20.82%
Switch12perfectHashWithPrefix 303ns ± 1% 225ns ± 2% -25.69%
Switch12perfectHashWithPrefixMiss 159ns ± 0% 113ns ± 0% -28.93%
Switch12perfectHashWithPrefixNoDefault 308ns ± 1% 224ns ± 1% -27.18%
Switch12perfectHashWithPrefixNoDefaultMiss 141ns ± 1% 117ns ± 2% -17.14%
[Geo mean] 171ns 130ns -23.93%
Another change is that we now compile a default-only switch a little bit differently. We don't compute a hash and discard auto-generated variables to avoid C++ compiler warnings.
Refs #418 Fixes #503