orleans
orleans copied to clipboard
Optimize hashing
Remove all allocations from deterministic hashing and use more pre-computed hash values.
Microsoft Reviewers: Open in CodeFlow
I'd like to wait until our distributed testing infrastructure is back online before merging. In the meantime, I've left some comments. Thanks, as always
I did some benchmarking with various data lengths (in bytes). JenkinsV2 is the version in this PR. Overall it looks like XxHash32 would be the best option (and it has better hashing quality).
BenchmarkDotNet=v0.13.1, OS=Windows 10.0.19044.1862 (21H2)
Intel Core i7-7700K CPU 4.20GHz (Kaby Lake), 1 CPU, 8 logical and 4 physical cores
.NET SDK=7.0.100-preview.6.22352.1
[Host] : .NET 7.0.0 (7.0.22.32404), X64 RyuJIT
DefaultJob : .NET 7.0.0 (7.0.22.32404), X64 RyuJIT
| Method | Length | Mean | Error | StdDev | Ratio |
|---|---|---|---|---|---|
| Jenkins | 3 | 23.695 ns | 0.2038 ns | 0.1906 ns | 1.00 |
| JenkinsV2 | 3 | 4.674 ns | 0.0292 ns | 0.0244 ns | 0.20 |
| XxHash32 | 3 | 6.804 ns | 0.0140 ns | 0.0131 ns | 0.29 |
| XxHash64 | 3 | 8.422 ns | 0.0302 ns | 0.0252 ns | 0.36 |
| Sha256 | 3 | 439.652 ns | 2.0611 ns | 1.7211 ns | 18.57 |
| Jenkins | 24 | 66.817 ns | 0.0416 ns | 0.0389 ns | 1.00 |
| JenkinsV2 | 24 | 13.317 ns | 0.0043 ns | 0.0036 ns | 0.20 |
| XxHash32 | 24 | 10.965 ns | 0.0134 ns | 0.0112 ns | 0.16 |
| XxHash64 | 24 | 10.157 ns | 0.0070 ns | 0.0066 ns | 0.15 |
| Sha256 | 24 | 437.698 ns | 3.3567 ns | 2.9757 ns | 6.55 |
| Jenkins | 100 | 203.097 ns | 0.1447 ns | 0.1283 ns | 1.00 |
| JenkinsV2 | 100 | 42.779 ns | 0.1976 ns | 0.1848 ns | 0.21 |
| XxHash32 | 100 | 24.373 ns | 0.0102 ns | 0.0091 ns | 0.12 |
| XxHash64 | 100 | 20.962 ns | 0.0245 ns | 0.0229 ns | 0.10 |
| Sha256 | 100 | 695.510 ns | 4.3684 ns | 4.0862 ns | 3.42 |
| Jenkins | 1024 | 1,941.947 ns | 1.2229 ns | 1.1439 ns | 1.00 |
| JenkinsV2 | 1024 | 414.290 ns | 0.3464 ns | 0.2705 ns | 0.21 |
| XxHash32 | 1024 | 196.953 ns | 0.1136 ns | 0.0949 ns | 0.10 |
| XxHash64 | 1024 | 123.778 ns | 0.1209 ns | 0.1131 ns | 0.06 |
| Sha256 | 1024 | 4,192.216 ns | 13.2682 ns | 12.4111 ns | 2.16 |
I agree, the hashing quality is a significant plus. I was looking for a reference I had from last time I looked into this. I think this is it: https://github.com/rurban/smhasher xxHash64 has no known quality issues (xxHash32 does) and Jenkins is known-bad.
xxHash64 is a bit faster for inputs as small as 24, but I'm not sure if the quality stays the same if we truncate the result from 64-bits to 32-bits?
I think it's fine to use a mix of xxHash32 and xxHash64 depending on the case. We can use xxHash32 for the 32bit cases (GetConsistentHashCode, etc)