go-benchmarks
go-benchmarks copied to clipboard
hashing/fnv: avoid allocating
The previous implementation unnecessarily allocated on each invocation of the hash function. Using a single initialization removes that allocation and significantly speeds up the hash for small inputs.
name old time/op new time/op delta
FNV1/1-8 20.4ns ± 1% 8.6ns ± 2% -57.92% (p=0.008 n=5+5)
FNV1/2-8 21.1ns ± 6% 9.5ns ± 1% -54.96% (p=0.008 n=5+5)
FNV1/4-8 22.3ns ± 2% 10.3ns ± 3% -53.76% (p=0.008 n=5+5)
FNV1/8-8 25.9ns ± 7% 12.4ns ± 0% -52.20% (p=0.008 n=5+5)
FNV1/32-8 46.7ns ± 1% 33.7ns ± 1% -27.88% (p=0.008 n=5+5)
FNV1/64-8 80.1ns ± 2% 63.7ns ± 1% -20.43% (p=0.008 n=5+5)
FNV1/128-8 146ns ± 1% 129ns ± 1% -11.51% (p=0.008 n=5+5)
FNV1/256-8 282ns ± 3% 264ns ± 2% -6.38% (p=0.008 n=5+5)
FNV1/512-8 547ns ± 1% 527ns ± 0% -3.55% (p=0.008 n=5+5)
FNV1/1024-8 1.07µs ± 1% 1.06µs ± 1% -1.38% (p=0.016 n=5+5)
FNV1/4096-8 4.26µs ± 2% 4.23µs ± 1% ~ (p=0.206 n=5+5)
FNV1/8192-8 8.48µs ± 0% 8.44µs ± 0% -0.42% (p=0.016 n=5+5)
name old speed new speed delta
FNV1/1-8 49.1MB/s ± 1% 116.6MB/s ± 2% +137.66% (p=0.008 n=5+5)
FNV1/2-8 94.9MB/s ± 5% 210.7MB/s ± 1% +121.91% (p=0.008 n=5+5)
FNV1/4-8 179MB/s ± 2% 388MB/s ± 2% +116.62% (p=0.008 n=5+5)
FNV1/8-8 310MB/s ± 7% 647MB/s ± 1% +108.41% (p=0.008 n=5+5)
FNV1/32-8 685MB/s ± 1% 950MB/s ± 1% +38.65% (p=0.008 n=5+5)
FNV1/64-8 800MB/s ± 2% 1004MB/s ± 1% +25.58% (p=0.008 n=5+5)
FNV1/128-8 872MB/s ± 2% 987MB/s ± 1% +13.13% (p=0.008 n=5+5)
FNV1/256-8 906MB/s ± 4% 967MB/s ± 2% +6.74% (p=0.008 n=5+5)
FNV1/512-8 935MB/s ± 1% 970MB/s ± 1% +3.74% (p=0.008 n=5+5)
FNV1/1024-8 953MB/s ± 1% 967MB/s ± 1% +1.42% (p=0.008 n=5+5)
FNV1/4096-8 961MB/s ± 1% 968MB/s ± 1% ~ (p=0.222 n=5+5)
FNV1/8192-8 966MB/s ± 0% 970MB/s ± 0% +0.43% (p=0.016 n=5+5)
@kellabyte Also, if you avoid the use of the hash interface from the standard lib and use an implementation of FNV that has the appropriate signature FNV1a(b []byte) uint64
you get another ~40% speed boost for small inputs. I can submit that as well if interested. FNV is one of the faster hashes for very small inputs. Its main drawback is working a byte at a time, which slows it down considerably for large inputs.