haxmap
haxmap copied to clipboard
now we can use xxh3 to speed up
https://github.com/zeebo/xxh3
// your custom hash function func customStringHasher(s string) uintptr { return uintptr(xxh3.HashString(s)) }
@welldonexing thanks for the update, xxh3 is indeed faster and I am also planning to make a switch
on another note how does your xxh3 library compare with Cyan4973/xxHash performance wise for xxh3 algo only ?
yes ,by my simple test , it 's almost the same to old one ,but that 's almost the only one xxh3 writen in go .
const sizeOfUintPtr = unsafe.Sizeof(uintptr(0))
func uintptrToBytes(u *uintptr) []byte {
return (*[sizeOfUintPtr]byte)(unsafe.Pointer(u))[:]
}
func customStringHasher(s uintptr) uintptr {
return uintptr(xxh3.Hash(uintptrToBytes(&s)))
}
func setupHaxMapWithXX3() *haxmap.Map[uintptr, uintptr] {
m := haxmap.New[uintptr, uintptr](mapSize)
m.SetHasher(customStringHasher)
for i := uintptr(0); i < epochs; i++ {
m.Set(i, i)
}
return m
}
the question is https://github.com/zeebo/xxh3/blob/master/hash64.go use the str type func hashAny(s str) (acc u64)
maybe faster when directly using pointer.or []byte
nice, I will change to xxh3 hashing and publish a new release possibly this weekend
PS:- I think xxh3 will give a considerable speed boost for long string key types but for int, uint, float etc there shouldn't be any difference
xxhash3 outperforms xxhash in string hashing, according to benchmark results on my machine.
goos: darwin
goarch: arm64
pkg: github.com/alphadose/haxmap/benchmarks
BenchmarkXXHASH_sting16bytes
BenchmarkXXHASH_sting16bytes-10 100000 6.527 ns/op
BenchmarkXXHASH_sting128bytes
BenchmarkXXHASH_sting128bytes-10 100000 18.11 ns/op
BenchmarkXXHASH_sting1024bytes
BenchmarkXXHASH_sting1024bytes-10 100000 109.9 ns/op
BenchmarkXXHASH_sting8192bytes
BenchmarkXXHASH_sting8192bytes-10 100000 709.1 ns/op
BenchmarkXXHASH3_string16_bytes
BenchmarkXXHASH3_string16_bytes-10 100000 3.285 ns/op
BenchmarkXXHASH3_string128_bytes
BenchmarkXXHASH3_string128_bytes-10 100000 12.12 ns/op
BenchmarkXXHASH3_string1024_bytes
BenchmarkXXHASH3_string1024_bytes-10 100000 107.6 ns/op
BenchmarkXXHASH3_string8192_bytes
BenchmarkXXHASH3_string8192_bytes-10 100000 588.2 ns/op
PASS
ok github.com/alphadose/haxmap/benchmarks 25.658s
why don't you want to use the FNV family of algorithms? There is an implementation in the standard library. FNV64 is 4-5 times faster than xxhash.
prime := 17861029589083791777
h := 14695981039346656037
for _, b := range bytes {
h ^= uint64(b)
h *= prime
}
return h
And that's it
@mvestnik can you run a benchmark test with the FNV hash algorithm similar to this one https://github.com/alphadose/haxmap/blob/main/benchmarks/map_test.go ?
You can use haxmap.SetHasher()
function to override the default xxhash algo with your FNV algo
also if possible randomize the input dataset so that there is less bias in the final benchmark results
I wrote a simple benchmark. You can check the results:
package experiment
import (
"testing"
"github.com/cespare/xxhash/v2"
)
const (
prime uint64 = 17861029589083791777
offsetBasis uint64 = 14695981039346656037
)
func fnv64(bytes []byte) uint64 {
h := offsetBasis
for _, b := range bytes {
h ^= uint64(b)
h *= prime
}
return h
}
func xxh64(bytes []byte) uint64 {
return xxhash.Sum64(bytes)
}
func BenchmarkFNV64(b *testing.B) {
for i := 0; i < b.N; i++ {
fnv64([]byte{102, 105, 120, 116, 117, 114, 101})
}
}
func BenchmarkXXH64(b *testing.B) {
for i := 0; i < b.N; i++ {
xxh64([]byte{102, 105, 120, 116, 117, 114, 101})
}
}
I guess xxh3
performs in all tested scenarios of direct hashing and when using the digest interface. (5 bytes, 100 bytes, 4KB, 10MB). it provies higher throughput than xxhash
.
Also, xxh3
is generally faster than FNV-1a
, especially as the data size increases. (in my pc env)
goos: windows
goarch: amd64
cpu: Intel(R) Core(TM) i5-10600KF CPU @ 4.10GHz
BenchmarkHashes/xxhash,direct,bytes,n=5B-12 236713702 5.118 ns/op 976.90 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/xxhash,direct,string,n=5B-12 200201168 5.836 ns/op 856.78 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/xxhash,digest,bytes,n=5B-12 82036696 15.08 ns/op 331.52 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/xxhash,digest,string,n=5B-12 72117117 18.36 ns/op 272.27 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/xxhash,direct,bytes,n=100B-12 85273906 13.88 ns/op 7202.97 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/xxhash,direct,string,n=100B-12 76765117 17.48 ns/op 5721.49 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/xxhash,digest,bytes,n=100B-12 58091688 20.21 ns/op 4949.07 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/xxhash,digest,string,n=100B-12 55638498 20.50 ns/op 4876.92 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/xxhash,direct,bytes,n=4KB-12 5077185 235.3 ns/op 17002.58 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/xxhash,direct,string,n=4KB-12 5169603 234.8 ns/op 17037.33 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/xxhash,digest,bytes,n=4KB-12 5079498 242.7 ns/op 16482.49 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/xxhash,digest,string,n=4KB-12 4893588 243.6 ns/op 16419.50 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/xxhash,direct,bytes,n=10MB-12 1532 753191 ns/op 13276.84 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/xxhash,direct,string,n=10MB-12 1604 740705 ns/op 13500.65 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/xxhash,digest,bytes,n=10MB-12 1546 760969 ns/op 13141.15 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/xxhash,digest,string,n=10MB-12 1522 753009 ns/op 13280.05 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/xxh3,direct,bytes,n=5B-12 266165270 4.479 ns/op 1116.33 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/xxh3,direct,string,n=5B-12 277091206 4.251 ns/op 1176.26 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/xxh3,digest,bytes,n=5B-12 38009204 32.61 ns/op 153.32 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/xxh3,digest,string,n=5B-12 37305080 32.37 ns/op 154.48 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/xxh3,direct,bytes,n=100B-12 120515226 9.917 ns/op 10083.44 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/xxh3,direct,string,n=100B-12 124008230 9.624 ns/op 10390.52 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/xxh3,digest,bytes,n=100B-12 32545372 37.36 ns/op 2676.88 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/xxh3,digest,string,n=100B-12 31345341 45.12 ns/op 2216.53 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/xxh3,direct,bytes,n=4KB-12 14153563 80.95 ns/op 49411.66 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/xxh3,direct,string,n=4KB-12 13895498 104.2 ns/op 38392.30 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/xxh3,digest,bytes,n=4KB-12 9207777 131.1 ns/op 30504.39 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/xxh3,digest,string,n=4KB-12 8794700 126.3 ns/op 31662.59 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/xxh3,direct,bytes,n=10MB-12 3490 335840 ns/op 29776.06 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/xxh3,direct,string,n=10MB-12 3570 348551 ns/op 28690.17 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/xxh3,digest,bytes,n=10MB-12 2556 471610 ns/op 21203.94 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/xxh3,digest,string,n=10MB-12 2581 458128 ns/op 21827.96 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/CRC-32,direct,bytes,n=5B-12 150739432 7.740 ns/op 645.97 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/CRC-32,direct,string,n=5B-12 44208664 25.49 ns/op 196.19 MB/s 8 B/op 1 allocs/op
BenchmarkHashes/CRC-32,digest,bytes,n=5B-12 29507592 40.41 ns/op 123.74 MB/s 16 B/op 1 allocs/op
BenchmarkHashes/CRC-32,digest,string,n=5B-12 18258642 67.34 ns/op 74.26 MB/s 24 B/op 2 allocs/op
BenchmarkHashes/CRC-32,direct,bytes,n=100B-12 59111844 20.27 ns/op 4932.49 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/CRC-32,direct,string,n=100B-12 16029044 67.97 ns/op 1471.18 MB/s 112 B/op 1 allocs/op
BenchmarkHashes/CRC-32,digest,bytes,n=100B-12 20980450 57.49 ns/op 1739.40 MB/s 16 B/op 1 allocs/op
BenchmarkHashes/CRC-32,digest,string,n=100B-12 12361206 99.31 ns/op 1006.95 MB/s 128 B/op 2 allocs/op
BenchmarkHashes/CRC-32,direct,bytes,n=4KB-12 7089018 163.9 ns/op 24406.59 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/CRC-32,direct,string,n=4KB-12 1000000 1074 ns/op 3725.40 MB/s 4096 B/op 1 allocs/op
BenchmarkHashes/CRC-32,digest,bytes,n=4KB-12 5815550 206.0 ns/op 19421.88 MB/s 16 B/op 1 allocs/op
BenchmarkHashes/CRC-32,digest,string,n=4KB-12 1000000 1081 ns/op 3699.84 MB/s 4112 B/op 2 allocs/op
BenchmarkHashes/CRC-32,direct,bytes,n=10MB-12 1963 593497 ns/op 16849.29 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/CRC-32,direct,string,n=10MB-12 670 1644013 ns/op 6082.68 MB/s 10002443 B/op 1 allocs/op
BenchmarkHashes/CRC-32,digest,bytes,n=10MB-12 1858 609605 ns/op 16404.08 MB/s 16 B/op 1 allocs/op
BenchmarkHashes/CRC-32,digest,string,n=10MB-12 620 1780932 ns/op 5615.04 MB/s 10002453 B/op 2 allocs/op
BenchmarkHashes/FNV-1a,digest,bytes,n=5B-12 288622639 4.059 ns/op 1231.97 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/FNV-1a,digest,string,n=5B-12 264271659 4.472 ns/op 1118.00 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/FNV-1a,digest,bytes,n=100B-12 14979963 78.85 ns/op 1268.23 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/FNV-1a,digest,string,n=100B-12 15046964 79.35 ns/op 1260.31 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/FNV-1a,digest,bytes,n=4KB-12 328664 3575 ns/op 1118.82 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/FNV-1a,digest,string,n=4KB-12 334381 3553 ns/op 1125.76 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/FNV-1a,digest,bytes,n=10MB-12 129 9070709 ns/op 1102.45 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/FNV-1a,digest,string,n=10MB-12 130 9109688 ns/op 1097.73 MB/s 0 B/op 0 allocs/op
-- 1GB test
BenchmarkHashes/xxhash,direct,bytes,n=1GB-12 1 6710021500 ns/op 1490.31 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/xxhash,direct,string,n=1GB-12 1 19206542400 ns/op 520.66 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/xxhash,digest,bytes,n=1GB-12 1 2771892900 ns/op 3607.64 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/xxhash,digest,string,n=1GB-12 1 1085735900 ns/op 9210.34 MB/s 8 B/op 1 allocs/op
BenchmarkHashes/xxh3,direct,bytes,n=1GB-12 1 4397203700 ns/op 2274.17 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/xxh3,direct,string,n=1GB-12 1 3627537900 ns/op 2756.69 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/xxh3,digest,bytes,n=1GB-12 1 1581577400 ns/op 6322.80 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/xxh3,digest,string,n=1GB-12 2 1018246600 ns/op 9820.80 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/FNV-1a,digest,bytes,n=1GB-12 1 9578540900 ns/op 1044.00 MB/s 0 B/op 0 allocs/op
BenchmarkHashes/FNV-1a,digest,string,n=1GB-12 1 9733404100 ns/op 1027.39 MB/s 0 B/op 0 allocs/op
Could guys also benchmark wyhash? https://github.com/zeebo/wyhash readme says it a little faster(?) than https://github.com/zeebo/xxh3
For ref: https://bloomberg.github.io/bde/articles/wyhash.html