minify icon indicating copy to clipboard operation
minify copied to clipboard

Improve perfect hashing

Open tdewolff opened this issue 4 years ago • 0 comments

From https://github.com/tdewolff/parse/issues/62

Quickly map []byte to numbers from 0 to N, for a pre-defined set of N strings. Make []byte => Hash fast. Move the Hasher code into this repository.

Hash the []byte, check if it is lower than N, if so verify that it is the same string using bytes.Equal. The latter could be skipped if we are sure there are no collisions below a certain length of input bytes (can we optimize the search for hash seeds for this?).

Promising hash functions are https://github.com/dgryski/go-metro, https://github.com/dgryski/go-farm, and https://github.com/cespare/xxhash. Especially farm seems to be fast for short strings (len < 50).

See http://aras-p.info/blog/2016/08/09/More-Hash-Function-Tests/ for comparisons.

tdewolff avatar May 11 '21 02:05 tdewolff