digest-murmurhash icon indicating copy to clipboard operation
digest-murmurhash copied to clipboard

Support for seed > 4 characters

Open maksimf opened this issue 8 years ago • 3 comments

Now if I try to pass seed longer than 4 characters I get an ArgumentError:

Digest::MurmurHash3_x86_32.rawdigest('foo', '1234567') 
# => ArgumentError: seed string should be 4 length

Moreover, I think that passing Fixnum as a seed should be supported as well:

Digest::MurmurHash3_x86_32.rawdigest('qwe', 1234)
# => TypeError: no implicit conversion of Fixnum into String

maksimf avatar Dec 30 '16 11:12 maksimf

uint_32_t is used as a type for seed in the C extension: https://github.com/ksss/digest-murmurhash/blob/master/ext/digest/murmurhash/3_x86_32.c#L8

maksimf avatar Dec 30 '16 11:12 maksimf

Thank you for using this gem :)

I think it is useful when seed value have itself fit uint32_t or uint64_t Let me think about in this case

Digest::MurmurHash3_x86_32.rawdigest('qwe', 12345678901234567890)
#=> cast to uint32_t? or raise error?
  • Cast: I think it's too implicit. Which is endian?
  • Error: I think it's not usefull

So, I don't have good ideas


I recommend like this if you want to use a number.

Digest::MurmurHash3_x86_32.rawdigest('qwe', [1234].pack("L"))
=> 3180601568

ksss avatar Dec 30 '16 13:12 ksss

I'm not sure which way is better either, but I believe it's not 100% obvious the way it works now, just throwing exception when seed > 4. Maybe you can get some ideas from another murmur implementation, like this? https://github.com/funny-falcon/murmurhash3-ruby

Anyway, thank you for showing the way to use it with pack, I'll use it now

maksimf avatar Dec 30 '16 14:12 maksimf