256-bit variant MurMurMurMur
Will there ever be a 256-bit version of MurMur for avoiding hash collisions in larger datasets?
I guess it would then not as much deviate from SHA-256, according to performance and distribution quality (note I said SHA, not SHA3). Anyway the performance comparison of both looks like:
| Method | Calculate over 32 bytes | Calculate over 1KB | ||
|---|---|---|---|---|
| Performance | Rate | Performance | Rate | |
| MM3-128 | 0.236128 µs/# | 4234986 #/sec | 0.462609 µs/# | 2161655 #/sec |
| SHA-256 | 0.487272 µs/# | 2052241 #/sec | 3.917825 µs/# | 255243 #/sec |
Why I think so - because too many nested shifts/mixes expected by interim calculations on (non-native) 256-bit "integers". This is comparable to use MM64-128 compiled for 32-bit systems (due to non-native or missing 128-bit int's there) - just try to compare by yourself, you'd see similar performance "degradation".
Note that MM3 is good cascade-able (if we're talking about not crypto-strong hashing, to avoid collisions only)... thus combining N hashes you'll be able to build hash with sizes 256, 384, 512, etc bits. So you can simply try to use 2-blocks MM3-128 for hashing one by one the blocks of large strings each 32 bytes (128 bits): so calculate first 32-bytes block of string into one hash, second into another, 3rd again into first hash, etc, then you'd reach almost the same performance (comparable with 0.46µs for 1KB like above), and your both 128-bits hashes representing one 256-bits hash.