ohash icon indicating copy to clipboard operation
ohash copied to clipboard

mumurhash32 collision between `ufo` and `vue`

Open danielroe opened this issue 3 years ago • 4 comments

require('ohash').hash('ufo')
// '2247144487'
require('ohash').hash('vue')
// '2247144487'

related: https://github.com/nuxt/framework/pull/5398

danielroe avatar Jul 08 '22 14:07 danielroe

Another collision:

/node_modules/vue-toastification/dist/index.css // 944007393
/node_modules/vue-toastification/dist/index.mjs // 944007393

Ref: https://github.com/nuxt/nuxt.js/issues/14313

pi0 avatar Jul 14 '22 10:07 pi0

Using SHA2 in latest 0.1.x https://github.com/unjs/ohash/pull/12

pi0 avatar Jul 14 '22 12:07 pi0

Just a shot in the dark, but it seems collisions can be easily avoided with padEnd?

import { murmurHash } from 'ohash'

const fixedHash = (s) => murmurHash(s.padEnd(s.length + (4 - (s.length & 3)) & 3, '\0'))

fixedHash('vue') // => 2175094470
fixedHash('ufo') // => 3195201511

fixedHash('/node_modules/vue-toastification/dist/index.mjs')
// => 1385894279
fixedHash('/node_modules/vue-toastification/dist/index.css')
// => 3153126526

I'm not sure if this really fixes it or not, to be honest. :)

From this article:

No collisions are possible for 4-byte keys

So perhaps making sure that remainder is always zero fixes it? 😅 https://github.com/unjs/ohash/blob/58a0d67ce9f3ed4b33526ea4c254a24e13fe290e/src/crypto/murmur.ts#L18

My hypothesis is basically, "No collisions exist for modulus 4 length strings." Again, no idea if it's sound.

aleclarson avatar Sep 01 '22 04:09 aleclarson

Really nice point @aleclarson thanks for sharing article. Do you mind to open a PR directly adding fix to murmurHash function? (Also maybe few tests)

pi0 avatar Sep 01 '22 05:09 pi0