onnx-mlir icon indicating copy to clipboard operation
onnx-mlir copied to clipboard

Improve PerfectHash code to avoid infinite loops.

Open etiotto opened this issue 3 years ago • 1 comments

etiotto avatar Nov 17 '21 20:11 etiotto

See this issue for a few suggestions on how to avoid the infinite loop.

https://github.com/onnx/onnx-mlir/issues/941#issuecomment-954221677

I came to realize that V and G can have different sizes: grow G to have a bigger set of hash values that then map to the V array Or grow V with the same sized G to have more entries and thus more possibilities.

This is exemplified by the fact that the modulo in the evaluation function are separate for the Gs and the Vs, see below.

# Look up a value in the hash table, defined by G and V.
def PerfectHashLookup( G, V, key ):
    d = G[hash(0,key) % len(G)]
    if d < 0: return V[-d-1]
    return V[hash(d, key) % len(V)]

Personally, I prefer the growing of G as this is only an array of numbers, whereas V and the original keys are possibly larger data types (including int64 and strings).

AlexandreEichenberger avatar Nov 17 '21 21:11 AlexandreEichenberger