grain icon indicating copy to clipboard operation
grain copied to clipboard

feat(stdlib)!: Add seededHash

Open spotandjake opened this issue 1 year ago • 1 comments

This pr refactors the hashing function so we do not keep global state, it also implements a new Hash.makeSeeded and Hash.make function which returns a HashInstance. If we want a different api I am happy to change things around.

API:

let globalHash = Hash.make()
assert Hash.hash(globalhash, 1) == Hash.hash(globalHash, 1)
assert Hash.hash(globalhash, "a") == Hash.hash(globalHash, "a")
let seeded1 = Hash.makeSeeded(1)
assert Hash.hash(seeded1, 1) == Hash.hash(seeded1, 1)
assert Hash.hash(seeded1, "a") == Hash.hash(seeded1, "a")
let seeded2 = Hash.makeSeeded(2)
assert Hash.hash(seeded1, 1) != Hash.hash(seeded2, 1)

Questions:

  • Should we generate a seperate seed on Hash.make every time or do we want to use the globalSeed? Maybe if we want a globalSeed we should provide it a makeGlobal.
  • Should we use makeSeeded with a set seed on Map and Set so we can get rid of the possible exception.

My original implementation had the signature of Hash.make: (seed: Number) => (anything: a) => Number but this causes our inference engine to lock the hashing function a particular data type when you do something like:

let hash = Hash.make()
assert hash("H") == hash("H")
assert hash(1) == hash(1) // fails to typecheck because `a` is now `String` so you would need to make a separate instance for each dataType.

The alternative approach suggested was to use a submodule for seeded but as the api's would differ this did not seem ideal.

One benefit to this approach is because we call Random.random() on make we will not throw a possible exception on Hash.hash which means all of our Set and Map functions besides make wont throw.

Succeeds: #1952

This is breaking as we changed the type of Hash.hash from (a) => Number to (HashInstance, a) => Number.

spotandjake avatar Oct 04 '24 19:10 spotandjake

I applied that change.

Another small question on this pr, should we fix the seed used for Map and Set to keep the order consistent between runs on operations like toList this helps to preserve same input == same output semantics.

spotandjake avatar Oct 20 '24 23:10 spotandjake

Another small question on this pr, should we fix the seed used for Map and Set to keep the order consistent between runs on operations like toList this helps to preserve same input == same output semantics.

We don't guarantee order currently and we don't want to make a change that makes people think the order is guaranteed, unless we decide we want to guarantee order. Either way, that's not a decision for this PR.

ospencer avatar Oct 30 '24 16:10 ospencer