containers
containers copied to clipboard
better instance Hashable IntSet?
cross reference to https://github.com/haskell-unordered-containers/hashable/issues/269
for a given set of elements, the IntSet representation is unique? E.g., "the mask is the largest position ..." https://github.com/haskell/containers/blob/master/containers/src/Data/IntSet/Internal.hs#L261
Then a hash function could just use structure and contents of the internal representation generically?
@jwaldmann Correct.
Correct, but it would be wasteful. The Tips contain all the elements, it would be sufficient to just hash the contents of those (ignoring Bins).
Correct, but it would be wasteful. The
Tips contain all the elements, it would be sufficient to just hash the contents of those (ignoringBins).
Oh yeah, I think there is redundant info there. I'd have to review the representation; it's been a little while.
Ah yes, the Tip includes the prefix.
We could offer
serialize :: (Word -> r -> r) -> r -> IntSet -> r
This could be used to hash efficiently, with serialize written so it can inline. What would be a good matching deserialize type though?
- serialize: could we write
instance Ord IntSetwith that? (#787 #470) - deserialize: don't know. what would be the use case?
I doubt that's quite what we'd want for Ord, but maybe close. The use case is that I'd be more comfortable with "publicly" exporting serialize/deserialize than a function that's only good for writing the Hashable instance.
I think we should not mix up serialization and hashing.
Serialization typically requires the size before the elements, which means we must make two traversals for IntMap/IntSet.
Hashing also wants the size for a good hash, but it is fine to add it at the end, which means we can be more efficient and calculate the hash and size in one traversal.