objecthash
objecthash copied to clipboard
Option for array order independent hashing
I am currently looking for object hashing strategies to quickly tell by comparison of the hashsum of two objects o1
and o2
whether they have identical contents. One important requirement is that the order of items of an array shouldn't matter since a particular order can't be guaranteed or isn't important. So when hashing o1
and o2
where o1
and o2
are defined as
var o1 = ["foo", {"bar": ["baz", null, 1.0, 1.5, 0.0001, 1000.0, 2.0, -23.1234, 2.0]}]
var o2 = ["foo", {"bar": [null, "baz", 1.0, 1.5, 0.0001, 1000.0, 2.0, -23.1234, 2.0]}]
then the hash for o1
and o2
should be equal even though the item oder of bar
differs. The hashsum should remain to be distinct from o3
with
var o3 = [{"bar": ["baz", null, 1.0, 1.5, 0.0001, 1000.0, 2.0, -23.1234, 2.0]}]
which misses the item "foo"
.
Use cases I have in mind are change detection and caching strategies. E.g. clients with a REST-API could decide upon comparing a hashsum against an E-Tag HTTP-Header whether to update a client-side cache.
Array order independence could be achieved by sorting item hashes in the array prior to calculating the hashsum over the items (before updating the digester, respectively) - similar to how it is already implemented for hashObject()
. By sorting the hashes of array items rather than sorting raw values we can sort lexicographically and gain reproducable hash sums independent of whether the raw item values are sortable.
I've implemented a demo and test cases for objecthash-js by @emschwartz (diff). It adds a new option to tweak the algorithm.