deepdiff Add an ignore_order to DeepHash

It appears that DeepHash ignores ordering within the given object by default when it computes the combined hash.

from deepdiff import DeepHash

# 3 example objects
x = {'a': 0, 'b': [1, 2, 3]}  # a baseline example object
y = {'b': [1, 2, 3], 'a': 0}  # key order swapped
z = {'a': 0, 'b': [2, 1, 3]}  # swapped positions in list for key 'b'

# in all examples, the combined hash is the same
print(DeepHash(x)[x])  # '343d77f8a45dac16bc49a7be37c1ee73250ac4311e316862393f3c2552ff5b64'
print(DeepHash(y)[y])  # '343d77f8a45dac16bc49a7be37c1ee73250ac4311e316862393f3c2552ff5b64'
print(DeepHash(z)[z])  # '343d77f8a45dac16bc49a7be37c1ee73250ac4311e316862393f3c2552ff5b64'

It would be incredibly useful to respect order when computing hash signatures of complex data structures, something like:

DeepHash(x, ignore_order=False)[x] == DeepHash(z, ignore_order=False)[z]  # Returns False

Allowing dict keys as an exception would also be great to give more flexibility:

DeepHash(x, ignore_order=False, sort_dict=True)[x] == DeepHash(y, ignore_order=False, sort_dict=True)[y]  # Returns True
DeepHash(x, ignore_order=False, sort_dict=True)[x] == DeepHash(z, ignore_order=False, sort_dict=True)[z]  # Returns False
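To make the requested behavior concrete, here is a rough standard-library sketch of an order-sensitive hash. It is not DeepHash and does not use any deepdiff API: `ordered_hash` and its `sort_dict` flag are hypothetical names chosen for illustration, and the approach assumes the data is JSON-serializable.

```python
import hashlib
import json


def ordered_hash(obj, sort_dict=True):
    """Hypothetical helper (not part of deepdiff): SHA-256 of a JSON dump."""
    # List order is preserved in the JSON text, so reordering a list changes the hash.
    # sort_keys=True normalizes dict key order; sort_keys=False keeps insertion order.
    payload = json.dumps(obj, sort_keys=sort_dict, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


x = {'a': 0, 'b': [1, 2, 3]}  # baseline
y = {'b': [1, 2, 3], 'a': 0}  # key order swapped
z = {'a': 0, 'b': [2, 1, 3]}  # list order swapped

print(ordered_hash(x) == ordered_hash(y))  # True: dict keys are sorted
print(ordered_hash(x) == ordered_hash(z))  # False: list order matters
print(ordered_hash(x, sort_dict=False) == ordered_hash(y, sort_dict=False))  # False
```

This only covers plain JSON types (dicts with string keys, lists, numbers, strings, booleans, None), which is far narrower than what DeepHash handles, so it is a stopgap rather than a replacement.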

Aug 10 '22 17:08 Eric-Vignola

@Eric-Vignola Interesting idea. Currently DeepDiff uses DeepHash to identify identical objects before it starts digging into the ones that are not identical. Only then, inside DeepDiff, do we start ignoring order among these non-identical objects.

What you are asking for would also require rewriting how we serialize objects. That is a non-trivial amount of work.

Aug 14 '22 02:08 seperman

https://github.com/seperman/deepdiff/issues/373

This is my issue pointing out the same thing. I closed it thinking it was a silly question 😄

Feb 08 '23 08:02 Okroshiashvili
