imagehash icon indicating copy to clipboard operation
imagehash copied to clipboard

ImageMultiHash `__sub__` is not commutative

Open nh2 opened this issue 2 years ago • 3 comments

``subonImageMultiHash` on 2 different images is in most cases not commutative.

For example,

a - b = 4.25
b - a = 10.25

hash_diff() is communative. It is very confusing to me that from a commutative result (the (matches, sum_distance) tuple), a non-commutative distance metric is derived.

This creates nonsensical results when trying to find the most similar match for every image in a set of images (e.g. using best_match()), because the sorting by metric now depends on the order in which the images are given.

It seems to me that best_match() would be better implemented by simply sorting the (matches, sum_distance) results descending by matches, ascending by sum_distance, instead of going via this non-commutative __sub__.

Do you agree?

(Also, __sub__ on ImageHash is commutative.)

nh2 avatar Nov 25 '21 17:11 nh2

pinging @joshcoales who contributed this code and perhaps has an opinion.

JohannesBuchner avatar Nov 25 '21 20:11 JohannesBuchner

Apologies, yeah, I remember this being an absolute pain.

I agree that sorting matches descending, then sum_distance ascending would be best. I was trying to merge that down into a single int for comparison, but it's a bit of a nightmare, yeah.

SpangleLabs avatar Jan 10 '22 16:01 SpangleLabs

Perhaps you could prepare a PR (plus test of commutativity, perhaps for all hash types), and add comments with some demonstrations of the behaviour before and after.

JohannesBuchner avatar Jan 11 '22 03:01 JohannesBuchner