
Bug in hash_point_pair?

Open matejaputic opened this issue 2 years ago • 3 comments

Can you explain how this is not a bug?

https://github.com/notexactlyawe/abracadabra/blob/ac8e080ae2ba4c582eb5842139ab7e5082b4cff0/abracadabra/fingerprint.py#L67

It seems this function will always return zero for time delta.

Shouldn't it be return hash((p1[0], p2[0], p2[1]-p1[1]))?
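To make the difference concrete, here is a minimal sketch. It assumes each point is a `(frequency, time)` tuple; that layout, the sample values, and the `_buggy`/`_fixed` function names are my own assumptions for illustration. The first function uses the expression from the linked line, the second uses the suggested one.

```python
def hash_point_pair_buggy(p1, p2):
    # Expression as in the linked line: p2[1] - p2[1] is always 0,
    # so the time delta never contributes to the hash.
    return hash((p1[0], p2[0], p2[1] - p2[1]))


def hash_point_pair_fixed(p1, p2):
    # Suggested expression: time delta between the two points.
    return hash((p1[0], p2[0], p2[1] - p1[1]))


# Hypothetical points, assumed to be (frequency, time) tuples.
anchor = (440, 10)
near_target = (880, 15)
far_target = (880, 30)

# Same frequencies, different time gaps: the buggy version cannot tell them apart.
buggy_collides = (
    hash_point_pair_buggy(anchor, near_target)
    == hash_point_pair_buggy(anchor, far_target)
)
fixed_collides = (
    hash_point_pair_fixed(anchor, near_target)
    == hash_point_pair_fixed(anchor, far_target)
)
print(buggy_collides, fixed_collides)
```

With the original expression, any two pairs that share the same two frequencies hash identically regardless of how far apart they are in time.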

matejaputic avatar Sep 09 '22 19:09 matejaputic

@matejaputic Have you experimented with this code? For me the base implementation works great for exact matches (recognising x when x was previously added via register) and for fragments of audio extracted from x. It doesn't work for very similar fragments or recordings (when the exact recording wasn't added via register). Changing the hashing to your suggestion doesn't really help, though. It boosts scores for exact matches but stops working well for fragments of registered audio, and it is still poor at finding similar audio.

niemiaszek avatar Dec 02 '22 19:12 niemiaszek

I just stumbled over that line too. I don't know what it should be, but p2[1]-p2[1] doesn't look right.

puhoy avatar Dec 17 '23 15:12 puhoy

> @matejaputic Have you experimented with this code? For me the base implementation works great for exact matches (recognising x when x was previously added via register) and for fragments of audio extracted from x. It doesn't work for very similar fragments or recordings (when the exact recording wasn't added via register). Changing the hashing to your suggestion doesn't really help, though. It boosts scores for exact matches but stops working well for fragments of registered audio, and it is still poor at finding similar audio.

The suggested change is correct and this is a typo. Like you said, I still didn't find it was particularly good at recognising anything other than directly cutting parts of the original song. I played around with the find_peaks() to use the technique from https://github.com/worldveil/dejavu/blob/master/dejavu/logic/fingerprint.py#L55C4-L55C4 and it's working much better now. The technique the dejavu folk are using for generating the peaks seems to be much more reliable than the naive maximum_filter() that is currently being used here.
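For anyone who wants to experiment with the idea, here is a rough, dependency-free sketch of neighbourhood-based peak picking with an amplitude floor. It is not dejavu's code and not the code in this repo: the function name `find_peaks_sketch`, the `radius` and `amp_min` parameters, and the toy spectrogram are all made up for illustration, and a real implementation would more likely use `scipy.ndimage` on a NumPy array.

```python
def find_peaks_sketch(spectrogram, radius=1, amp_min=10.0):
    """Return (freq_idx, time_idx) cells that are the maximum of their
    square neighbourhood and exceed amp_min. Hypothetical helper."""
    n_freq = len(spectrogram)
    n_time = len(spectrogram[0]) if n_freq else 0
    peaks = []
    for f in range(n_freq):
        for t in range(n_time):
            value = spectrogram[f][t]
            # Discard quiet cells before the (more expensive) neighbourhood check.
            if value <= amp_min:
                continue
            neighbourhood = [
                spectrogram[i][j]
                for i in range(max(0, f - radius), min(n_freq, f + radius + 1))
                for j in range(max(0, t - radius), min(n_time, t + radius + 1))
            ]
            if value >= max(neighbourhood):
                peaks.append((f, t))
    return peaks


# Toy spectrogram (rows = frequency bins, columns = time frames), made up.
spec = [
    [0, 1, 0, 0],
    [1, 50, 2, 0],
    [0, 3, 0, 12],
    [0, 0, 4, 0],
]
peaks = find_peaks_sketch(spec, radius=1, amp_min=10)
print(peaks)  # [(1, 1), (2, 3)]
```

Raising `amp_min` drops the weaker peak, which is the kind of knob worth tuning when fragments match but similar recordings do not.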

drewsilcock avatar Jan 15 '24 17:01 drewsilcock