abracadabra
Bug in hash_point_pair?
Can you explain how this is not a bug?
https://github.com/notexactlyawe/abracadabra/blob/ac8e080ae2ba4c582eb5842139ab7e5082b4cff0/abracadabra/fingerprint.py#L67
It seems this function will always use zero for the time delta.
Shouldn't it be return hash((p1[0], p2[0], p2[1]-p1[1]))?
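To make the question concrete, here is a minimal sketch of the two variants side by side. It assumes points are (frequency, time) tuples, which is how I read the code; the function names hash_point_pair_current and hash_point_pair_fixed and the sample points are made up for illustration and are not in the repo.

```python
def hash_point_pair_current(p1, p2):
    # Line as quoted in this thread: p2[1] - p2[1] is 0 for any input,
    # so the time delta never influences the hash.
    return hash((p1[0], p2[0], p2[1] - p2[1]))


def hash_point_pair_fixed(p1, p2):
    # Suggested change: time delta between the two points.
    return hash((p1[0], p2[0], p2[1] - p1[1]))


# Hypothetical points, assumed to be (frequency, time).
anchor = (100, 10)
near = (200, 12)  # same frequency as `far`, 2 time steps after anchor
far = (200, 50)   # same frequency as `near`, 40 time steps after anchor

# Current version: both pairs collapse to the same hash.
same_current = hash_point_pair_current(anchor, near) == hash_point_pair_current(anchor, far)

# Fixed version: the differing time deltas give different hashes.
same_fixed = hash_point_pair_fixed(anchor, near) == hash_point_pair_fixed(anchor, far)

print(same_current, same_fixed)
```

With the current line, any two pairs that share the same two frequencies hash identically regardless of how far apart they are in time.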
@matejaputic Have you experimented with this code? For me the base implementation works great for finding exact matches (recognising x when x was previously added via register) and for fragments of audio extracted from x. It doesn't work for very similar fragments or recordings (when the exact recording wasn't added via register). Changing the hashing to your suggestion doesn't really help though. It boosts scores for exact matches, but it stops working well for fragments of registered audio and is still poor at finding similar audio.
I just stumbled over that line too. I don't know what it should be, but p2[1]-p2[1] doesn't look right.
The suggested change is correct and the current line is a typo. Like you said, I still didn't find it particularly good at recognising anything other than parts cut directly from the original song. I played around with find_peaks() to use the technique from https://github.com/worldveil/dejavu/blob/master/dejavu/logic/fingerprint.py#L55C4-L55C4 and it's working much better now. The technique the dejavu folks use for generating the peaks seems to be much more reliable than the naive maximum_filter() that is currently being used here.
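For anyone who wants to try the same thing, here is a rough sketch of that style of peak picking, written from my reading of the dejavu approach rather than copied from it. The function name find_peaks_footprint, its parameters and their default values are my own placeholders, and it assumes a non-negative spectrogram indexed as [frequency, time].

```python
import numpy as np
from scipy.ndimage import (
    binary_erosion,
    generate_binary_structure,
    iterate_structure,
    maximum_filter,
)


def find_peaks_footprint(Sxx, neighborhood_size=10, amp_min=10.0):
    """Return (freq_idx, time_idx) peaks; names and defaults are illustrative."""
    # Diamond-shaped neighbourhood built by repeatedly dilating a 4-connected cross.
    struct = generate_binary_structure(2, 1)
    neighborhood = iterate_structure(struct, neighborhood_size)

    # A bin is a candidate peak if it equals the maximum of its neighbourhood.
    local_max = maximum_filter(Sxx, footprint=neighborhood) == Sxx

    # Flat all-zero regions also pass that test, so erode the background
    # mask and discard those bins.
    background = Sxx == 0
    eroded_background = binary_erosion(
        background, structure=neighborhood, border_value=1
    )
    detected = local_max & ~eroded_background

    # Keep only peaks whose amplitude exceeds the threshold.
    freqs, times = np.nonzero(detected)
    keep = Sxx[freqs, times] > amp_min
    return list(zip(freqs[keep].tolist(), times[keep].tolist()))


# Tiny synthetic spectrogram: two strong peaks and one weaker bump
# close to the first peak, which should be suppressed.
Sxx = np.zeros((50, 50))
Sxx[10, 10] = 50.0
Sxx[12, 11] = 20.0
Sxx[30, 40] = 80.0
print(find_peaks_footprint(Sxx))
```

The main differences from a plain square maximum_filter() are the diamond footprint, the removal of flat background regions, and the amplitude threshold. Whether those explain the better matching I saw is a guess on my part.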