dejavu
Multiple identifications in one recording
If there are multiple songs present in a file passed to FileRecognizer, is there a way to specify that all recognitions be returned? Right now, it appears that only the first is returned.
@thesunlover please answer :)
@shemul @richoakley That is the default behaviour. You'd better customize a local version for this purpose so that it yields the results.
Did you have any luck with this? I have a similar issue, where only part of the clip should match the key, but that part should match the entire key (there is a lot of noise before and after it).
I also have the same problem. I fingerprinted a short audio file and then concatenated it with two songs, like this:
- Song
- ShortAudio
- Song
- ShortAudio
But when I analyze the concatenated file, only one match is found.
maurizio96, may I ask why you need it to work like that? Can't there be a single fingerprint for uniqueness, with a count of the times it matched? You could also write a new lookup function that takes a starting timepoint, but usually you are looking up a 6 to 7 second recording in the big DB.
I need it to work like that because I have a 1-hour recording in which the short audio is played several times (about 4 times per hour), and I would like to get the start position of each occurrence.
Oh, OK. Then you might need a new search function with a starting timepoint. I am going to have a look.
Thank you! I hope I explained my problem correctly. My goal is to recognize the positions where my file is placed (inside the big file).
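One way to approximate a search with a starting timepoint, without touching the database layer, is to cut the long recording into overlapping windows and recognize each window separately. This is only a sketch under stated assumptions: scan_in_windows, recognize_fn, window_s and hop_s are hypothetical names that are not part of dejavu, and recognize_fn stands for whatever callable wraps the recognizer and returns None when nothing matches.

```python
def scan_in_windows(samples, fs, recognize_fn, window_s=10.0, hop_s=5.0):
    """Run recognize_fn on overlapping windows of samples.

    Returns a list of (window_start_seconds, result) pairs for every
    window in which recognize_fn returned something other than None.
    All names here are hypothetical and not part of dejavu.
    """
    window = int(window_s * fs)  # window length in samples
    hop = int(hop_s * fs)        # step between window starts, in samples
    if window <= 0 or hop <= 0:
        raise ValueError("window_s and hop_s must cover at least one sample")

    hits = []
    last_start = max(len(samples) - window, 0)
    for start in range(0, last_start + 1, hop):
        result = recognize_fn(samples[start:start + window])
        if result is not None:
            hits.append((start / fs, result))
    return hits
```

Each hit only says which window matched. A finer start position would still have to come from the offset reported inside that window, and neighbouring windows that overlap the same occurrence will both report it.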
The main changes are going to be in find_matches() and align_matches(). Can you check what db.return_matches() returns?
I have looked at those functions, but I don't understand how to change them to reach my goal.
Do you think the problem comes from this code?
    if diff_counter[diff][sid] > largest_count:
        largest = diff
        largest_count = diff_counter[diff][sid]
        song_id = sid
I think this is why the recognition returns only one result.
Yep. Note: I don't have a working copy of the repo, so I cannot see what data comes out of the two functions. I just know that in this function
    def _recognize(self, *data):
        matches = []
        for d in data:
            matches.extend(self.dejavu.find_matches(d, Fs=self.Fs))
        return self.dejavu.align_matches(matches)
the data consists of one or two channels, so there should be one or two d's, and you have to match the data from both channels if the file has two channels.
With the current working copy of the repo I can get the ShortAudio position by reading the key 'offset_seconds' from the recognize result. Unfortunately, it lists only one result. I am trying to get multiple results by changing this code:
    if diff_counter[diff][sid] > largest_count:
        largest = diff
        largest_count = diff_counter[diff][sid]
        song_id = sid
From what I understand, it is limited to one result because of the 'if'.
It seems to be designed for a single match, as the iteration is over the values being looked up (1k hashes per DB call). Note: I am not familiar enough with SQL, so I am not sure I can help right now.
I don't get it. I see the "SELECT_MULTIPLE" query, but it seems to select songs based on the file hashes. I have only one record in the DB, which matches the ShortAudio. Suppose I analyze a file with the following structure:
- UnknownSong
- ShortAudio
- UnknownSong
- ShortAudio
- UnknownSong
As you say, it returns only one match, but I would like to get the start position of both ShortAudio occurrences.
Sorry for the many messages, but I am trying to be as clear as possible.
You'd better ask on Stack Overflow how to modify the SQL search criteria to select multiple pieces from the same file.
Sorry, but I cannot understand why I need to edit the SQL search. If the align_matches() function is called once, how can it return multiple results when it runs the following code?
            if diff_counter[diff][sid] > largest_count:
                largest = diff
                largest_count = diff_counter[diff][sid]
                song_id = sid

        song = self.db.get_song_by_id(song_id)
        if song:
            songname = song.get(Dejavu.SONG_NAME, None)
        else:
            return None

        nseconds = round(float(largest) / fingerprint.DEFAULT_FS *
                         fingerprint.DEFAULT_WINDOW_SIZE *
                         fingerprint.DEFAULT_OVERLAP_RATIO, 5)
        song = {
            Dejavu.SONG_ID : song_id,
            Dejavu.SONG_NAME : songname,
            Dejavu.CONFIDENCE : largest_count,
            Dejavu.OFFSET : int(largest),
            Dejavu.OFFSET_SECS : nseconds,
            Database.FIELD_FILE_SHA1 : song.get(Database.FIELD_FILE_SHA1, None),}
        return song
The song_id is replaced every time the 'if' condition is true, and the function returns only one object per call. How could it return multiple song objects? I am studying the SQL build script, but it is more complicated...
Note: you are helping me a lot :+1: :smile: thank you!
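For reference, the nseconds expression in the quoted code converts an offset counted in spectrogram frames into seconds. A minimal standalone sketch of that conversion follows. The three constants are assumed values for fingerprint.DEFAULT_FS, fingerprint.DEFAULT_WINDOW_SIZE and fingerprint.DEFAULT_OVERLAP_RATIO, so check them against your local fingerprint.py. The function name offset_to_seconds is hypothetical.

```python
# Assumed values; verify against fingerprint.py in your copy of the repo.
DEFAULT_FS = 44100            # sampling rate in Hz
DEFAULT_WINDOW_SIZE = 4096    # FFT window size in samples
DEFAULT_OVERLAP_RATIO = 0.5   # fraction of overlap between windows


def offset_to_seconds(offset, fs=DEFAULT_FS,
                      window_size=DEFAULT_WINDOW_SIZE,
                      overlap_ratio=DEFAULT_OVERLAP_RATIO):
    """Same arithmetic as the nseconds expression quoted in the thread."""
    return round(float(offset) / fs * window_size * overlap_ratio, 5)
```

With these assumed constants, one offset unit corresponds to 2048 samples, which is a little under 0.05 seconds.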
I have a very similar use case. The method align_matches(self, matches) is very interesting. Can anyone shed some light on how this selection works? If so, how could we match multiple results?
In particular, what about the diff variable, which is obtained from the DB?
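On how the selection works: in the code quoted in this thread, each match is a (sid, diff) pair, and align_matches() counts how often each pair occurs. Hashes that belong to one real occurrence share the same diff, so they pile up on a single counter, while spurious hash collisions scatter across many diffs. The sketch below illustrates that idea and keeps every well-supported peak instead of only the largest one. It assumes diff is the difference between a hash's offset in the stored track and its offset in the recording (the sign convention is an assumption), so each occurrence of the same clip in a long recording would produce its own peak. The name aligned_peaks and the min_count threshold are hypothetical and not part of dejavu.

```python
from collections import Counter


def aligned_peaks(matches, min_count=2):
    """Count (song_id, diff) pairs and keep those with enough votes.

    matches is an iterable of (song_id, diff) tuples, the shape that the
    quoted align_matches() unpacks with `sid, diff = tup`.
    Returns (song_id, diff, count) tuples, strongest first.
    """
    counts = Counter(matches)
    peaks = [(sid, diff, n)
             for (sid, diff), n in counts.items()
             if n >= min_count]
    return sorted(peaks, key=lambda peak: peak[2], reverse=True)


# Toy data: song 1 aligns at two different diffs (two occurrences),
# plus two stray matches that do not align with anything.
toy_matches = [(1, 120)] * 5 + [(1, 480)] * 3 + [(1, 17), (2, 120)]
peaks = aligned_peaks(toy_matches)
```

In real recordings, the votes for one occurrence may spread over a few neighbouring diffs, so a practical version would probably need to merge nearby peaks and tune min_count.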
Any luck with this?
Same problem here. Any idea for finding multiple matches in a file or from the microphone?
Same problem, I need to identify the time of multiple matches in the file. Any solution?
Did anybody manage to get multiple results? When I try to recognise a song, I would like to get several results in descending order of confidence. That way I can find out whether a short song sequence appears in more than one song in the library. I'm trying to modify "def align_matches(self, matches)" but it is not working yet.
This will return all matches sorted by confidence.
    def align_matches(self, matches):
        """
        Finds hash matches that align in time with other matches and finds
        consensus about which hashes are "true" signal from the audio.
        Returns a list of dictionaries with match information, sorted by
        confidence (highest first).
        """
        # align by diffs: count how often each (diff, song id) pair occurs
        diff_counter = {}
        song_ids = {}
        for tup in matches:
            sid, diff = tup
            if diff not in diff_counter:
                diff_counter[diff] = {}
            if sid not in diff_counter[diff]:
                diff_counter[diff][sid] = 0
            diff_counter[diff][sid] += 1

        # for each song, keep its best count and the diff where it occurred
        for diff in diff_counter:
            for sid in diff_counter[diff]:
                if sid not in song_ids:
                    song_ids[sid] = [0, '']
                if diff_counter[diff][sid] > song_ids[sid][0]:
                    song_ids[sid][0] = diff_counter[diff][sid]
                    song_ids[sid][1] = diff

        songs_detailed = []
        for song_id in song_ids:
            confidence, offset = song_ids[song_id]
            # extract identification
            song = self.db.get_song_by_id(song_id)
            if song:
                nseconds = round(float(offset) / fingerprint.DEFAULT_FS *
                                 fingerprint.DEFAULT_WINDOW_SIZE *
                                 fingerprint.DEFAULT_OVERLAP_RATIO, 5)
                songs_detailed.append({
                    Dejavu.SONG_ID : song_id,
                    Dejavu.SONG_NAME : song.get(Dejavu.SONG_NAME, None),
                    Dejavu.CONFIDENCE : confidence,
                    Dejavu.OFFSET : int(offset),
                    Dejavu.OFFSET_SECS : nseconds,
                    Database.FIELD_FILE_SHA1 : song.get(Database.FIELD_FILE_SHA1, None)})
        return sorted(songs_detailed, key=lambda x: x[Dejavu.CONFIDENCE], reverse=True)
Remove these lines from the recognize_file function in recognize.py:
    if match:
        match['match_time'] = t
or replace them with:
    for m in match:
        m.update({'match_time': t})
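If some callers still receive a single dictionary (or None) while others receive the new list, a small helper can make the timing step tolerant of all three. This is a sketch only: annotate_matches is a hypothetical name, and it returns copies of the match dictionaries instead of updating them in place.

```python
def annotate_matches(matches, elapsed):
    """Attach match_time to every match and always return a list.

    Accepts None, a single match dictionary, or a list of dictionaries.
    Hypothetical helper, not part of dejavu.
    """
    if not matches:
        return []
    if isinstance(matches, dict):
        matches = [matches]
    # dict(m, match_time=...) copies m and adds (or overwrites) the key
    return [dict(m, match_time=elapsed) for m in matches]
```

Inside recognize_file this would be called with the result of _recognize and the measured time t.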