
matching many files

Open aminebya1 opened this issue 5 years ago • 8 comments

Is it possible to modify the code so that it can match many files at once?

aminebya1 avatar Dec 10 '19 23:12 aminebya1

The matching is intrinsically 1 query against the whole reference set, but each match is quite quick so you can run through a lot of queries in a loop.

The idea of looking for any matches among a large collection is an interesting variant. There might be a more efficient way of structuring that, but it's not trivial.

dpwe avatar Dec 10 '19 23:12 dpwe
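
The "queries in a loop" suggestion above could be sketched roughly as follows. This is only an illustration, not part of audfprint: it assumes the `python audfprint.py match --dbase fpdbase.pklz query.mp3` invocation from the project README, and the function names, the script path, and the `*.mp3` pattern are hypothetical choices.

```python
import subprocess
import sys
from pathlib import Path


def build_match_command(dbase, query, script="audfprint.py"):
    """Return the argv list for matching one query file against the database.

    The script location is an assumption; adjust it to wherever
    audfprint.py lives in your checkout.
    """
    return [sys.executable, script, "match", "--dbase", str(dbase), str(query)]


def match_all(dbase, query_dir, pattern="*.mp3"):
    """Run one match per query file and collect the raw text output."""
    results = {}
    for query in sorted(Path(query_dir).glob(pattern)):
        proc = subprocess.run(
            build_match_command(dbase, query),
            capture_output=True,
            text=True,
            check=False,  # a failed query should not abort the whole loop
        )
        results[query.name] = proc.stdout
    return results
```

Each query still runs as a separate match against the whole reference set, so this only automates the loop; it does not make the search itself any more efficient.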

Can you tell me where to modify the code, or give me some hints?

aminebya1 avatar Dec 10 '19 23:12 aminebya1

Thanks for this wonderful code, which works very well. In my case I don't know whether a chunk will match 1, 2, or 3 different files, so I am using --max-matches 5. With that setting, however, I am getting duplicate matches for the same audio file. Could you please help me with that?

vikasmultitv avatar Jan 15 '20 09:01 vikasmultitv

There's a field, Matcher.max_alignments_per_id, which controls how many matches are returned for each ref item. It's 100 by default and there's no way to control it at present. You could try manually restricting it (line 122 of audfprint_match.py), or you could add a new command-line option and set it as part of audfprint.setup_matcher (around line 316 of audfprint.py). I believe it will keep the most significant matches first. This won't affect the behavior when --exact-count is true.

DAn.

dpwe avatar Jan 15 '20 13:01 dpwe

Thank you for replying :) I tried changing self.max_alignments_per_id = 100 to self.max_alignments_per_id = 0 and it worked, but it misses the match when there is only 1 match. I guess I will try some other approach in post-processing.

vikasmultitv avatar Jan 15 '20 14:01 vikasmultitv

Changing if found_this_id > self.max_alignments_per_id to if found_this_id >= self.max_alignments_per_id: worked fine :) Thank you.

vikasmultitv avatar Jan 15 '20 14:01 vikasmultitv

Glad it worked! max_alignments_per_id = 0 seems like an odd choice; I was thinking of max_alignments_per_id = 1 (we want at most 1 hit per reference item). But as long as it's doing what you want, it's your fork!

DAn.

dpwe avatar Jan 15 '20 17:01 dpwe

I am sorry, my bad. It is actually not working: it misses the match when there is only one match in a chunk. My chunk is 1 minute long and it can contain 2-3 audio files.

vikasmultitv avatar Jan 15 '20 17:01 vikasmultitv
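
The post-processing route mentioned earlier in the thread can be sketched without touching max_alignments_per_id at all: keep --max-matches high enough to catch every file in the chunk, then collapse duplicates so that each reference item appears once. The snippet below is only a sketch under assumptions. The `(ref_id, score, offset)` tuple layout and the function name are hypothetical and are not audfprint's own result format, so the parsed match output would need to be adapted to it.

```python
def best_match_per_id(matches):
    """Keep only the highest-scoring match for each reference id.

    `matches` is an iterable of (ref_id, score, offset) tuples. This layout
    is an assumption for illustration, not audfprint's own result format.
    """
    best = {}
    for ref_id, score, offset in matches:
        # Replace the stored entry only if this one scores strictly higher.
        if ref_id not in best or score > best[ref_id][1]:
            best[ref_id] = (ref_id, score, offset)
    # Return the surviving matches, strongest first.
    return sorted(best.values(), key=lambda m: m[1], reverse=True)


# Hypothetical matches for one chunk: "song_a" is reported twice.
matches = [("song_a", 42, 1.5), ("song_a", 17, 30.2), ("song_b", 25, 12.0)]
print(best_match_per_id(matches))
```

Because the filter only drops repeated ids, a chunk with a single match still yields that one match, which is the case the threshold change was losing.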