mir_eval
Chroma transcription metrics
To cover all evaluation done in MIREX, we also need to add the ability to evaluate transcription annotations after mapping the pitch values to a chroma (single octave) scale, as discussed in #180.
Punting to 0.5.
I'd like to take a stab at this. Seems like the way to do this is to add a flag in precision_recall_f1_overlap and match_notes?
not sure if it'll be of any help, but there's machinery in mir_eval.chord that might be useful??
> I'd like to take a stab at this.
Cool, contributions welcome. @justinsalamon will know best what is necessary.
> Seems like the way to do this is to add a flag in precision_recall_f1_overlap and match_notes?
Seems that way to me too.
This should be relatively straightforward. The metrics (excluding the ones that only consider onsets or offsets) rely on match_notes, which computes pitch distances here and checks for pitch matches here. You'd have to add a flag which, if set, calls an alternative version of these lines that checks for matches in (cyclic) chroma space.
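To make the "cyclic chroma space" idea concrete, here is a minimal sketch of what such a check could look like. The helper name, signature, and the 50-cent default tolerance are assumptions for illustration, not mir_eval's actual API; the real change would live inside match_notes:

```python
import numpy as np

def chroma_pitch_match(ref_pitch_hz, est_pitch_hz, tolerance=50.0):
    """Check whether two pitches match in (cyclic) chroma space.

    The pitch difference is measured in cents, folded into a single
    octave (mod 1200), and the shorter way around the circle is used,
    so that octave errors count as perfect matches.
    """
    # Signed distance in cents between the two frequencies
    cent_diff = 1200.0 * np.log2(est_pitch_hz / ref_pitch_hz)
    # Wrap into [0, 1200) and take the cyclic (shorter-arc) distance
    wrapped = np.mod(cent_diff, 1200.0)
    cyclic_dist = np.minimum(wrapped, 1200.0 - wrapped)
    return cyclic_dist <= tolerance
```

For example, 220 Hz and 440 Hz are exactly an octave apart, so their cyclic distance is 0 cents and they match, while 220 Hz vs. 330 Hz (a fifth, about 702 cents) do not.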
I've got a 3 line solution that makes sense to me here. Still need to test.
Don't know a good way to generate test data. Any advice @justinsalamon?
@chf2117 that looks right to me (but should be tested, of course). To verify that it really does what you expect, you need to add unit tests to test_transcription.py. The toy data for the unit tests is hard coded here; if it's sufficient for covering all the chroma cases, great. If not, you should create more hard-coded data (but don't add to or change the existing data, as that will break the existing tests!).
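One natural way to hard-code chroma-specific toy data is to pair reference notes with estimates transposed up exactly one octave: they should fail the standard pitch tolerance but count as matches in chroma mode. A self-contained sketch of that reasoning (the interval/pitch arrays are made up here, not taken from the actual test file):

```python
import numpy as np

# Reference notes: (onset, offset) intervals in seconds, pitches in Hz
ref_intervals = np.array([[0.0, 1.0], [1.5, 2.5]])
ref_pitches = np.array([220.0, 440.0])

# Estimates transposed up an octave: wrong in absolute pitch,
# correct in chroma (pitch-class) terms
est_intervals = ref_intervals.copy()
est_pitches = ref_pitches * 2.0

# Each estimate is exactly 1200 cents away from its reference...
cent_diffs = 1200.0 * np.log2(est_pitches / ref_pitches)
assert np.allclose(cent_diffs, 1200.0)

# ...so it fails the standard +/- 50 cent tolerance...
assert not np.any(np.abs(cent_diffs) <= 50.0)

# ...but folded into a single octave the cyclic distance is zero
wrapped = np.mod(cent_diffs, 1200.0)
cyclic = np.minimum(wrapped, 1200.0 - wrapped)
assert np.all(cyclic <= 50.0)
```

A chroma unit test would then assert that these notes are all matched when the chroma flag is set and all unmatched when it isn't.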
Data for the regression tests lives here; there's no need to touch the est* and ref* files, but you'll have to update the output* files.
Finally, to check your output against the MIREX results see this comment in the original transcription pull request.