mir_eval
consistency between metrics (specifically F-measure)
Hi, thanks for this very useful library.
I have noticed a small inconsistency in the outputs of some metric functions that are shared across submodules. F-measure is one of them.
When evaluating based on F-measure, it is often useful to know the precision and recall values as well. Calculating F-measure in the "onset" submodule returns this information, whereas the "beat" submodule does not.
mir_eval.beat.f_measure only returns F-measure because only F-measure is used by the beat evaluation toolbox and MIREX; mir_eval.onset.f_measure returns all three because MIREX uses all three. I'd potentially support changing mir_eval.beat.f_measure to return all three. FWIW, mir_eval.beat.f_measure and mir_eval.onset.f_measure are functionally identical, except that the default threshold/window is 0.05 in onset and 0.07 in beat. So you can safely use mir_eval.onset.f_measure in place of mir_eval.beat.f_measure to get all three values if you set the window kwarg to 0.07. At any rate, before implementing this change I'd like to get some sort of community consensus.
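For anyone reading along, here is a rough, standalone sketch of why the window size is the only thing that differs between the two calls. It is not mir_eval's code: the function name prf_within_window is made up for illustration, and the simple greedy one-to-one matching is an assumption that may not match how mir_eval pairs events internally. It only shows how precision, recall and F-measure come out of the same matched count.

```python
def prf_within_window(reference, estimated, window=0.07):
    """Hypothetical helper (not part of mir_eval).

    Returns (f_measure, precision, recall) for two lists of event
    times in seconds, counting an estimated event as a hit when it
    lies within `window` seconds of a not-yet-matched reference event.
    """
    ref = sorted(reference)
    est = sorted(estimated)
    if not ref or not est:
        return 0.0, 0.0, 0.0

    # Greedy two-pointer matching over the sorted event times;
    # each reference and each estimate is used at most once.
    i = j = matched = 0
    while i < len(ref) and j < len(est):
        if abs(ref[i] - est[j]) <= window:
            matched += 1
            i += 1
            j += 1
        elif est[j] < ref[i]:
            j += 1  # estimate is too early to match any later reference
        else:
            i += 1  # reference is too early to match any later estimate

    if matched == 0:
        return 0.0, 0.0, 0.0

    precision = matched / len(est)
    recall = matched / len(ref)
    f_measure = 2 * precision * recall / (precision + recall)
    return f_measure, precision, recall


# Made-up example times, in seconds.
reference_times = [1.0, 2.0, 3.0, 4.0]
estimated_times = [1.06, 2.0, 3.5]

beat_like = prf_within_window(reference_times, estimated_times, window=0.07)
onset_like = prf_within_window(reference_times, estimated_times, window=0.05)
```

With the wider 0.07 window the estimate at 1.06 counts as a hit, and with 0.05 it does not, so the same pair of annotations gives different scores depending only on the window.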
Yes, that's what I've been doing as a workaround.
I see your point, but the counterargument is that, now that this library is open source, it is not necessarily tied to the one MIREX implementation but is a community resource.
Either way it's a small issue that will come down to the ethos of the maintainers.