mir_eval consistency between metrics (specifically F-measure)

Open andyr0id opened this issue 11 years ago • 2 comments

Hi, thanks for this very useful library.

I have noticed a small inconsistency in the outputs of metric functions that appear in more than one submodule. F-measure is one of them.

When evaluating based on F-measure, it is often useful to know the precision and recall values as well. The F-measure function in the "onset" submodule returns them, whereas the one in the "beat" submodule does not.

Oct 20 '14 15:10 andyr0id

mir_eval.beat.f_measure only returns F-measure because only F-measure is used by the beat evaluation toolbox and MIREX; mir_eval.onset.f_measure returns all three (F-measure, precision, and recall) because MIREX uses all three. I'd potentially support changing mir_eval.beat.f_measure to return all three. FWIW, mir_eval.beat.f_measure and mir_eval.onset.f_measure are functionally identical, except that the threshold/window is .05 in onset and .07 in beat, so you can safely use mir_eval.onset.f_measure in place of mir_eval.beat.f_measure to get all three if you set the window kwarg to .07. At any rate, I'd like to get some sort of community consensus before implementing this change.

Oct 21 '14 07:10 craffel
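For readers following along, here is a minimal pure-Python sketch of why a single matching pass can report all three values: F-measure is derived from precision and recall, so both are already available when it is computed. This is not mir_eval's implementation. The function name prf_within_window is hypothetical, the greedy two-pointer matching is a simplification chosen for brevity, and the example event times are made up.

```python
def prf_within_window(reference, estimated, window=0.07):
    """Return (f_measure, precision, recall) for two lists of event times in seconds.

    Simplified sketch: events are paired greedily in time order, and a pair
    counts as a hit when the two times differ by at most `window`.
    """
    if not reference or not estimated:
        return 0.0, 0.0, 0.0

    ref = sorted(reference)
    est = sorted(estimated)

    i = j = matched = 0
    while i < len(ref) and j < len(est):
        if abs(ref[i] - est[j]) <= window:
            # Hit: consume both events so each is matched at most once.
            matched += 1
            i += 1
            j += 1
        elif est[j] < ref[i]:
            j += 1  # estimated event is too early to match anything later
        else:
            i += 1  # reference event has no estimate close enough

    if matched == 0:
        return 0.0, 0.0, 0.0

    precision = matched / len(est)  # fraction of estimates that are correct
    recall = matched / len(ref)     # fraction of references that were found
    f_measure = 2 * precision * recall / (precision + recall)
    return f_measure, precision, recall


# Made-up example times, evaluated with the two windows discussed in this thread.
reference_times = [0.5, 1.0, 1.5, 2.0]
estimated_times = [0.56, 1.0, 1.7]
print(prf_within_window(reference_times, estimated_times, window=0.07))
print(prf_within_window(reference_times, estimated_times, window=0.05))
```

In mir_eval itself, the workaround described above is simply to pass the beat times to mir_eval.onset.f_measure with the window kwarg set to .07.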

Yes, that's what I've been doing as a workaround.

I see your point, but the counterargument is that, now that this library is open source, it is not necessarily tied to the one MIREX implementation; it is a community resource.

Either way, it's a small issue that will come down to the ethos of the maintainers.

Oct 21 '14 11:10 andyr0id
