
Acceleration || branching from #68

Open bmcfee opened this issue 8 years ago • 13 comments

The issue of acceleration came up in #68, either via numba or Cython. It seems like we should branch the discussion off into its own issue.

Cython

  • Pro: minimal dependencies, pip installable, wheels
  • Con: packaging can be tricky, significant developer time required

Numba

  • Pro: minimal developer time required
  • Con: not pip installable due to LLVM dependency

My two cents: despite the dependency chain problem, I'd prefer numba over cython because it would be easier to deploy across the entire module.

In librosa, we handled this by making the numba dependency optional, and providing a dummy decorator optional_jit that does a no-op if numba is not available, and jit-compilation if it is. That way, everything still works, but if you want to get the acceleration benefits, it's on you to install numba.
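The pattern described above can be sketched as follows. This is a hedged illustration of the idea, not librosa's actual implementation; `sum_abs_diff` is a made-up toy kernel:

```python
import numpy as np

try:
    # If numba is installed, optional_jit is the real jit decorator.
    from numba import jit as optional_jit
except ImportError:
    # Otherwise, a no-op stand-in that works both as @optional_jit
    # and as @optional_jit(nopython=True).
    def optional_jit(*args, **kwargs):
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]
        return lambda func: func

@optional_jit(nopython=True)
def sum_abs_diff(x, y):
    # Toy tight-loop kernel, just to show the decorator in use.
    total = 0.0
    for i in range(x.shape[0]):
        total += abs(x[i] - y[i])
    return total
```

Callers never need to know whether numba is present; the function just runs faster when it is.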

Side note: if we also add conda(-forge) package distribution, we can easily add numba as a hard dependency there.

What do y'all think?

bmcfee avatar Feb 24 '17 18:02 bmcfee

Side note: if we also add conda(-forge) package distribution, we can easily add numba as a hard dependency there.

+1!

justinsalamon avatar Feb 24 '17 18:02 justinsalamon

What part of mir_eval is too slow? I think most of the metrics are reasonably fast, and those that aren't could probably benefit most from some regular Python hand-optimization. I don't think it's worth considering Cython, given the goal of "a transparent implementation". numba would be fine with the optional jit decorator.

craffel avatar Feb 27 '17 17:02 craffel

What part of mir_eval is too slow?

The separation module is particularly slow. Yes, it could benefit from a thorough efficiency audit -- I've noticed some pretty obvious examples of redundant computation -- but I think a lot of its slowness comes from tight-loop calculations that don't trivially vectorize, but would be easily handled by numba.
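As an illustration of the kind of loop that resists vectorization but jit-compiles well (a toy example with a sequential dependency, not actual mir_eval code):

```python
import numpy as np

try:
    from numba import jit
except ImportError:  # numba is optional; fall back to pure Python
    jit = lambda **kwargs: (lambda func: func)

@jit(nopython=True)
def leaky_cummax(x, decay):
    # Running maximum with exponential decay: each output depends on
    # the previous one, so the loop cannot be vectorized directly,
    # but numba compiles it to fast native code.
    out = np.empty_like(x)
    running = 0.0
    for i in range(x.shape[0]):
        running = max(x[i], running * decay)
        out[i] = running
    return out
```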

Some of the segmentation / hierarchy metrics could also benefit from optimization.

I haven't benchmarked the rest of it, but those alone seem worth the effort to me.

bmcfee avatar Feb 27 '17 18:02 bmcfee

I think as a first pass it's worth looking in detail at what metrics are prohibitively slow and if there's an easy way to make them faster without numba (since we can assume for the time being most users will not have numba installed). Adding numba + optional_jit seems like a no-brainer after that.

craffel avatar Feb 27 '17 18:02 craffel

Sure.

Side note: is it worth making up a separate issue to put mir_eval in conda-forge?

bmcfee avatar Feb 27 '17 18:02 bmcfee

Side note: is it worth making up a separate issue to put mir_eval in conda-forge?

Sure, wouldn't hurt. I would prioritize it based on how much people are complaining about it not being in conda-forge.

craffel avatar Feb 27 '17 19:02 craffel

:complain:

😄

stefan-balke avatar Feb 28 '17 09:02 stefan-balke

About which functions?

craffel avatar Feb 28 '17 17:02 craffel

Btw, numba now has wheels on pypi, so we probably could make it a hard dependency.

bmcfee avatar Jul 13 '17 13:07 bmcfee

Update: llvmlite >=0.20 (now up) ships with windows wheels.

bmcfee avatar Sep 08 '17 19:09 bmcfee

The separation metrics are too slow! Computing the scores takes up most of the time when I'm training my model. Could you please add jit to those eval functions, e.g. bss_eval_sources?

SunDoge avatar Oct 23 '19 02:10 SunDoge

Could you please add jit to those eval functions

I'm not sure that jitting would help for bss eval. Last I checked, the bottleneck here was in computing a bunch of frame-wise least squares solutions, and that part is already calling out to C under the hood (via numpy/scipy).

But do feel free to try it out and send a PR if it improves things.
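For reference, the bottleneck pattern described above looks roughly like this (a schematic sketch, not bss_eval's actual code): each frame solves an independent least-squares problem, and the heavy lifting inside `lstsq` already happens in compiled LAPACK code, so jitting the outer Python loop buys little.

```python
import numpy as np

def framewise_lstsq(frames_A, frames_b):
    # frames_A: (n_frames, m, k) per-frame design matrices
    # frames_b: (n_frames, m) per-frame targets
    # Returns (n_frames, k) per-frame solutions.
    n_frames, _, k = frames_A.shape
    out = np.empty((n_frames, k))
    for t in range(n_frames):
        # np.linalg.lstsq dispatches to LAPACK (compiled C/Fortran);
        # the Python loop overhead is small relative to the solve itself.
        out[t], *_ = np.linalg.lstsq(frames_A[t], frames_b[t], rcond=None)
    return out
```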

bmcfee avatar Oct 23 '19 14:10 bmcfee

I'm not sure that jitting would help for bss eval. Last I checked, the bottleneck here was in computing a bunch of frame-wise least squares solutions, and that part is already calling out to C under the hood (via numpy/scipy).

Can confirm @bmcfee. I tried to strip out some parts for numba and got a bit more than a 5% speedup.

faroit avatar Oct 23 '19 14:10 faroit