rvlib icon indicating copy to clipboard operation
rvlib copied to clipboard

Poor performance in Jitted loops

Open jstac opened this issue 8 years ago • 4 comments

@danielcsaba @spencerlyon2

I'm getting poor performance here in exactly the kind of scenario where I would like to use RVlib:

https://gist.github.com/anonymous/c3387ab7408609a49472c3020517bbd6

Any ideas on what's happening?

jstac avatar Nov 24 '16 23:11 jstac

Hi @jstac,

I had a quick look and it's failing in nopython mode---seemingly due to a type clash. My first attempts at fixing it were not successful. I'll try and look into it in more detail.

danielcsaba avatar Nov 25 '16 20:11 danielcsaba

I don't have the full sulution right away but there are two comments that can be made:

  • nopython mode doesn't work because the class is defined out of the jitted loop. Moving its definition inside the function (or passing it as argument solves this problem. I don't know whether it is a long run limitation of numba.
  • f.rand(1) returns a newly allocated 1d array, so that one should do f.rand(1)[0] to get the associated scalar. In terms of API, one might expect f.rand() to return a scalar ? The following code takes 2s, vs 0.5s for the other options. I would hazard the remaining gap might come from the aforementioned array creation and maybe some overhead associated to objects:
@jit(nopython=True)
def ar1_sample_mean(N, alpha, beta, s):
    f = Normal(0, 1)
    x = beta / (1 - alpha)
    sm = 0.0
    for i in range(N):
        x = beta + alpha * x + s * f.rand(1)[0]
        sm += x
    return sm / N

albop avatar Dec 10 '16 17:12 albop

@albop Thanks. I tried and got the same result. That's quite reassuring.... But still not as quick as with np.random.randn(), as you say.

It's not great to have this fragility, but perhaps OK if we give good documentation and examples.

I think f.rand() should return a scalar, and I wonder if that wouldn't make a difference (i.e., improve the speed a bit more)

jstac avatar Dec 14 '16 09:12 jstac

hi @jstac and @albop

thanks Pablo for pointing out the issue.

regarding the f.rand(), our problem was that jitclass was not able to handle optional and default arguments so we had to pass the number as an argument no matter its value.

I reckon the np.random.randn is the best integrated with numba. The jitclass won't give the same performance but still makes it possible to pass a wider variety of distributions in nopython mode.

danielcsaba avatar Dec 15 '16 23:12 danielcsaba