rvlib
Poor performance in Jitted loops
@danielcsaba @spencerlyon2
I'm getting poor performance here in exactly the kind of scenario where I would like to use RVlib:
https://gist.github.com/anonymous/c3387ab7408609a49472c3020517bbd6
Any ideas on what's happening?
Hi @jstac,
I had a quick look and it's failing in nopython mode---seemingly due to a type clash. My first attempts at fixing it were not successful. I'll try and look into it in more detail.
I don't have the full solution right away, but there are two comments that can be made:
- nopython mode doesn't work because the class is defined outside the jitted loop. Moving its definition inside the function (or passing it as an argument) solves this problem. I don't know whether this is a long-term limitation of numba.
- f.rand(1) returns a newly allocated 1d array, so one should do f.rand(1)[0] to get the associated scalar. In terms of API, one might expect f.rand() to return a scalar? The following code takes 2s, vs 0.5s for the other options. I would hazard that the remaining gap comes from the aforementioned array creation and maybe some overhead associated with objects:
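from numba import jit
from rvlib import Normal

@jit(nopython=True)
def ar1_sample_mean(N, alpha, beta, s):
    f = Normal(0, 1)  # defined inside the jitted function so nopython mode works
    x = beta / (1 - alpha)
    sm = 0.0
    for i in range(N):
        x = beta + alpha * x + s * f.rand(1)[0]  # [0] extracts the scalar from the 1d array
        sm += x
    return sm / N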
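For comparison, here is a minimal sketch of the same loop drawing its shocks from np.random.randn() directly, which is the baseline the timings above are measured against. The function name ar1_sample_mean_randn and the timing harness are illustrative and not taken from the gist. The fallback decorator is only an assumption so that the snippet still runs where numba is not installed.

```python
import time

import numpy as np

try:
    from numba import jit
except ImportError:
    # Assumption for illustration: without numba, run the loop as plain Python.
    def jit(*args, **kwargs):
        def wrap(func):
            return func
        return wrap


@jit(nopython=True)
def ar1_sample_mean_randn(N, alpha, beta, s):
    # Same AR(1) recursion as above, but with scalar standard normal draws.
    x = beta / (1 - alpha)
    sm = 0.0
    for i in range(N):
        x = beta + alpha * x + s * np.random.randn()
        sm += x
    return sm / N


if __name__ == "__main__":
    ar1_sample_mean_randn(10, 0.9, 1.0, 1.0)  # warm-up call so compilation is not timed
    t0 = time.perf_counter()
    m = ar1_sample_mean_randn(1_000_000, 0.9, 1.0, 1.0)
    print(m, time.perf_counter() - t0)
```

Timing the rvlib version with the same warm-up call keeps compilation time out of the comparison.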
@albop Thanks. I tried and got the same result. That's quite reassuring.... But still not as quick as with np.random.randn(), as you say.
It's not great to have this fragility, but perhaps OK if we give good documentation and examples.
I think f.rand() should return a scalar, and I wonder whether that would make a difference (i.e., improve the speed a bit more).
Hi @jstac and @albop,
Thanks Pablo for pointing out the issue.
Regarding f.rand(), our problem was that jitclass was not able to handle optional and default arguments, so we had to pass the number as an argument no matter its value.
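One possible way around the missing default arguments is a separate method that always returns a scalar. The sketch below is a toy, not rvlib's implementation: ToyNormal and rand_scalar are hypothetical names, and the plain-Python fallback is an assumption so the snippet runs without numba. It also passes the instance as an argument to the jitted loop, as suggested earlier in the thread.

```python
import numpy as np

try:
    from numba import float64, njit
    from numba.experimental import jitclass
    as_jitclass = jitclass([("mu", float64), ("sigma", float64)])
except ImportError:
    # Assumption for illustration: fall back to an ordinary class and function.
    def as_jitclass(cls):
        return cls

    def njit(func):
        return func


@as_jitclass
class ToyNormal:
    """Hypothetical stand-in for a distribution class; not rvlib's Normal."""

    def __init__(self, mu, sigma):
        self.mu = mu
        self.sigma = sigma

    def rand(self, n):
        # Array-returning variant: n must always be passed explicitly.
        out = np.empty(n)
        for i in range(n):
            out[i] = self.mu + self.sigma * np.random.randn()
        return out

    def rand_scalar(self):
        # Scalar variant: no argument needed and no array allocated.
        return self.mu + self.sigma * np.random.randn()


@njit
def ar1_sample_mean(dist, N, alpha, beta, s):
    # The distribution object is passed in rather than captured from outside.
    x = beta / (1 - alpha)
    sm = 0.0
    for i in range(N):
        x = beta + alpha * x + s * dist.rand_scalar()
        sm += x
    return sm / N


if __name__ == "__main__":
    print(ar1_sample_mean(ToyNormal(0.0, 1.0), 100_000, 0.9, 1.0, 1.0))
```

Whether this closes the gap to np.random.randn() would need to be measured; the sketch only avoids the per-draw array allocation.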
I reckon the np.random.randn is the best integrated with numba. The jitclass won't give the same performance but still makes it possible to pass a wider variety of distributions in nopython mode.