code accessing numpy array elements slower on cinder (jit on or off)

Open belm0 opened this issue 4 years ago • 1 comment

The example below is fairly straightforward, and gets about a 20x speedup under numba. If I understand correctly, numba achieves that by accessing the numpy C API directly from jitted code, so the Python VM isn't involved. (I don't expect cinder jit to do that.)

Still, cinder (no jit) is 50% slower on this code than stock CPython (not using numba). And with cinder jit enabled on this function exclusively, there is no improvement over the no-jit case. (I confirmed that the function is compiled.)

What makes cinder slower than stock Python when touching numpy arrays?

import numpy

def masked_mean(array: numpy.ndarray):
    n_rows, n_cols = array.shape
    mean = [0.0] * n_cols

    for j in range(n_cols):
        sum_ = 0.0
        n = 0
        for i in range(n_rows):
            val = array[i, j]
            if val > 0.0:
                sum_ += val
                n += 1
        mean[j] = sum_ / n if n > 0 else -1.0

    return mean
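
For anyone wanting to reproduce the comparison, here is a minimal timing sketch. The helper name `time_masked_mean`, the array shape, and the random input are assumptions for illustration, not the setup used for the numbers above. Run the same script under stock CPython and under cinder and compare the output.

```python
import sys
import timeit

import numpy


def masked_mean(array):
    """Column-wise mean of the positive entries; -1.0 if a column has none."""
    n_rows, n_cols = array.shape
    mean = [0.0] * n_cols
    for j in range(n_cols):
        sum_ = 0.0
        n = 0
        for i in range(n_rows):
            val = array[i, j]
            if val > 0.0:
                sum_ += val
                n += 1
        mean[j] = sum_ / n if n > 0 else -1.0
    return mean


def time_masked_mean(n_rows=200, n_cols=50, repeat=5, seed=0):
    """Return the best wall-clock time (seconds) of a single masked_mean call.

    The shape and seed are arbitrary (hypothetical) choices.
    """
    rng = numpy.random.default_rng(seed)
    array = rng.standard_normal((n_rows, n_cols))
    timings = timeit.repeat(lambda: masked_mean(array), number=1, repeat=repeat)
    return min(timings)


if __name__ == "__main__":
    print(sys.version)
    print(f"best time: {time_masked_mean():.4f} s")
```

Taking the minimum over several repeats reduces noise from other processes, which matters when the difference being measured is around 50%.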

May 28 '21 09:05 belm0

Good question. I don't know off the top of my head, but if you're interested, could you try running with the Linux perf tool? With such a big discrepancy, I'd hope the root cause would be fairly obvious when comparing a perf report from Cinder and stock CPython.
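
As a rough complement to perf (not a replacement), a sketch like the following could show which per-element step of the inner loop differs most between the two interpreters. The function name `per_op_timings` and the iteration count are hypothetical. It relies only on the fact that indexing a float64 array with `array[i, j]` yields a `numpy.float64` scalar, so the comparison and addition in the loop operate on that scalar type rather than on a plain Python float.

```python
import timeit

import numpy


def per_op_timings(number=200_000):
    """Time each per-element step of the inner loop in isolation.

    Returns a dict mapping step name to average seconds per operation.
    """
    array = numpy.ones((10, 10))
    val = array[3, 4]  # a numpy.float64 scalar, as produced inside the loop
    namespace = {"array": array, "val": val, "sum_": 0.0}
    statements = {
        "index": "array[3, 4]",   # element access
        "compare": "val > 0.0",   # numpy scalar vs Python float comparison
        "add": "sum_ + val",      # Python float + numpy scalar
    }
    return {
        name: timeit.timeit(stmt, globals=namespace, number=number) / number
        for name, stmt in statements.items()
    }


if __name__ == "__main__":
    for name, seconds in per_op_timings().items():
        print(f"{name:8s} {seconds * 1e9:8.1f} ns/op")
```

If one step is disproportionately slower under cinder, that narrows down where to look in the perf report.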

Jun 03 '21 21:06 jbower-fb