Enhancement Request: Numba Integration

Open thor-kiessling opened this issue 4 years ago • 2 comments

Hello folks, I am thoroughly impressed with the speedups and conciseness of this library.

My request is to type the functions in this library so that they are callable from a @numba.jit(nopython=True) context, or perhaps to have them reflect the NumPy type of whatever they are called with; I'm not sure of the details. Sorry for being vague! There might also be a lot of user error here. Here are the errors I get when trying to drop the functions into an existing Numba function.

from bottleneck import move_mean results in the error

Untyped global name 'move_mean': cannot determine Numba type of <class 'builtin_function_or_method'>

import bottleneck as bn and calling bn.move_mean(data, window=window) results in the error

numba.errors.TypingError: Failed in nopython mode pipeline (step: Handle with contexts)
Unknown attribute 'move_mean' of type Module(<module 'bottleneck' from '/home/username/.pyenv/versions/3.6.6/lib/python3.6/site-packages/bottleneck/__init__.py'>)

My installed versions (from bn.bench()): Bottleneck 1.3.2; NumPy 1.18.1; Numba 0.48.0

edit: I read #92 and found https://github.com/shoyer/numbagg, I'll be checking that out as well.

Further testing with my crappy homebrew EMA function vs. bn.move_mean and numbagg.move_mean. The last column is the elapsed time in milliseconds, in the order mine / bn / numbagg.

done with rolling avgs	2020-03-09 20:00:22.221672 38.096
done with r avgs bn  	2020-03-09 20:00:22.225331 3.63
done with r avgs gg  	2020-03-09 20:00:22.700702 475.352
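One possible reason for the large numbagg figure, which is an assumption and not something confirmed in this thread, is that a single timed call to a Numba-compiled function can include the JIT compilation itself. A minimal sketch of a timing helper that runs a warm-up call before measuring; the name `time_ms` and its parameters are hypothetical, not part of bottleneck, numbagg, or Numba:

```python
import time


def time_ms(func, *args, warmup=1, repeats=5, **kwargs):
    """Return the best wall-clock time of func(*args, **kwargs) in milliseconds."""
    # Warm-up calls are not timed; for a JIT-compiled function this is
    # where any first-call compilation cost would be paid.
    for _ in range(warmup):
        func(*args, **kwargs)

    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args, **kwargs)
        elapsed = (time.perf_counter() - start) * 1000.0
        best = min(best, elapsed)
    return best
```

Calling it as `time_ms(bn.move_mean, data, window=window)` and likewise for the other two functions would put all three on the same footing.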

thor-kiessling • Mar 10 '20 01:03

@thor-kiessling Thanks for opening this issue! I've admittedly not used numba recently - could you provide a few example snippets you think should work and I'll try to debug from there?

qwhelan • Mar 11 '20 02:03

Hello, here is a minimal example. It defines an inefficient EMA/moving mean, calls it, and then tries to call bottleneck to do the same thing. Timing can be inserted in or around test_func.

import numba as nb
import bottleneck as bn
import numpy as np
from numba import uint64, int64


a = np.arange(10000000, dtype=np.float64)

@nb.jit("(f8[:])(f8[:], i8)", nopython=True, nogil=True, cache=True, parallel=False, fastmath=True)  
def self_move_mean(arr, window_size):  # this is an SMA.
    arr_length = uint64(arr.shape[0])
    out_arr = np.zeros(arr_length)
    window_size = int64(window_size)
    ones = np.ones(int(window_size) - 1, dtype=np.float64)
    new_arr = np.hstack((ones * arr[0], arr))

    for i in range(arr.shape[0]):  # fill every output element, not only the first window_size
        out_arr[i] = np.mean(new_arr[i: i + int(window_size)])
    return out_arr

@nb.njit()
def test_func(a):
    b = self_move_mean(a, 3)
    c = bn.move_mean(a, window=3, min_count=1)  # this is the call Numba cannot type in nopython mode
    return b, c

b, c = test_func(a)

I'm not sure of other examples I'd like to have work.
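For comparison with the snippet above, here is a rough sketch of a single-pass moving mean written as a plain loop, which is the style Numba can compile in nopython mode. It is not bottleneck's implementation. The function name `running_move_mean` is hypothetical, it assumes the input contains no NaNs, and it averages over the elements seen so far at the start of the array (similar in spirit to min_count=1) rather than padding with the first value as self_move_mean does. The import falls back to plain Python if Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:  # run as ordinary Python if Numba is unavailable
    def njit(func):
        return func


@njit
def running_move_mean(arr, window):
    """Moving mean using a running sum; assumes arr contains no NaNs."""
    n = arr.shape[0]
    out = np.empty(n, dtype=np.float64)
    total = 0.0
    for i in range(n):
        total += arr[i]
        if i >= window:
            # Drop the element that just left the window.
            total -= arr[i - window]
        # Fewer than `window` elements are available near the start.
        count = min(i + 1, window)
        out[i] = total / count
    return out
```

Because the sum is updated incrementally, each output costs O(1) work instead of re-averaging the whole window, at the price of some accumulated floating-point error on very long inputs.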

thor-kiessling • Mar 11 '20 02:03