numpy-groupies
numpy-groupies copied to clipboard
Trapz implementation for numba
Hi,
I'm working on a numba implementation of np.trapz for numpy-groupies. This seems to work for my purposes – might this be useful to anyone else?
Thanks for the great package!
Tested with:
import numpy as np
from numpy_groupies.aggregate_numba import aggregate
import numpy as np
x = np.logspace(0, 1, 100000)
y = np.random.uniform(size=len(x))
n_groups_max = 6
group_idx = np.sort(np.random.randint(low=0, high=n_groups_max, size=len(x)))
res = []
for group in np.arange(0, n_groups_max):
if np.count_nonzero(group_idx == group) > 0:
res.append(np.trapz(y[group_idx == group], x=x[group_idx == group]))
else:
res.append(0)
res_agg = aggregate(group_idx, y, x=x, func='trapz')
print('numpy', res)
print('groupies', res_agg)
np.testing.assert_allclose(res, res_agg)
which yields:
numpy [0.23125231248855868, 0.33806806909519105, 0.5028857676085916, 0.735241454082436, 1.0823459235149786, 1.5971309867232373]
groupies [0.23125231 0.33806807 0.50288577 0.73524145 1.08234592 1.59713099]
Thanks @bmorris3
I'm not a numba expert so can't really review this. However, could you add some tests to test_generic.py please?
Hi @bmorris3 ! Sorry for the late reaction. I recently reviewed the code. Unfortunately the suggested loop for trapz within the numba implementation assumes, that all the group members of one group are consecutive within group_idx. This in general then requires a sort step in advance. While it would be possible to add that, it is not how all other functions work. The sorting is slow, and not requiring it, is what makes all the other functions in numpy-groupies fast. Also for trapz it will be possible, to implement it in line with the other implementations, so that the sort step can be omitted. I'll probably have a look on that myself soon.