bottleneck icon indicating copy to clipboard operation
bottleneck copied to clipboard

[BUG] Numerical inaccuracy in summation based routines

Open krachyon opened this issue 3 years ago • 5 comments

Describe the bug

Bottleneck's implementation of algorithms containing a summation yields different results than numpy for floats. This seems to stem from the fact that numpy uses some sort of compensated summation algorithm to increase accuracy, while bottleneck uses a straight sum, e.g.: bottleneck/src/reduce_template.c

FOR {
    const npy_DTYPE0 ai = AI(DTYPE0);
    if (!bn_isnan(ai)) {
        asum += ai;
    }

To Reproduce

import numpy as np
import bottleneck

# adding float32.eps to 2.f gives 2.f so e.g. Kahan-summation is needed to get result != 2.f
arr = np.hstack(([np.float32(2.)], np.repeat(np.finfo(np.float32).eps, 100000).astype(np.float32)))
print('numpy: ', np.nansum(arr))
print('bottleneck: ', bottleneck.nansum(arr))
numpy:  2.011919
bottleneck:  2.0

System: Linux-5.11.11-arch1-1-x86_64-with-glibc2.33 Python 3.9.2 (default, Feb 20 2021, 18:40:11) [GCC 10.2.0] bottleneck 1.3.2

Expected behavior As implementations can be switched due to non-obvious reasons (like a fallback to numpy routines in the case of non-native byteorder), results between bottleneck-routines and numpy should match. If a complete match of results is not attainable, the documentation should state clearly that bottleneck does not always reproduce numpy results.

Additional context https://github.com/astropy/astropy/issues/11492

krachyon avatar Apr 08 '21 10:04 krachyon

If I draw up a PR with Kahan summation, does it have a chance of being accepted? Or will bottleneck refuse to take a performance hit for the sake of precision?

sebasv avatar Aug 04 '21 19:08 sebasv

@sebasv Sorry for the lack of response here, I've had significantly less bandwidth this year.

I believe it's possible to match numpy's output while also being faster. I have some local commits that are unfinished that accomplish part of this - biggest issue is that I would want to fix this for every function in one release.

qwhelan avatar Aug 04 '21 20:08 qwhelan

Thank you for the quick response! Let me know if I can help in some capacity.

sebasv avatar Aug 05 '21 11:08 sebasv

If I draw up a PR with Kahan summation, does it have a chance of being accepted? Or will bottleneck refuse to take a performance hit for the sake of precision?

As you seem to have put a little more work into this than I did, do you happen to know if numpy uses Kahan, pairwise summation or something completely different ? I couldn't really follow the dispatch logic...

krachyon avatar Aug 06 '21 22:08 krachyon

I believe now that Numpy uses pairwise summation (see numpy/numpy#3685). Naive summation has a O(n) error, pairwise has an O(log(n)) error and Kahan has an O(1) error. With a large base case, Naive and pairwise have equivalent speed (just minimal recursion overhead). Kahan requires about 4 times the number of additions. So perhaps pairwise is the best fit for Bottleneck?

sebasv avatar Aug 07 '21 07:08 sebasv