bottleneck icon indicating copy to clipboard operation
bottleneck copied to clipboard

reducing to scalar upcasts np.float32 to float

Open fujiisoup opened this issue 6 years ago • 3 comments

docs says The data type (dtype) of the output is the same as the input. On 64-bit operating systems, 32-bit input is NOT upcast to 64-bit accumulator and return values.

but it looks the current bottleneck (1.2.1 on 64-bit system) upcasts np.float32 to float if an axis keyword is not specified.

In [1]: import numpy as np

In [2]: import bottleneck as bn

In [3]: a = np.random.randn(1000, 1000).astype(np.float32)

In [7]: type(bn.nansum(a)), bn.nansum(a)
Out[7]: (float, 2183.470703125)

In [8]: type(np.nansum(a)), np.nansum(a)
Out[8]: (numpy.float32, 2183.4875)

If we specify axis, bottleneck works fine

In [12]: bn.nansum(a, axis=0).dtype
Out[12]: dtype('float32')

In [13]: np.nansum(a, axis=0).dtype
Out[13]: dtype('float32')

In [14]: (bn.nansum(a, axis=0) == np.nansum(a, axis=0)).all()
Out[14]: True

From pydata/xarray#1989

fujiisoup avatar Mar 15 '18 01:03 fujiisoup

Good catch.

The accumulator is 32-bits but the output is a python float. You can get the same result even when you do specify the axis. The issue is whether or not the array is completely reduced. If it is then the last line of the function (in C) is return PyFloat_FromDouble(asum);

In previous versions of bottleneck I did np.float32(asum) (using cython). But that was very slow because I didn't (and don't) know the equivalent numpy C-API call.

Unless there is a simple fix I am inclined to change the documentation.

kwgoodman avatar Mar 15 '18 16:03 kwgoodman

Thanks @kwgoodman .

If it is then the last line of the function (in C) is return PyFloat_FromDouble(asum);

Some leading digits looks different from np.nansum(a) and also from np.nansum(a.astype(float)).

In [16]: print(np.nansum(a))
-86.4169

In [17]: print(np.nansum(a.astype(float)))
-86.4171186548

In [18]: print(bn.nansum(a))
-86.41963195800781

Is it only an issue of the final upcasting?

I didn't (and don't) know the equivalent numpy C-API call.

Neither to me. From outsider's perspective, can it be done by storing the result in a 1-element array and then indexing it in python?

Unless there is a simple fix I am inclined to change the documentation.

I think it would be also OK for me. Thanks

fujiisoup avatar Mar 15 '18 23:03 fujiisoup

I think numpy uses a more robust algorithm: https://en.wikipedia.org/wiki/Pairwise_summation. Bottleneck doesn't.

kwgoodman avatar Mar 16 '18 16:03 kwgoodman