
Poor numerical consistency with numpy for reductions with dtype=`float16` specified

Open jcrist opened this issue 7 years ago • 5 comments

Reductions like COO.sum and COO.mean don't always match numpy when dtype=numpy.float16 is specified. For example:

import sparse
import numpy as np
x = np.array([[[    0,     0,  4526,     0],
               [    0,     0,   -37,     0],
               [ 8372,     0,  7915,     0]],
              [[    0,     0,     0,     0],
               [    0,     0, -7917,     0],
               [-9719,     0,     0,     0]]], dtype='i4')

s = sparse.COO.from_numpy(x)
res = s.sum(axis=(0, 2), dtype='f2')
sol = x.sum(axis=(0, 2), dtype='f2')
print(res.todense())
print(sol)

outputs:

[ 4530. -7950.  6564.]
[ 4530. -7950.  6570.]

It's not clear if each of these will need to be fixed per-method, or if there's a general fix in the reduce code. For sum, numpy interprets x.sum(dtype='f2') the same as x.astype('f2').sum(), but this is not true for x.mean(dtype='f2').
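
A quick way to see that distinction with plain numpy (this snippet is illustrative and not part of the original report; exact values may vary by numpy version):

import numpy as np

x = np.arange(100, dtype='i4')

# For sum, passing dtype='f2' behaves like casting the inputs first:
print(x.sum(dtype='f2'), x.astype('f2').sum())

# For mean, the two spellings can differ: numpy computes the mean of a
# float16 array using float32 intermediates, while an explicit dtype='f2'
# is used as the accumulator type.
print(x.mean(dtype='f2'), x.astype('f2').mean())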

jcrist commented Sep 19 '18 20:09

See discussion in #190.

jcrist commented Sep 19 '18 20:09

At these scales, it could still be numerical inaccuracy. Recall that the IEEE floating point standard gives 16-bit floats only 10 bits of mantissa. Remove one bit and you're left with only 9 bits of relative accuracy, which translates to roughly an accuracy of 1 in 512 per operation. In your case we're stacking 6 operations, and the relative error is much less than 1 in 512... so I still think this can be put down to numerical errors. We just have to scale rtol and atol with the maximum value in the array. :-)
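
For reference, float16's precision limits can be inspected directly (a small illustration, not from the thread):

import numpy as np

info = np.finfo(np.float16)
print(info.nmant)                      # 10 explicitly stored mantissa bits
print(info.eps)                        # machine epsilon for float16, 2**-10
print(np.spacing(np.float16(6568.0)))  # gap between adjacent float16 values near the sums above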

hameerabbasi commented Sep 19 '18 21:09

Another issue here is that you’re putting values that are O(10000) into something that has a mantissa of 6 bits... which means the greatest value it can hold is 64-ish.

hameerabbasi commented Oct 11 '18 20:10

I'm not saying doing reductions with float16 makes sense, I'm just saying that I don't think what we're doing 100% matches semantically what numpy is doing. By semantically, I mean when the dtype='f2' is applied: for some reductions it might be interpreted as "do the reduction in the original type, then cast the result", for others as "cast the original data, then do the reduction", or "cast some intermediate result and finish the reduction", etc. If you read through the numpy code, dtype='f2' is often special-cased with a different algorithmic path. If we match numpy's semantic behavior but get different numbers due to floating point errors, then that's fine, but I'm not convinced that's the case for all the reductions here.
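
A rough sketch of those three interpretations with plain numpy (the helper names are hypothetical, for illustration only; this is not how sparse or numpy is actually structured):

import numpy as np

def mean_cast_result(x):        # reduce in the original dtype, cast only the result
    return np.float16(x.mean())

def mean_cast_input(x):         # cast the data first, then reduce
    return x.astype('f2').mean()

def mean_cast_intermediate(x):  # accumulate the sum in f2, then divide
    return x.sum(dtype='f2') / np.float16(x.size)

x = np.arange(100, dtype='i4')
for f in (mean_cast_result, mean_cast_input, mean_cast_intermediate):
    print(f.__name__, f(x))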

jcrist commented Oct 12 '18 15:10

I believe this is a good candidate to ask the NumPy mailing list about. From what I saw of the np.sum method, it just calls np.add.reduce. It might be that the ufunc.reduce path has some special cases like the ones you describe.
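
That observation can be checked from the public API (illustrative only):

import numpy as np

x = np.arange(100, dtype='i4')

# np.add.reduce accepts the same dtype keyword; comparing it with x.sum shows
# whether the two entry points agree for a float16 accumulator.
print(np.add.reduce(x, dtype='f2'), x.sum(dtype='f2'))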

hameerabbasi commented Oct 12 '18 15:10