uncertainties
numpy.nanmean() does not skip nan±… or …±nan
Hello!
First of all, great piece of work! It's saving me a lot of time :)
I'm having issues with numpy.nanmean, which should ignore nan values when calculating the mean.
Here is some test code:
from uncertainties import unumpy
import numpy as np
v = np.arange(16,dtype=np.float64)
e = np.sqrt(v)
v[1:3] = np.nan
print(v)
print(np.isnan(v[1:3]))
un = unumpy.uarray(v,e)
print(un)
print(un.mean())
print(np.nanmean(un))
print(v.mean())
print(np.nanmean(v))
Here is the output:
[ 0. nan nan 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14.
15.]
[ True True]
[0.0+/-0 nan+/-1.0 nan+/-1.4142135623730951 3.0+/-1.7320508075688772
4.0+/-2.0 5.0+/-2.23606797749979 6.0+/-2.449489742783178
7.0+/-2.6457513110645907 8.0+/-2.8284271247461903 9.0+/-3.0
10.0+/-3.1622776601683795 11.0+/-3.3166247903554 12.0+/-3.4641016151377544
13.0+/-3.605551275463989 14.0+/-3.7416573867739413
15.0+/-3.872983346207417]
nan+/-0.6846531968814576
nan+/-0.6846531968814576
nan
8.35714285714
From the output, you can see that both mean and nanmean are returning nan+/-error. I'd say that the latter should return the mean ignoring the nan values.
I hope you can help with that! Thanks
Thanks.
Strictly speaking, this is the expected behavior: nan±… is not nan, and NumPy skips nan (only).
Now, unumpy.isnan() works as you want and could be used as a mask, or for boolean indexing.
I will check whether there is any way to make NumPy understand that nan±… should be treated like nan by nanmean().
Wouldn't it be preferable to make ufloat(np.nan, 2) return a np.nan directly? As nan+/-2.0 doesn't really make sense anyway (same as 2.0+/-nan)?
The general idea of never producing nan±… but producing nan instead seems reasonable, since we have basically no information on the number (with uncertainty) in question. Implementing this goes beyond changing the creation of nan±… with ufloat(), as there are many other ways of creating a number with uncertainty. I guess that this is quite doable, though. So, something to be implemented, probably.
±inf±… seems like it could be handled in a similar way.
Now, I would have to think about 2±nan a bit more: the nominal value is still relevant (it is the same as in a calculation with uncertainty), and the nan just shows that calculating the uncertainty with linear error propagation theory does not give a good result. The mean of numbers that include this one could thus have a relevant nominal value (with an uncertainty of nan that indicates that the uncertainty is not to be trusted, which is an important piece of information that does not invalidate the relevance of the nominal value).
First, thanks a lot for this extremely useful module!
I have just been playing around with this, and discovered that if I convert all occurrences of nan+/-nan to simply be NaN, and then run np.nanmean(), I get values of nan+/-23.4 etc.
So apparently, there is no way to do a nanmean with uncertainties...?
Thanks!
It is actually possible to compute a NaN-mean even when you are using uncertainties. With
>>> import uncertainties as unc
>>> from uncertainties import unumpy
>>> import numpy as np
>>> nan = float("nan")
>>> arr = np.array([nan, unc.ufloat(nan, 1), unc.ufloat(1, nan), 2])
>>> arr
array([nan, nan+/-1.0, 1.0+/-nan, 2], dtype=object)
you can get the NaN-mean by selecting only the values with a non-NaN nominal value:
>>> arr[~unumpy.isnan(arr)].mean()
1.5+/-nan
or more directly by asking NumPy to skip them:
>>> np.ma.array(arr, mask=unumpy.isnan(arr))
masked_array(data=[--, --, 1.0+/-nan, 2],
mask=[ True, True, False, False],
fill_value='?',
dtype=object)
>>> _.mean()
1.5+/-nan
In this case the uncertainty is NaN as it should be, because one of the numbers does have an undefined uncertainty, which makes the final uncertainty undefined (but not the average). In general, uncertainties are not NaN and you obtain the mean of the non-NaN values.
(Edited so as to reflect the fact that the uncertainties module already provides uncertainties.umath.isnan() and uncertainties.unumpy.isnan().)
PS: I added all the information (and more) from my post above to the documentation: http://uncertainties-python-package.readthedocs.io/en/latest/genindex.html#N. Thank you for your feedback!