[BUG] `hist.numpy.histogram` from array with NaN entries
Describe the bug
Passing an array that contains NaN values to hist.numpy.histogram makes the histogram creation fail:
>>> import hist
>>> import numpy as np
>>> hist.numpy.histogram([1, 2, np.nan])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "[...]/lib/python3.9/site-packages/boost_histogram/numpy.py", line 166, in histogram
    result = histogramdd(
  File "[...]/lib/python3.9/site-packages/boost_histogram/numpy.py", line 78, in histogramdd
    cpp_ax = _core.axis.regular_numpy(b, r[0], r[1])
ValueError: forward transform of start or stop invalid
I assume the difficulty lies in the automatic detection of a suitable bin range, since pure numpy fails in the same way:
>>> np.histogram([1, 2, np.nan])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<__array_function__ internals>", line 180, in histogram
  File "[...]/lib/python3.9/site-packages/numpy/lib/histograms.py", line 793, in histogram
    bin_edges, uniform_bins = _get_bin_edges(a, bins, range, weights)
  File "[...]/lib/python3.9/site-packages/numpy/lib/histograms.py", line 426, in _get_bin_edges
    first_edge, last_edge = _get_outer_edges(a, range)
  File "[...]/lib/python3.9/site-packages/numpy/lib/histograms.py", line 323, in _get_outer_edges
    raise ValueError(
ValueError: autodetected range of [nan, nan] is not finite
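The `[nan, nan]` in that message is consistent with the range being derived from a plain minimum and maximum of the input, since those reductions propagate NaN. A small sketch of that behaviour, using only numpy (the variable names are mine, and I have not checked how boost-histogram computes its range internally):

```python
import numpy as np

values = np.array([1.0, 2.0, np.nan])

# Plain min/max propagate NaN, so a range derived from them is [nan, nan].
auto_range = (values.min(), values.max())

# The NaN-aware reductions skip the NaN entry and give a finite range.
finite_range = (float(np.nanmin(values)), float(np.nanmax(values)))

print(auto_range)
print(finite_range)
```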
Perhaps this is more of a feature request than a bug, but I think it would be useful to automatically mask out NaN values so that histogramming could still work (perhaps with an accompanying warning), or otherwise to catch this with a more descriptive error message.
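To make the suggestion concrete, here is a rough sketch of the masking behaviour I have in mind. `histogram_ignoring_nan` is a hypothetical helper, not an existing function in hist, boost-histogram, or numpy, and it wraps `np.histogram` only to keep the example self-contained; the same pre-filtering could presumably be applied before calling `hist.numpy.histogram`.

```python
import warnings

import numpy as np


def histogram_ignoring_nan(values, **kwargs):
    """Hypothetical helper: drop NaN entries, warn, then histogram the rest."""
    arr = np.asarray(values, dtype=float)
    nan_mask = np.isnan(arr)
    n_nan = int(nan_mask.sum())
    if n_nan:
        # Tell the caller that some entries were silently excluded.
        warnings.warn(
            f"ignoring {n_nan} NaN value(s) in histogram input",
            RuntimeWarning,
            stacklevel=2,
        )
    # Only the non-NaN entries take part in range detection and filling.
    return np.histogram(arr[~nan_mask], **kwargs)


counts, edges = histogram_ignoring_nan([1, 2, np.nan])
print(counts.sum(), edges[0], edges[-1])
```

Whether this should be the default, or an opt-in, is open; raising a more descriptive error instead would only need the `n_nan` check.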
This came up via uproot-browser, and it is the issue I ran into that prompted https://github.com/henryiii/uproot-browser/issues/29.
Steps to reproduce
See above. Python 3.9.10, with the following relevant library versions:
boost-histogram 1.3.1
hist 2.6.1
numpy 1.22.2