
[BUG] Failure to convert awkward Masked array to numpy

Open MoAly98 opened this issue 3 years ago • 3 comments

I have a masked awkward array that I am trying to pass to Hist.fill(). The internal conversion to a NumPy array fails because the allow_missing flag of to_numpy() is set to False. From the traceback, this value appears to be hard-coded in the Awkward Array source code. However, I do not run into this problem if I call to_numpy() on my array myself before passing it to Hist.fill(), i.e. if I do the conversion explicitly. The following code reproduces the problem:

import awkward as ak
import boost_histogram as bh

arr2 = ak.Array([1, 2, 3, 4])
arr2m = ak.mask(arr2, arr2 > 1)  # arr2m is [None, 2, 3, 4]
ax = bh.axis.Regular(5, 0, 1)    # repr: Regular(5, 0, 1)
histo = bh.Histogram(ax)
histo.fill(arr2m)                # fails with the traceback below
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/anaconda3/envs/pythium/lib/python3.9/site-packages/boost_histogram/_internal/hist.py", line 467, in fill
    args_ars = _fill_cast(args)
  File "/opt/anaconda3/envs/pythium/lib/python3.9/site-packages/boost_histogram/_internal/hist.py", line 78, in _fill_cast
    return tuple(_fill_cast(a, inner=True) for a in value)  # type: ignore
  File "/opt/anaconda3/envs/pythium/lib/python3.9/site-packages/boost_histogram/_internal/hist.py", line 78, in <genexpr>
    return tuple(_fill_cast(a, inner=True) for a in value)  # type: ignore
  File "/opt/anaconda3/envs/pythium/lib/python3.9/site-packages/boost_histogram/_internal/hist.py", line 80, in _fill_cast
    return np.asarray(value)
  File "/opt/anaconda3/envs/pythium/lib/python3.9/site-packages/awkward/highlevel.py", line 1358, in __array__
    return ak._connect._numpy.convert_to_array(self.layout, args, kwargs)
  File "/opt/anaconda3/envs/pythium/lib/python3.9/site-packages/awkward/_connect/_numpy.py", line 15, in convert_to_array
    out = ak.operations.convert.to_numpy(layout, allow_missing=False)
  File "/opt/anaconda3/envs/pythium/lib/python3.9/site-packages/awkward/operations/convert.py", line 312, in to_numpy
    raise ValueError(
ValueError: ak.to_numpy cannot convert 'None' values to np.ma.MaskedArray unless the 'allow_missing' parameter is set to True

(https://github.com/scikit-hep/awkward-1.0/blob/1.7.0/src/awkward/operations/convert.py#L316)

As far as I can see, there is no option to set allow_missing = True in the fill() function call, so I wonder whether this restriction is intentional or a bug?

Thank you very much.

MoAly98 avatar Feb 14 '22 15:02 MoAly98

We currently don't really support directly using awkward arrays. I'd like to, and it's planned, but calling to_numpy first is correct. And even if we support it, we'd basically be doing exactly that; we require dense regular arrays to fill. That might be a restriction we can relax a bit after https://github.com/boostorg/histogram/pull/364 which sounds exciting. Ideally I'd like full C++23 ndspan support. ;)

(That still won't help with masking, but it would at least remove the "dense" conversion requirement.)

henryiii avatar Sep 17 '22 15:09 henryiii

https://github.com/boostorg/histogram/pull/364 is not really related to this. I am just making an internal tool easier to use; this does not have an external effect.

HDembinski avatar Sep 17 '22 15:09 HDembinski

I think it would be very useful to support filling with awkward arrays if that can be achieved without introducing a dependency on awkward. fill() could check whether to_numpy() exists and call it.
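The duck-typing idea could look roughly like the sketch below. `fill_cast` and `FakeArray` are hypothetical names used only for illustration; this is not boost-histogram's actual `_fill_cast`, and how a real array type treats missing values in its own to_numpy() is outside the scope of the sketch.

```python
import numpy as np


def fill_cast(value):
    """Hypothetical helper: convert one fill argument to a NumPy array.

    If the object exposes a callable to_numpy(), use it; otherwise fall
    back to np.asarray. No import of any array library is needed.
    """
    to_numpy = getattr(value, "to_numpy", None)
    if callable(to_numpy):
        return np.asarray(to_numpy())
    return np.asarray(value)


class FakeArray:
    """Stand-in for any array type that provides to_numpy()."""

    def __init__(self, data):
        self._data = list(data)

    def to_numpy(self):
        return np.array(self._data, dtype=float)


converted = fill_cast(FakeArray([0.1, 0.5, 0.9]))  # goes through to_numpy()
plain = fill_cast([1, 2, 3])                       # falls back to np.asarray
```

Because the check is by attribute rather than by type, the histogram package would not need to depend on awkward to accept such arrays.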

HDembinski avatar Sep 17 '22 15:09 HDembinski