Mysterious error in ak.drop_none
Version of Awkward Array
HEAD
Description and code to reproduce
This might be an unusual case, but it shouldn't raise this error. I haven't looked into it; I'm just logging it for future research.
import awkward as ak
import numpy as np

array = ak.Array(
    ak.contents.ListArray(
        ak.index.Index64(np.array([0, 4, 8])),
        ak.index.Index64(np.array([3, 5, 12])),
        ak.contents.ByteMaskedArray(
            # with valid_when=False, a mask value of 0 marks a valid entry
            ak.index.Index8(np.array([0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0])),
            ak.contents.NumpyArray(
                np.array([1, 1, 0, -1, 1, -1, -1, -1, 0, 0, 0, 0])
            ),
            valid_when=False,
        ),
    ),
    check_valid=True,
)
>>> array
<Array [[1, 1, 0], [1], [0, 0, 0, 0]] type='3 * var * ?int64'>
>>> ak.drop_none(array)
<Array [[1, 1, 0], [1], [0, 0, 0, 0]] type='3 * var * int64'>
>>> ak.drop_none(array, axis=-1)
Traceback (most recent call last):
  File "/home/jpivarski/irishep/awkward/src/awkward/_dispatch.py", line 62, in dispatch
    next(gen_or_result)
  File "/home/jpivarski/irishep/awkward/src/awkward/operations/ak_drop_none.py", line 56, in drop_none
    return _impl(array, axis, highlevel, behavior, attrs)
  File "/home/jpivarski/irishep/awkward/src/awkward/operations/ak_drop_none.py", line 121, in _impl
    out = ak._do.recursively_apply(out, recompute_offsets, depth_context=options)
  File "/home/jpivarski/irishep/awkward/src/awkward/_do.py", line 36, in recursively_apply
    return layout._recursively_apply(
  File "/home/jpivarski/irishep/awkward/src/awkward/contents/listarray.py", line 1588, in _recursively_apply
    result = action(
  File "/home/jpivarski/irishep/awkward/src/awkward/operations/ak_drop_none.py", line 94, in recompute_offsets
    out = layout._rebuild_without_nones(none_indexes, layout.content)
  File "/home/jpivarski/irishep/awkward/src/awkward/contents/listarray.py", line 1530, in _rebuild_without_nones
    return self.to_ListOffsetArray64()._rebuild_without_nones(
  File "/home/jpivarski/irishep/awkward/src/awkward/contents/listarray.py", line 302, in to_ListOffsetArray64
    return self._broadcast_tooffsets64(offsets)
  File "/home/jpivarski/irishep/awkward/src/awkward/contents/listarray.py", line 429, in _broadcast_tooffsets64
    self._backend.maybe_kernel_error(
  File "/home/jpivarski/irishep/awkward/src/awkward/_backends/backend.py", line 67, in maybe_kernel_error
    raise ValueError(self.format_kernel_error(error))
ValueError: stops[i] > len(content) while attempting to get index 12 (in compiled code: https://github.com/scikit-hep/awkward/blob/awkward-cpp-26/awkward-cpp/src/cpu-kernels/awkward_ListArray_broadcast_tooffsets.cpp#L20)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/jpivarski/irishep/awkward/src/awkward/_dispatch.py", line 38, in dispatch
    with OperationErrorContext(name, args, kwargs):
  File "/home/jpivarski/irishep/awkward/src/awkward/_errors.py", line 85, in __exit__
    self.handle_exception(exception_type, exception_value)
  File "/home/jpivarski/irishep/awkward/src/awkward/_errors.py", line 95, in handle_exception
    raise self.decorate_exception(cls, exception)
ValueError: stops[i] > len(content) while attempting to get index 12 (in compiled code: https://github.com/scikit-hep/awkward/blob/awkward-cpp-26/awkward-cpp/src/cpu-kernels/awkward_ListArray_broadcast_tooffsets.cpp#L20)

This error occurred while calling

    ak.drop_none(
        <Array [[1, 1, 0], [1], [0, 0, 0, 0]] type='3 * var * ?int64'>
        axis = -1
    )
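A possible reading of the message, not verified against the implementation: if the masked-out entries have already been removed from the content by the time the ListArray is converted to offsets, the original stops would still point into the longer, uncompacted buffer. The plain-Python sketch below only checks that arithmetic using the numbers from the reproducer; it does not call Awkward Array, and the variable names are illustrative.

```python
# Values copied from the reproducer above.
starts = [0, 4, 8]
stops = [3, 5, 12]
mask = [0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0]  # valid_when=False: 0 means valid

# Length the content would have once masked-out entries are removed.
compacted_length = sum(1 for m in mask if m == 0)

# Any original stop beyond that length would fail a bounds check.
out_of_range = [stop for stop in stops if stop > compacted_length]

# Offsets describing the same lists over the compacted content:
# count the valid entries inside each [start, stop) window and accumulate.
offsets = [0]
for start, stop in zip(starts, stops):
    valid_in_list = sum(1 for m in mask[start:stop] if m == 0)
    offsets.append(offsets[-1] + valid_in_list)

print(compacted_length)  # 8
print(out_of_range)      # [12]
print(offsets)           # [0, 3, 4, 8]
```

The out-of-range stop is 12, the same index the kernel reports, and the recomputed offsets describe lists of length 3, 1 and 4, which matches the result of ak.drop_none(array) with the default axis. Whether this is the actual cause is an assumption.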
I've started working on this, but it's tricky to reason about the various ways this function can act. This is not a localised problem; reasoning about axes is always tricky. I'm just dropping my working thoughts for now:
As I currently understand it, axis=X is orthogonal to record structure, so we should treat var * [x * ..., y * ...] no differently from var * x * ... in general. Yet we do not permit axis=0 for record arrays, because dropping there may shift the field structure if x and y do not have missing values in the same place.
We need to act in two places: at axis == depth to fix lists, and at axis == depth - 1 to drop the missing values. The following content types need to be considered in the latter case:
option-like = option | record[option-like] | union[option-like]
Right now we don't consider all of these possible branches.
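As a rough illustration of the grammar above, a recursive predicate could look like the sketch below. It is not the check that ak.drop_none performs. It assumes layout nodes expose is_option, is_record, is_union and contents (Awkward 2.x layouts appear to), and it assumes that a record or union counts as option-like when any of its contents does, which the grammar leaves open. The node helper builds hypothetical stand-in objects so the sketch runs without constructing real layouts.

```python
from types import SimpleNamespace


def is_option_like(layout):
    """Return True if `layout` is option-like under the grammar above.

    Assumption: a record or union is option-like when *any* of its
    contents is option-like; the grammar does not say "any" versus "all".
    """
    if getattr(layout, "is_option", False):
        return True
    if getattr(layout, "is_record", False) or getattr(layout, "is_union", False):
        return any(is_option_like(c) for c in layout.contents)
    return False


def node(kind, contents=()):
    # Hypothetical stand-in for a layout node; not an Awkward Array class.
    return SimpleNamespace(
        is_option=(kind == "option"),
        is_record=(kind == "record"),
        is_union=(kind == "union"),
        contents=list(contents),
    )


leaf = node("leaf")
option = node("option")
record_with_option = node("record", [leaf, option])
plain_record = node("record", [leaf, leaf])
union_of_records = node("union", [plain_record, record_with_option])

print(is_option_like(option))              # True
print(is_option_like(record_with_option))  # True
print(is_option_like(union_of_records))    # True
print(is_option_like(plain_record))        # False
```

Choosing "any" rather than "all" matters for exactly the record case discussed above: a record whose fields have missing values in different places is option-like under "any", which is where dropping values can shift the field structure.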