Lengths of empty regular slices (in v2)
Version of Awkward Array
HEAD
Description and code to reproduce
At least for v2 and variable-length arrays, the empty-slice edge case gives the right result:
>>> import awkward._v2 as ak
>>> ak.Array([[1, 2, 3], [4, 5]])[:, []]
<Array [[], []] type='2 * var * int64'>
but for v2 regular arrays, the result wrongly has length zero (the outer length of 2 is lost):
>>> ak.to_regular(ak.Array([[1, 2, 3], [4, 5, 6]]), axis=1)[:, []]
<Array [] type='0 * 3 * int64'>
What we want is for this slice to build a RegularArray with an explicit zeros_length
argument, which is the only way to make a RegularArray that has non-zero length yet contains lists of zero length:
>>> import numpy as np
>>> ak.Array(
... ak.contents.RegularArray(
... ak.contents.NumpyArray(np.arange(1, 7)),
... size=0,
... zeros_length=2,
... )
... )
<Array [[], []] type='2 * 0 * int64'>
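To illustrate why the explicit argument is needed, here is a minimal sketch in plain Python. It is not Awkward's actual code: `regular_length` is a hypothetical helper, and it assumes the length of a regular array is derived as the content length divided by `size`, which is undefined when `size` is 0.

```python
def regular_length(content_length, size, zeros_length=0):
    """Hypothetical helper: outer length of a regular array.

    Assumption: for size > 0 the length follows from the content,
    but for size == 0 it cannot be derived (division by zero),
    so it has to be supplied explicitly via zeros_length.
    """
    if size == 0:
        return zeros_length
    return content_length // size


# Six content items grouped in lists of 3 give two lists.
print(regular_length(6, 3))                  # 2
# With size=0 the content says nothing about the length...
print(regular_length(6, 0))                  # 0
# ...so the slice has to carry the original length through explicitly.
print(regular_length(6, 0, zeros_length=2))  # 2
```

Under that assumption, a `[:, []]` slice that forgets to pass the outer length along ends up with length 0, which matches the `0 * 3 * int64` symptom above.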
Pointed out by @grst.
v1 arrays are also incorrect, but this is an edge-case bug that only really needs to get fixed in v2. (v1 has only 4 more months left...)
@grst, since you say that this is a blocker, I moved it up in the priority queue. Normally, an error about "What is the exact type of an array that doesn't contain any data?" would not be a high priority, but presumably it is for you because you need to make assumptions about that type to fit it into AnnData.
We have some tests for those edge cases that fail currently. So while it blocks merging the PR, it does not block continuing development.
Thanks for looking into this, but no hurry! I am on vacation from tomorrow on and @giovp also said he currently doesn't have time to focus on the AnnData PR.
I found another slicing edge case that is not fixed by the linked PR yet. I don't know if it's related or a separate issue, though:
Expected NumPy behaviour:
np1 = np.ones((5, 7))
np1[:, []]
# array([], shape=(5, 0), dtype=float64)
np1[[], :]
# array([], shape=(0, 7), dtype=float64)
Awkward Array behaviour:
a1 = ak.Array(np.ones((5, 7)))
a1[:, []]
# <Array [] type='0 * 7 * float64'>
a1[[], :]
# <Array [] type='0 * 7 * float64'>
Version:
v2 API, package installed with pip install git+https://github.com/scikit-hep/awkward/@ioanaif/fix-lengths-of-empty-regular-slices-1557
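For reference, the expected shapes can be checked with NumPy alone. This is only a sketch: `as_type_string` is a hypothetical helper that formats a NumPy shape in the style of the Awkward type strings shown above, to spell out what the two slices would presumably look like once fixed.

```python
import numpy as np


def as_type_string(array):
    """Hypothetical helper: render shape and dtype like '5 * 7 * float64'."""
    return " * ".join([str(dim) for dim in array.shape] + [str(array.dtype)])


np1 = np.ones((5, 7))

empty_columns = np1[:, []]  # keep all 5 rows, select no columns
empty_rows = np1[[], :]     # select no rows, keep all 7 columns

print(as_type_string(empty_columns))  # 5 * 0 * float64
print(as_type_string(empty_rows))     # 0 * 7 * float64
```

Only the first slice differs from the Awkward output above; the second one already matches.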
Hi! I just added this corner-case in the tests for the linked PR and it successfully passed. All empty slice cases should be covered now.
I confirm this works with your branch @ioanaif! I had run pip without --force-reinstall, so I actually tested against the old version before.