
Fix: Lengths of empty regular slices

Open ioanaif opened this issue 2 years ago • 2 comments

#1557

ioanaif avatar Jul 26 '22 19:07 ioanaif

Codecov Report

Merging #1568 (d47b7e8) into main (9e17f29) will increase coverage by 0.00%. The diff coverage is 31.42%.

Impacted Files Coverage Δ
src/awkward/_v2/_connect/cuda/__init__.py 0.00% <0.00%> (ø)
src/awkward/_v2/_connect/numexpr.py 88.40% <0.00%> (ø)
src/awkward/_v2/_connect/pyarrow.py 88.46% <0.00%> (ø)
src/awkward/_v2/contents/bytemaskedarray.py 88.82% <0.00%> (ø)
src/awkward/_v2/contents/indexedarray.py 73.83% <0.00%> (ø)
src/awkward/_v2/contents/indexedoptionarray.py 89.14% <0.00%> (ø)
src/awkward/_v2/contents/listoffsetarray.py 81.85% <0.00%> (ø)
src/awkward/_v2/contents/unionarray.py 86.27% <0.00%> (ø)
src/awkward/_v2/highlevel.py 71.01% <ø> (+0.24%) :arrow_up:
src/awkward/_v2/numba.py 93.47% <0.00%> (ø)
... and 11 more

codecov[bot] avatar Jul 26 '22 19:07 codecov[bot]

What if the sliced dimension is not first?

Good point, I assumed in an empty slice that it would go first.

Why is "start is None or stop is None" the condition for determining whether there should be a RegularArray?

This was the pattern I found from the tests: zeros_length needs to be adjusted only when a slice with these properties is passed.

The way to find out is to see what NumPy does: fortunately this test doesn't depend on dimensions being irregular. You can make a NumPy array with a lot of dimensions and try slices on it with various combinations of : and [0, 1, 2], and [] to see what happens to the output shape.
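A minimal sketch of that experiment, using only NumPy. The array shape and the names a, idx, and empty are illustrative choices, not taken from the PR. It suggests that an empty index array zeroes out only the dimension it is applied to, wherever that dimension sits, unless a slice separates two index arrays, in which case NumPy moves the indexed dimension to the front.

```python
import numpy as np

# Illustrative array; the shape is arbitrary, chosen so every axis has a distinct length.
a = np.zeros((3, 4, 5, 6))
idx = [0, 1, 2]
empty = np.array([], dtype=np.int64)

shapes = {
    "a[:, idx]": a[:, idx].shape,                  # (3, 3, 5, 6)
    "a[idx, :]": a[idx, :].shape,                  # (3, 4, 5, 6)
    "a[:, empty]": a[:, empty].shape,              # (3, 0, 5, 6)
    "a[empty, :]": a[empty, :].shape,              # (0, 4, 5, 6)
    "a[1:, empty]": a[1:, empty].shape,            # (2, 0, 5, 6)
    "a[:-3, empty]": a[:-3, empty].shape,          # (0, 0, 5, 6)
    "a[:, empty, empty]": a[:, empty, empty].shape,  # (3, 0, 6)
    # Index arrays separated by a slice: the indexed dimension goes first.
    "a[empty, :, empty]": a[empty, :, empty].shape,  # (0, 4, 6)
}

for expr, shape in shapes.items():
    print(f"{expr:20s} -> {shape}")
```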

I noticed this behaviour is only consistent when slice.start/slice.stop is None. Here are some of the NumPy tests I did:

>>> d = np.arange(3 * 3 * 2).reshape(3,3,2)
>>> e = ak._v2.contents.NumpyArray(d)
>>> c1 = np.array([], np.int64)
>>> ak._v2.to_list(d[[2],[1],c1]) == ak._v2.to_list(e[[2],[1],c1]) == []
True
>>> d = np.arange(1 * 3 * 3 * 2).reshape(1,3,3,2)
>>> e = ak._v2.contents.NumpyArray(d)
>>> ak._v2.to_list(d[c1,c1]) == ak._v2.to_list(e[c1,c1]) == []
True
>>> d = np.arange(2 * 3).reshape(2, 3)
>>> e = ak._v2.contents.NumpyArray(d)
>>> ak._v2.to_list(d[:,[]]) == ak._v2.to_list(e[:,[]]) == [[], []]
True
>>> ak._v2.to_list(d[1:,[]]) == ak._v2.to_list(e[1:,[]]) == [[]]
True
>>> ak._v2.to_list(d[:-2,[]]) == ak._v2.to_list(e[:-2,[]]) == []
True
>>> ak._v2.to_list(d[:-1,[]]) == ak._v2.to_list(e[:-1,[]]) == [[]]
True
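The slice cases above can also be checked against the slice's resolved length rather than against whether start or stop is None. The sketch below is NumPy-only and is not the code in this PR; sliced_length is a hypothetical helper introduced for illustration. It relies on the standard slice.indices method, which resolves None and negative bounds against a given length.

```python
import numpy as np

def sliced_length(s, length):
    """Hypothetical helper: number of items slice `s` selects from a dimension of `length`."""
    # slice.indices returns concrete (start, stop, step) for the given length.
    return len(range(*s.indices(length)))

d = np.arange(2 * 3).reshape(2, 3)
empty = np.array([], dtype=np.int64)

cases = [
    slice(None),      # d[:, []]
    slice(1, None),   # d[1:, []]
    slice(None, -2),  # d[:-2, []]
    slice(None, -1),  # d[:-1, []]
    slice(0, 1),      # both bounds given
    slice(1, 2),      # both bounds given
]

for s in cases:
    # The outer length NumPy reports matches the resolved slice length in every case.
    assert d[s, empty].shape == (sliced_length(s, d.shape[0]), 0)
```

In this small experiment the outer length matches the resolved slice length even when both bounds are given, so comparing against it may be a useful cross-check for the None-based condition.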

ioanaif avatar Jul 28 '22 13:07 ioanaif