
ak.stack, ak.unstack

Open nsmith- opened this issue 5 years ago • 2 comments

Proposing a new structure operation, akin to pandas' stack and unstack, which pivots a record structure into a jagged array and vice versa. The main difference is that we do not have a non-trivial row index, so the ak.unstack operation would create a tuple RecordArray rather than labeled columns, as is done for structure operations like ak.cross. Similarly, it might make sense to require that ak.unstack only operate on a tuple RecordArray. As in pandas, the default axis (level) should probably be -1 for this operation.

Examples:

a = ak.Array([[(1, 2), (1, 3), (2, 3)], [], [(4, 5), (6, 7, 8)]])
assert ak.tolist(a.stack()) == [[[1, 2], [1, 3], [2, 3]], [], [[4, 5], [6, 7, 8]]]
assert ak.tolist(a.stack(dropna=False)) == [[[1, 2, None], [1, 3, None], [2, 3, None]], [], [[4, 5, None], [6, 7, 8]]]

a = ak.Array([[[1, 2], [1, 3], [2, 3]], [], [[4, 5], [6]]])
assert ak.tolist(a.unstack()) == [[(1, 2), (1, 3), (2, 3)], [], [(4, 5), (6, None)]]

a = ak.Array([{'x': (2, 3), 'y': 1}, {'x': (4, 5), 'y': 0}])
assert ak.tolist(a.stack()) == [{'x': [2, 3], 'y': [[0, 1], [2, 3]]}, {'x': [4, 5], 'y': [[0, 1], [2, 3]]}]

# for
a = ak.zip({'x': [[1, 2, 3], [4, 5, 6, 7]], 'y': [[1, 2, 3], [4, 5, 6, 7]]})
# the following
b = ak.choose(a, 3).i0 + ak.choose(a, 3).i1 + ak.choose(a, 3).i2
# could be written
b = ak.sum(ak.stack(ak.choose(a, 3)), -1)
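
The first two examples can be mimicked in plain Python on nested lists, which may help pin down the intended semantics. This is only a sketch of the proposed behavior, not an Awkward Array implementation: the helper names (`stack`, `unstack`, `_max_tuple_width`, `_depth`, `_innermost_lists`) are hypothetical, Python tuples stand in for tuple records, only the default axis of -1 is handled, and records with named fields are not covered.

```python
def _max_tuple_width(data):
    """Width of the widest tuple found anywhere in a nested list."""
    if isinstance(data, tuple):
        return len(data)
    if isinstance(data, list):
        return max((_max_tuple_width(item) for item in data), default=0)
    return 0


def stack(data, dropna=True, _width=None):
    """Turn every tuple into a list (tuple record -> jagged list).

    With dropna=False, shorter tuples are padded with None up to the
    width of the widest tuple, as in the dropna=False example.
    """
    if _width is None:
        _width = _max_tuple_width(data)
    if isinstance(data, tuple):
        row = list(data)
        if not dropna:
            row += [None] * (_width - len(row))
        return row
    if isinstance(data, list):
        return [stack(item, dropna, _width) for item in data]
    return data


def _depth(data):
    """Maximum list nesting depth; a scalar has depth 0."""
    if isinstance(data, list):
        return 1 + max((_depth(item) for item in data), default=0)
    return 0


def _innermost_lists(node, remaining):
    """Yield the lists that sit `remaining` levels below `node`."""
    if remaining == 0:
        yield node
    else:
        for child in node:
            yield from _innermost_lists(child, remaining - 1)


def unstack(data):
    """Turn every innermost list into a tuple, padding with None so
    that all tuples share the width of the longest innermost list."""
    levels = _depth(data) - 1
    width = max((len(row) for row in _innermost_lists(data, levels)), default=0)

    def convert(node, remaining):
        if remaining == 0:
            return tuple(node) + (None,) * (width - len(node))
        return [convert(child, remaining - 1) for child in node]

    return convert(data, levels)
```

Calling `stack` and `unstack` on the plain-list versions of the inputs above (without the `ak.Array` wrapper) gives the same nested lists that the assertions expect. The depth is inferred from the deepest branch, so an empty outer list such as the `[]` in the examples is left alone rather than turned into an empty tuple.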

nsmith- Apr 02 '20 19:04

I'm not sure if you've been seeing this, but it was asked for.

https://github.com/scikit-hep/awkward-1.0/blob/docs/0198-tutorial-documentation-1/studies/how-to-questions-survey.md

Thanks for making an issue!

jpivarski Apr 02 '20 20:04

Interestingly, ak.concatenate with axis=-1 would be implementable with ak.stack(ak.zip([a, b, c])). For other axes, I think the more generalized pandas pivot method would need to be implemented: it can map an arbitrary set of column index levels to new row index levels.
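
A plain-Python sketch of that equivalence, under two assumptions that the comment does not spell out: the inputs are flat lists of equal length, and "concatenate with axis=-1" is read as joining the inputs along a new trailing axis. Both function names are hypothetical stand-ins, not Awkward Array calls.

```python
def zip_then_stack(*arrays):
    """Stand-in for ak.stack(ak.zip([a, b, c])) on flat, equal-length lists."""
    zipped = list(zip(*arrays))           # like ak.zip: one tuple per position
    return [list(row) for row in zipped]  # like the proposed ak.stack: tuple -> list


def concatenate_last_axis(*arrays):
    """Give each input a trailing length-1 axis, then join along that axis."""
    columns = [[[x] for x in arr] for arr in arrays]
    return [sum(rows, []) for rows in zip(*columns)]


a, b, c = [1, 2, 3], [4, 5, 6], [7, 8, 9]
same = zip_then_stack(a, b, c) == concatenate_last_axis(a, b, c)
```

Under those assumptions, both routes produce one row per position holding the corresponding elements of `a`, `b`, and `c`, so `same` is `True` here.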

nsmith- Apr 03 '20 19:04