
Python: Rechunk/Take by passing chunked array into `__getitem__`

Open kylebarron opened this issue 1 year ago • 1 comments

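For example, you should be able to index a table with a chunked array of row indices and get back a table that is chunked the same way as the indices: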

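import pyarrow as pa

# geo_table is assumed to be an existing geoarrow-rs table with at least 7 rows
chunk1 = pa.array([1, 2, 3])
chunk2 = pa.array([4, 5, 6])
chunked_array = pa.chunked_array([chunk1, chunk2])
# proposed behavior: take rows by index, following the chunking of the indices
rechunked = geo_table[chunked_array]
# output table has two chunks with three rows each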

kylebarron, Jan 24 '24 01:01

It might also make sense to accept a list of slices:

geo_table[0:3, 3:6]

That said, I'm not overly excited about that syntax, because it reads too much like array slicing.

kylebarron, Jan 24 '24 01:01

I don't like this API anymore. Explicit methods would be clearer than overloading `__getitem__`. Regardless, these general rechunk/take methods are now implemented in arro3.

kylebarron, Aug 27 '24 02:08