consider changing take for varbin

Open a10y opened this issue 2 months ago • 1 comments

from a user report:

Trying to make sense of this:
- take on a chunked struct array: ~600ms
- converting chunked array to arrow record batch and doing take: ~25ms

what seems likely to be happening is that varbin.to_arrow is casting the varbin to a varbinview (relatively cheap) and then doing take on the views (extremely cheap).

we should consider killing the intrinsic varbin take impl and just converting to varbinview on take
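
To illustrate the cost difference described above, here is a minimal, std-only sketch. The `VarBin` and `VarBinView` structs below are simplified stand-ins invented for this example, not the Vortex types: the view is reduced to an `(offset, len)` pair into one shared buffer, whereas real view layouts also inline short values. Under those assumptions, take on offsets has to copy every selected value's bytes, while take on views only gathers fixed-size entries.

```rust
use std::sync::Arc;

/// Offsets-based layout (VarBin-like, hypothetical): one byte buffer plus n + 1 offsets.
struct VarBin {
    offsets: Vec<usize>,
    bytes: Vec<u8>,
}

/// View-based layout (VarBinView-like, hypothetical and simplified):
/// each view is an (offset, len) pair into a shared, reference-counted buffer.
struct VarBinView {
    views: Vec<(usize, usize)>,
    bytes: Arc<Vec<u8>>,
}

impl VarBin {
    fn from_strs(values: &[&str]) -> Self {
        let mut offsets = vec![0];
        let mut bytes = Vec::new();
        for v in values {
            bytes.extend_from_slice(v.as_bytes());
            offsets.push(bytes.len());
        }
        VarBin { offsets, bytes }
    }

    fn get(&self, i: usize) -> &[u8] {
        &self.bytes[self.offsets[i]..self.offsets[i + 1]]
    }

    /// Take on offsets: the bytes of every selected value are copied into a new buffer.
    fn take(&self, indices: &[usize]) -> VarBin {
        let mut offsets = vec![0];
        let mut bytes = Vec::new();
        for &i in indices {
            bytes.extend_from_slice(self.get(i));
            offsets.push(bytes.len());
        }
        VarBin { offsets, bytes }
    }

    /// Conversion to views: one pass over the offsets; the byte buffer is moved, not copied.
    fn into_view(self) -> VarBinView {
        let views = self.offsets.windows(2).map(|w| (w[0], w[1] - w[0])).collect();
        VarBinView { views, bytes: Arc::new(self.bytes) }
    }
}

impl VarBinView {
    fn get(&self, i: usize) -> &[u8] {
        let (offset, len) = self.views[i];
        &self.bytes[offset..offset + len]
    }

    /// Take on views: only the fixed-size views are gathered; the buffer is shared.
    fn take(&self, indices: &[usize]) -> VarBinView {
        VarBinView {
            views: indices.iter().map(|&i| self.views[i]).collect(),
            bytes: Arc::clone(&self.bytes),
        }
    }
}
```

In this sketch, `array.into_view().take(&indices)` yields the same values as `array.take(&indices)` without touching the string bytes, which is consistent with the timing gap in the user report, though the sketch does not measure anything.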

a10y, Oct 16 '25 11:10

We've also run into ChunkedArray::take being really bad in the "shuffle" case where the indices aren't sorted, potentially creating O(len(take_array)) chunks.
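
A rough sketch of why the shuffle case can blow up, assuming a naive chunked take that emits one output chunk per run of consecutive indices landing in the same input chunk. That strategy is an assumption made for illustration, and `count_output_chunks` is a hypothetical helper, not a Vortex API.

```rust
/// Hypothetical helper: number of output chunks produced if a chunked `take`
/// emits one chunk per run of consecutive indices that fall in the same input chunk.
fn count_output_chunks(chunk_lens: &[usize], indices: &[usize]) -> usize {
    // Exclusive end offset of each input chunk, e.g. [4, 4, 4] -> [4, 8, 12].
    let ends: Vec<usize> = chunk_lens
        .iter()
        .scan(0, |acc, &len| {
            *acc += len;
            Some(*acc)
        })
        .collect();

    let mut runs = 0;
    let mut prev_chunk = None;
    for &idx in indices {
        // Binary search for the first chunk whose end offset is greater than the index.
        let chunk = ends.partition_point(|&end| end <= idx);
        assert!(chunk < ends.len(), "index out of bounds");
        // A new output chunk starts whenever the source chunk changes.
        if prev_chunk != Some(chunk) {
            runs += 1;
            prev_chunk = Some(chunk);
        }
    }
    runs
}
```

With three input chunks of four rows each, sorted indices `0..12` give 3 runs, while an interleaved order such as `[0, 4, 8, 1, 5, 9, ...]` gives one run per index, which is the O(len(take_array)) behaviour mentioned above.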

AdamGS, Oct 16 '25 11:10