torcharrow
Eliminate offset and length in BaseColumn
We should remove the _offset and _length fields from BaseColumn:
https://github.com/facebookresearch/torcharrow/blob/d680bfdc0f6a6bb6c3a29c2a67d62006782d6558/csrc/velox/column.h#L223-L224
There are multiple places where these fields are not tracked properly, such as expression evaluation:
https://github.com/facebookresearch/torcharrow/blob/d680bfdc0f6a6bb6c3a29c2a67d62006782d6558/csrc/velox/column.cpp#L236-L238
We should be able to stop tracking these in BaseColumn without losing any functionality.
We may also want to support UDF evaluation on columns with different offsets, such as:
import torcharrow as ta
a = ta.Column([1, 2, 3])
b = ta.Column([10, 20, 30])
a[:2] + b[1:]
Slicing the vector with the BufferView might be the right solution.
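To illustrate the idea, here is a minimal sketch in plain Python. It is not the torcharrow or Velox API: `ViewColumn` and `to_list` are hypothetical names, and Python's built-in `memoryview` stands in for a BufferView. The point is that when a slice is a zero-copy window over the underlying buffer, the column itself does not need to store an offset or a length, and slices taken at different offsets can be combined directly.

```python
from array import array


class ViewColumn:
    """Hypothetical stand-in for a column; not the torcharrow or Velox API."""

    def __init__(self, values):
        # Accept either an existing window (memoryview) or any iterable of ints.
        if isinstance(values, memoryview):
            self._data = values
        else:
            self._data = memoryview(array("q", values))

    def __len__(self):
        return len(self._data)

    def __getitem__(self, key):
        if isinstance(key, slice):
            # Slicing a memoryview is zero-copy: the new view knows its own
            # start and length, so the column keeps no _offset/_length fields.
            return ViewColumn(self._data[key])
        return self._data[key]

    def __add__(self, other):
        if len(self) != len(other):
            raise ValueError("columns must have the same length")
        # Element-wise add; each operand already starts at its own offset.
        return ViewColumn(x + y for x, y in zip(self._data, other._data))

    def to_list(self):
        return self._data.tolist()


a = ViewColumn([1, 2, 3])
b = ViewColumn([10, 20, 30])
result = a[:2] + b[1:]  # equal-length slices starting at offsets 0 and 1
print(result.to_list())
```

In this sketch the element-wise operation never looks at an offset; it only sees two equal-length windows. Whether the real BufferView can be used the same way inside expression evaluation is the open question here.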
cc: @wenleix