Andrew Duffy
Some options:
* `canonicalize_to_primitive`
* `into_canonical_primitive`
* `to_canonical_primitive`
* `to_primitive`

Another option is to replace methods with top-level function calls, e.g. `canonical::to_primitive(array)`.
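To make the trade-off concrete, here is a toy sketch of the two API shapes under discussion. The types (`Array`, `PrimitiveArray`) and bodies are placeholders, not the actual Vortex API; only the naming shapes matter.

```rust
// Toy stand-ins for the real array types; not the Vortex API.
struct Array {
    values: Vec<i64>,
}

struct PrimitiveArray {
    values: Vec<i64>,
}

impl Array {
    // Method style, e.g. `to_canonical_primitive` (an `into_` variant
    // would consume `self` instead of borrowing).
    fn to_canonical_primitive(&self) -> PrimitiveArray {
        PrimitiveArray {
            values: self.values.clone(),
        }
    }
}

// Free-function style: `canonical::to_primitive(array)`.
mod canonical {
    use super::{Array, PrimitiveArray};

    pub fn to_primitive(array: &Array) -> PrimitiveArray {
        array.to_canonical_primitive()
    }
}

fn main() {
    let a = Array { values: vec![1, 2, 3] };
    // Both spellings produce the same result; the question is purely
    // which call-site reads better.
    let p1 = a.to_canonical_primitive();
    let p2 = canonical::to_primitive(&a);
    assert_eq!(p1.values, p2.values);
    println!("ok");
}
```

The free-function style keeps the `Array` surface smaller and groups all canonicalization entry points under one module, at the cost of a less discoverable call site.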
Capturing from Slack: currently our TableProvider's pushdown is bottlenecked by `take(varbin)`. DataFusion defers to Arrow's `filter_bytes` function to turn the predicate mask into a new `ArrayRef`: https://github.com/apache/arrow-rs/blob/920a94470db04722c74b599a227f930946d0da80/arrow-select/src/filter.rs#L660-L689 We want to...
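For readers outside the Slack thread, the two selection strategies in play (applying a boolean predicate mask directly versus materializing indices and gathering) can be sketched in plain Rust. Plain `Vec`s stand in for Arrow arrays here; this is not the arrow-rs API, just an illustration of the equivalence.

```rust
// Apply a boolean mask directly (what `filter` does).
fn filter(values: &[&str], mask: &[bool]) -> Vec<String> {
    values
        .iter()
        .zip(mask)
        .filter_map(|(v, &keep)| keep.then(|| v.to_string()))
        .collect()
}

// Gather rows by index (what `take` does).
fn take(values: &[&str], indices: &[usize]) -> Vec<String> {
    indices.iter().map(|&i| values[i].to_string()).collect()
}

// Convert a predicate mask into the indices of its set positions.
fn mask_to_indices(mask: &[bool]) -> Vec<usize> {
    mask.iter()
        .enumerate()
        .filter_map(|(i, &keep)| keep.then_some(i))
        .collect()
}

fn main() {
    let values = ["a", "bb", "ccc", "dddd"];
    let mask = [true, false, true, false];

    // Both routes select the same rows; the performance question is
    // which route is cheaper for variable-width binary data.
    let filtered = filter(&values, &mask);
    let taken = take(&values, &mask_to_indices(&mask));
    assert_eq!(filtered, taken);
    assert_eq!(filtered, vec!["a", "ccc"]);
    println!("ok");
}
```

For variable-width (varbin) data both routes must rebuild the offsets buffer and copy the selected bytes, which is why this step shows up as the bottleneck.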
Yeah, even bumping PyArrow from 15 to 17 (latest) did not seem to change that.
Blocked on https://github.com/apache/arrow-rs/pull/6368
Converting back to draft while this is blocked
Time for a take3 PR 🥲
Moving to draft to clean things up. @joseph-isaacs We use Sparse as a standalone compressor in vortex-btrblocks, so I'd expect it to be more frequent in our encoding trees.
@gatesn curious for your thoughts, as I'm hitting some friction points in this refactor. I have the `ExtensionType` trait defined, which is the in-memory, deserialized/owned version of an extension type....
I think you still need to support conversion to an Arrow schema outside of `ToArrow`, e.g. for DataFusion?
I think this is currently blocked on https://github.com/astral-sh/uv/issues/5903. The alternative is that we just don't use the scripts.