Andrew Duffy
Blocked on https://github.com/apache/datafusion/pull/11518 and https://github.com/apache/arrow-rs/pull/6077
Overall goal is to be able to turn on `#[deny(missing_docs)]` for every crate
Currently we implement `IntoArrayVariant` for all `T: IntoCanonical` by first calling `into_canonical` and then downcasting. This hides that there is a potentially expensive decoding step going on. We should either...
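To make the concern concrete, here is a minimal sketch of the blanket-impl pattern described above. All types are simplified, hypothetical stand-ins (a two-variant `Canonical` enum, a `RunLength` array, `String` errors), not the project's real definitions; only the trait names `IntoCanonical` and `IntoArrayVariant` and the method `into_canonical` come from the issue text.

```rust
// Hypothetical, simplified stand-ins; not the project's real definitions.

/// A fully decoded ("canonical") array. Only two variants, for brevity.
#[derive(Debug, PartialEq)]
pub enum Canonical {
    Primitive(Vec<i64>),
    Bool(Vec<bool>),
}

/// Conversion into the canonical form; may involve a full decode.
pub trait IntoCanonical {
    fn into_canonical(self) -> Result<Canonical, String>;
}

/// Convenience accessor for one specific canonical variant.
pub trait IntoArrayVariant {
    fn into_primitive(self) -> Result<Vec<i64>, String>;
}

/// Blanket impl: every `IntoCanonical` type gets `into_primitive`,
/// which canonicalizes first and then downcasts.
impl<T: IntoCanonical> IntoArrayVariant for T {
    fn into_primitive(self) -> Result<Vec<i64>, String> {
        match self.into_canonical()? {
            Canonical::Primitive(values) => Ok(values),
            other => Err(format!("expected Primitive, got {other:?}")),
        }
    }
}

/// A run-length encoded array stored as (value, run length) pairs.
pub struct RunLength(pub Vec<(i64, usize)>);

impl IntoCanonical for RunLength {
    fn into_canonical(self) -> Result<Canonical, String> {
        // Work proportional to the decoded length. Nothing at an
        // `into_primitive` call site signals that this happens.
        let mut out = Vec::new();
        for (value, len) in self.0 {
            out.extend(std::iter::repeat(value).take(len));
        }
        Ok(Canonical::Primitive(out))
    }
}
```

In this sketch, `RunLength(vec![(7, 3)]).into_primitive()` reads like a cheap downcast, yet it expands every run before returning.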
We're removing until DataFusion has better support, see https://github.com/apache/datafusion/issues/10918
Adds `scalars_dtype` to `ExtDType`. This PR adds `scalars_dtype` (alternative name options: `canonical_dtype`, `storage_dtype`) to `ExtDType`. This is desirable for a few reasons:
* Makes it possible to canonicalize an empty...
Supplants #476. VarBinView is the new canonical representation for string types across the repo. There are still many places that natively use VarBin arrays internally; we can replace those over...
Useful for the compressor to decide if Dict compression is worthwhile. There's a Rust crate already implementing it: https://docs.rs/hyperloglogplus/latest/hyperloglogplus/struct.HyperLogLogPlus.html Can be used:
- At compress time: determine if Dict is worth...
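As a rough illustration of the idea, the sketch below implements a small, classic HyperLogLog estimator using only the standard library, so it does not depend on the linked crate or mirror its API. The `HyperLogLog` struct, the `dict_looks_worthwhile` helper, and the distinct-ratio threshold are all hypothetical names and assumptions chosen for the example; a real compressor would likely use the crate and a tuned cost model.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Minimal HyperLogLog sketch (hypothetical; not the hyperloglogplus API).
pub struct HyperLogLog {
    precision: u32,     // number of hash bits used as the register index
    registers: Vec<u8>, // 2^precision registers holding the max rank seen
}

impl HyperLogLog {
    pub fn new(precision: u32) -> Self {
        assert!((4..=16).contains(&precision), "precision must be in 4..=16");
        Self { precision, registers: vec![0; 1 << precision] }
    }

    pub fn insert<T: Hash>(&mut self, value: &T) {
        let mut hasher = DefaultHasher::new();
        value.hash(&mut hasher);
        let hash = hasher.finish();
        // Top `precision` bits select the register.
        let index = (hash >> (64 - self.precision)) as usize;
        // Rank = position of the first 1 bit in the remaining bits.
        let rest = hash << self.precision;
        let max_rank = 64 - self.precision + 1;
        let rank = (rest.leading_zeros() + 1).min(max_rank) as u8;
        if rank > self.registers[index] {
            self.registers[index] = rank;
        }
    }

    /// Estimated number of distinct values inserted so far.
    pub fn estimate(&self) -> f64 {
        let m = self.registers.len() as f64;
        let alpha = match self.registers.len() {
            16 => 0.673,
            32 => 0.697,
            64 => 0.709,
            _ => 0.7213 / (1.0 + 1.079 / m),
        };
        let sum: f64 = self.registers.iter().map(|&r| 2f64.powi(-(r as i32))).sum();
        let raw = alpha * m * m / sum;
        let zeros = self.registers.iter().filter(|&&r| r == 0).count();
        if raw <= 2.5 * m && zeros > 0 {
            // Small-range correction: fall back to linear counting.
            m * (m / zeros as f64).ln()
        } else {
            raw
        }
    }
}

/// Hypothetical heuristic: treat Dict as worthwhile when the estimated
/// share of distinct values is at most `max_distinct_ratio`.
pub fn dict_looks_worthwhile(distinct: f64, len: usize, max_distinct_ratio: f64) -> bool {
    len > 0 && distinct / len as f64 <= max_distinct_ratio
}
```

The estimate is approximate by design: with precision 12 (4096 one-byte registers) the expected relative error is on the order of a couple of percent, which is usually enough to separate "mostly repeated" from "mostly unique" columns.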
Currently we pass around `EncodingRef` to represent an encoding. It's a type alias for `&'static dyn Encoding`. Most of our encodings are zero-sized structs, so this is pretty trivial to...
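For readers unfamiliar with the pattern, here is a minimal sketch of what such an alias over zero-sized encoding structs can look like. Only the names `EncodingRef` and `Encoding` come from the issue text; the `id` method, the two example structs, the id strings, and the `describe` helper are hypothetical and chosen purely for illustration.

```rust
/// Hypothetical, pared-down version of the trait; the real one
/// presumably has more methods.
pub trait Encoding: Sync {
    fn id(&self) -> &'static str;
}

/// The alias described above: a reference to a statically allocated
/// trait object.
pub type EncodingRef = &'static dyn Encoding;

/// Zero-sized encoding structs: they carry no data, only behavior.
pub struct DictEncoding;
pub struct PrimitiveEncoding;

impl Encoding for DictEncoding {
    fn id(&self) -> &'static str {
        "example.dict"
    }
}

impl Encoding for PrimitiveEncoding {
    fn id(&self) -> &'static str {
        "example.primitive"
    }
}

/// One static instance per encoding, so `&DICT_ENCODING` is `'static`.
pub static DICT_ENCODING: DictEncoding = DictEncoding;
pub static PRIMITIVE_ENCODING: PrimitiveEncoding = PrimitiveEncoding;

/// Callers only ever see the trait object, never the concrete struct.
pub fn describe(encoding: EncodingRef) -> String {
    format!("encoding {}", encoding.id())
}
```

Because the structs are zero-sized, each `EncodingRef` is just a data pointer plus a vtable pointer, and dispatch goes through the vtable.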