vortex
A toolkit for working with compressed array data
There's a lot of cleanup to do here. This performs pruning, but we don't yet make use of the pruning result
The compressed array is semantically equivalent to the original, so it has the same stats; any that have been computed should be populated in the compressed array's StatsSet. Relatedly, some...
Currently it only happens when asking for children. The annoying part of this change is that the error message will not be as helpful
Changing one small thing in the file will not necessarily alter all the contents of the file
Right now, if StructArray has top-level validity that's not `AllValid` or `NonNullable`, the validity is discarded
Supplants #476. VarBinView is the new canonical representation for string types across the repo. There are still many places that natively use VarBin arrays internally; we can replace those over...
Useful for the compressor to decide whether Dict compression is worthwhile. There's a Rust crate already implementing it: https://docs.rs/hyperloglogplus/latest/hyperloglogplus/struct.HyperLogLogPlus.html Can be used: - At compress time: determine if Dict is worth...
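To illustrate how a distinct-count estimate could feed the Dict decision, here is a minimal sketch of a HyperLogLog-style estimator written against the standard library only. It is not the `hyperloglogplus` crate's API and not Vortex code: the `Hll` type, the `dict_looks_worthwhile` helper, and the `max_ratio` threshold are all hypothetical names chosen for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Minimal HyperLogLog sketch (hypothetical type, for illustration only).
pub struct Hll {
    p: u32,             // number of index bits; there are 2^p registers
    registers: Vec<u8>, // each register keeps the largest rank seen
}

impl Hll {
    pub fn new(p: u32) -> Self {
        // The bias constant used in `estimate` assumes at least 128 registers.
        assert!((7..=16).contains(&p));
        Hll { p, registers: vec![0; 1usize << p] }
    }

    pub fn insert<T: Hash>(&mut self, value: &T) {
        let mut hasher = DefaultHasher::new();
        value.hash(&mut hasher);
        let h = hasher.finish();
        // The top p bits select a register.
        let idx = (h >> (64 - self.p)) as usize;
        // The remaining bits give the rank: position of the first 1 bit.
        let rest = h << self.p;
        let max_rank = 64 - self.p + 1;
        let rank = (rest.leading_zeros() + 1).min(max_rank) as u8;
        if rank > self.registers[idx] {
            self.registers[idx] = rank;
        }
    }

    pub fn estimate(&self) -> f64 {
        let m = self.registers.len() as f64;
        let alpha = 0.7213 / (1.0 + 1.079 / m);
        let sum: f64 = self.registers.iter().map(|&r| 2f64.powi(-(r as i32))).sum();
        let raw = alpha * m * m / sum;
        let zeros = self.registers.iter().filter(|&&r| r == 0).count() as f64;
        // Small-range correction: fall back to linear counting.
        if raw <= 2.5 * m && zeros > 0.0 {
            m * (m / zeros).ln()
        } else {
            raw
        }
    }
}

/// Hypothetical heuristic: Dict encoding looks worthwhile when the estimated
/// fraction of distinct values is at most `max_ratio` (an assumed threshold).
pub fn dict_looks_worthwhile(estimated_distinct: f64, len: usize, max_ratio: f64) -> bool {
    len > 0 && estimated_distinct / (len as f64) <= max_ratio
}
```

A caller would insert each value once while scanning the array, then pass `hll.estimate()` and the array length to the heuristic. The sketch uses a fixed amount of memory (one byte per register) regardless of array length, which is the property that makes this kind of estimator attractive at compress time.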
The Vortex reader will collect all read requests from layouts and dispatch them together https://github.com/spiraldb/vortex/blob/develop/vortex-serde/src/layouts/read/stream.rs#L192. However, this is extremely naive and doesn't leverage additional knowledge we have about the file format to...
Right now we have a top-level readme which we propagate to all crates; however, we can likely have smaller, more targeted readmes in the individual packages
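The excerpt above is cut off before it says which optimization is intended. One common improvement over dispatching each collected request separately is to coalesce byte ranges that overlap or sit close together, so fewer and larger reads are issued. The following is a minimal sketch of that general technique, not the Vortex reader's actual logic; `coalesce_ranges` and `max_gap` are hypothetical names.

```rust
use std::ops::Range;

/// Merge byte ranges that overlap or are separated by at most `max_gap` bytes.
/// Hypothetical helper for illustration; not part of the Vortex codebase.
pub fn coalesce_ranges(mut ranges: Vec<Range<u64>>, max_gap: u64) -> Vec<Range<u64>> {
    // Drop empty ranges and process the rest in order of start offset.
    ranges.retain(|r| r.start < r.end);
    ranges.sort_by_key(|r| r.start);

    let mut merged: Vec<Range<u64>> = Vec::with_capacity(ranges.len());
    for r in ranges {
        if let Some(last) = merged.last_mut() {
            // Close enough to the previous range: extend it instead of
            // starting a new read.
            if r.start <= last.end.saturating_add(max_gap) {
                last.end = last.end.max(r.end);
                continue;
            }
        }
        merged.push(r);
    }
    merged
}
```

The trade-off is controlled by `max_gap`: a larger value means fewer requests but more unneeded bytes read between the ranges that were actually asked for.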