Nicholas Gates
Pull what we want to expose into vortex::* or a prelude
Rather than row-groups as we do now
We should have:
• LocalTime ([time unit] after midnight) - Arrow time32 or time64
• LocalDate (julian day) - Arrow date32
• LocalDateTime (julian day and [time unit] after midnight)...
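One way the proposed temporal types could look is sketched below. Every name here (`TimeUnit`, `LocalTime`, `LocalDate`, `LocalDateTime`, `to_arrow_date32`, `arrow_time_bits`) is hypothetical and is not taken from the Vortex codebase. The sketch assumes that "julian day" means the Julian Day Number, and relies on two Arrow facts: date32 counts days since the UNIX epoch, and time32 covers second/millisecond units while time64 covers microsecond/nanosecond units.

```rust
/// Hypothetical time resolution shared by the local time types.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum TimeUnit {
    Second,
    Millisecond,
    Microsecond,
    Nanosecond,
}

impl TimeUnit {
    /// Number of ticks of this unit in one second.
    pub fn per_second(self) -> i64 {
        match self {
            TimeUnit::Second => 1,
            TimeUnit::Millisecond => 1_000,
            TimeUnit::Microsecond => 1_000_000,
            TimeUnit::Nanosecond => 1_000_000_000,
        }
    }

    /// Width of the matching Arrow time type: time32 for s/ms, time64 for us/ns.
    pub fn arrow_time_bits(self) -> u8 {
        match self {
            TimeUnit::Second | TimeUnit::Millisecond => 32,
            TimeUnit::Microsecond | TimeUnit::Nanosecond => 64,
        }
    }
}

/// Ticks of `unit` elapsed since local midnight (hypothetical layout).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct LocalTime {
    pub unit: TimeUnit,
    pub since_midnight: i64,
}

impl LocalTime {
    /// A local time must fall inside a single 24-hour day.
    pub fn is_valid(self) -> bool {
        (0..86_400 * self.unit.per_second()).contains(&self.since_midnight)
    }
}

/// Calendar date stored as a Julian Day Number (assumption, see lead-in).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct LocalDate {
    pub julian_day: i32,
}

/// Julian Day Number of 1970-01-01, the UNIX epoch.
const UNIX_EPOCH_JULIAN_DAY: i32 = 2_440_588;

impl LocalDate {
    /// Convert to Arrow date32, which counts days since the UNIX epoch.
    pub fn to_arrow_date32(self) -> i32 {
        self.julian_day - UNIX_EPOCH_JULIAN_DAY
    }
}

/// Date plus time of day, with no time zone attached.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct LocalDateTime {
    pub date: LocalDate,
    pub time: LocalTime,
}
```

Keeping the date and the time-of-day as separate fields means `LocalDateTime` reuses the validation of its parts instead of defining a third epoch.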
Should be computed separately from existing stats to avoid expensive overheads when not performing compression.
It's misleading to have `new()` call `try_new().unwrap()` since it obscures the lack of safety. I'd propose we have a `new_unchecked` and `new() -> Result` where unchecked can be used as an...
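A minimal sketch of the proposed constructor split follows. `OffsetsArray`, `ArrayError`, and the sortedness invariant are invented for illustration and do not correspond to any real Vortex type. The point is only the shape: `new` validates and returns a `Result`, while `new_unchecked` skips validation and says so in its name.

```rust
/// Hypothetical array whose invariant is "non-empty and non-decreasing".
#[derive(Debug, PartialEq)]
pub struct OffsetsArray {
    offsets: Vec<u32>,
}

/// Hypothetical validation errors.
#[derive(Debug, PartialEq)]
pub enum ArrayError {
    Empty,
    NotSorted { index: usize },
}

impl OffsetsArray {
    /// Check the invariants without constructing anything.
    fn validate(offsets: &[u32]) -> Result<(), ArrayError> {
        if offsets.is_empty() {
            return Err(ArrayError::Empty);
        }
        // Report the index of the first element that is smaller than its predecessor.
        if let Some(pos) = offsets.windows(2).position(|w| w[0] > w[1]) {
            return Err(ArrayError::NotSorted { index: pos + 1 });
        }
        Ok(())
    }

    /// Checked constructor: the caller sees and handles validation failures.
    pub fn new(offsets: Vec<u32>) -> Result<Self, ArrayError> {
        Self::validate(&offsets)?;
        Ok(Self { offsets })
    }

    /// Unchecked constructor: the caller promises the invariants hold.
    /// Validation still runs in debug builds to catch misuse early.
    pub fn new_unchecked(offsets: Vec<u32>) -> Self {
        debug_assert!(Self::validate(&offsets).is_ok());
        Self { offsets }
    }

    pub fn offsets(&self) -> &[u32] {
        &self.offsets
    }
}
```

With this split, a call site that writes `new(...).unwrap()` makes the possible panic visible where it happens, rather than hiding it inside `new()`. Whether `new_unchecked` should also be an `unsafe fn` depends on whether a violated invariant could lead to undefined behavior; the sketch leaves it safe.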
The stats of a compressed array should equal the stats of the uncompressed array. Further, the stats from the compressed array can be used to populate the stats of compressed...
Since dtype is logical type, we should distinguish between uint8 and a byte (with underlying u8 ptype). This will allow us to perform different compression strategies. e.g. not much point...
As a chunked array loops over its chunks, the compressor could pick the most recently used compression scheme provided the resulting compression ratio remains within some bound. This would short-cut...
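The reuse heuristic described above can be sketched as follows. This is a toy model under stated assumptions, not the Vortex compressor: the `Scheme` enum, the byte-size estimates in `encoded_size`, and the `tolerance` parameter (the fraction of the last fully searched ratio that a reused scheme must still reach) are all hypothetical.

```rust
/// Hypothetical compression schemes for u32 chunks.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Scheme {
    Raw,
    RunLength,
}

const ALL_SCHEMES: [Scheme; 2] = [Scheme::Raw, Scheme::RunLength];

/// Toy size estimate in bytes: 4 per raw value, 8 per (value, length) run.
fn encoded_size(scheme: Scheme, chunk: &[u32]) -> usize {
    match scheme {
        Scheme::Raw => chunk.len() * 4,
        Scheme::RunLength => {
            let boundaries = chunk.windows(2).filter(|w| w[0] != w[1]).count();
            let runs = if chunk.is_empty() { 0 } else { boundaries + 1 };
            runs * 8
        }
    }
}

/// Compression ratio: uncompressed bytes divided by encoded bytes.
fn ratio(scheme: Scheme, chunk: &[u32]) -> f64 {
    let encoded = encoded_size(scheme, chunk);
    if encoded == 0 {
        return 1.0; // empty chunk: nothing to gain or lose
    }
    (chunk.len() * 4) as f64 / encoded as f64
}

/// Expensive path: evaluate every scheme and keep the best ratio.
fn full_search(chunk: &[u32]) -> (Scheme, f64) {
    let mut best = (ALL_SCHEMES[0], ratio(ALL_SCHEMES[0], chunk));
    for &scheme in &ALL_SCHEMES[1..] {
        let r = ratio(scheme, chunk);
        if r > best.1 {
            best = (scheme, r);
        }
    }
    best
}

/// For each chunk, return the chosen scheme and whether it was reused
/// from the previous chunk without running the full search.
pub fn choose_schemes(chunks: &[Vec<u32>], tolerance: f64) -> Vec<(Scheme, bool)> {
    // Scheme and ratio from the most recent full search.
    let mut last: Option<(Scheme, f64)> = None;
    let mut picks = Vec::with_capacity(chunks.len());
    for chunk in chunks {
        if let Some((scheme, baseline)) = last {
            // Short-cut: keep the previous scheme while it stays within bound.
            if ratio(scheme, chunk) >= baseline * tolerance {
                picks.push((scheme, true));
                continue;
            }
        }
        let (scheme, achieved) = full_search(chunk);
        last = Some((scheme, achieved));
        picks.push((scheme, false));
    }
    picks
}
```

The trade-off is that a reused scheme may be worse than what a full search would have found for that chunk; the tolerance bounds how far the ratio can fall before the search is repeated, not how far it is from the optimum.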
### Describe the bug, including details regarding any error messages, version, and platform.

PyArrow 16.1.0 on MacOS

```python
import pyarrow as pa
import pyarrow.compute as pc

pc.indices_nonzero(pc.is_valid(pa.chunked_array([pa.array([0])])[0:0]))
```

### Component(s)...