arrow2
Transmute-free Rust library to work with the Arrow format
version: `{ git = "https://github.com/jorgecarleitao/arrow2", rev = "8604cb760b8ac475d7968b714d47e4ff714c61a1", default-features = false }`

rust version:

```sh
> rustup show
active toolchain
----------------
nightly-x86_64-unknown-linux-gnu (directory override for '/')
rustc 1.64.0-nightly (0f4bcadb4 2022-07-30)...
```
Closed #1206. Thanks a lot @sigurdteigen for the report!
Hi. Parsing parquet files with a large number of row groups and using the row `groups_filter` causes a stack overflow for me. The reason seems to be the recursive call...
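A minimal sketch of the general failure mode the report above describes, not arrow2's actual reader code: if an iterator skips a filtered-out row group by calling its own `next` again, every consecutive skipped group adds a stack frame, because Rust does not guarantee tail-call elimination. Rewriting the skip as a loop keeps the stack depth constant. The names `RowGroupIter`, `keep`, and `selected_row_groups` are hypothetical, and the filter signature is an assumption made for illustration.

```rust
/// Hypothetical stand-in for a reader that walks row groups by index.
/// This is NOT arrow2's type; it only models the skipping logic.
struct RowGroupIter<F: Fn(usize) -> bool> {
    next_index: usize,
    num_row_groups: usize,
    /// Assumed filter shape: returns true when the row group should be read.
    keep: F,
}

impl<F: Fn(usize) -> bool> Iterator for RowGroupIter<F> {
    type Item = usize;

    fn next(&mut self) -> Option<usize> {
        // Loop over skipped groups instead of calling `self.next()` again,
        // so a long run of filtered-out groups does not grow the stack.
        while self.next_index < self.num_row_groups {
            let index = self.next_index;
            self.next_index += 1;
            if (self.keep)(index) {
                return Some(index);
            }
        }
        None
    }
}

/// Collects the indices of the row groups that pass the filter.
fn selected_row_groups(num_row_groups: usize, keep: impl Fn(usize) -> bool) -> Vec<usize> {
    RowGroupIter {
        next_index: 0,
        num_row_groups,
        keep,
    }
    .collect()
}
```

Calling `selected_row_groups` with a very large group count and a filter that rejects almost everything exercises the case that would be risky with a recursive skip.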
Arrow version: latest. Rust version: `rustc 1.64.0-nightly (0f4bcadb4 2022-07-30)`. The `ndjson::infer` method takes a reader and `num_rows`, as seen [here](https://github.dev/jorgecarleitao/arrow2/blob/061d6ba7f478c8678c2aa803f61dc6fd987579aa/src/io/ndjson/read/file.rs#L102), whereas `json::infer` only takes a `json_deserializer::Value`. It would be...
This PR removes internal methods from the public API and exposes them through more intuitive APIs. This reduces the mental load that the IPC API's public surface places on users, thereby...
Unfortunately I could not do it without adding an extra dependency (small, with no indirect dependencies), hence the draft. It should be possible, as we basically just need implementations for...
This PR is a follow-up on https://github.com/jorgecarleitao/parquet2/pull/160, bringing its design changes here. The main idea of this PR is that we no longer use the in-memory format for dictionary pages...
As we are getting more and more datatypes 🎉, the probability of not needing all of them increases. Because of our `dyn Array` type we have to compile all generic code...
Hi there. I have crafted a [polars](https://github.com/pola-rs/polars) dataframe, and saved it to disk in parquet/feather format. But this file cannot be read back into memory using `arrow2`. It can be...