Ritchie Vink

Results 992 comments of Ritchie Vink

One addendum on the copy from/to numpy. In the new Polars release `0.20.7`, `F-order` NumPy data can now be moved in and out of a Polars `DataFrame` zero-copy. ```python import...

Sorry for the confusion. See comment: https://github.com/pola-rs/polars/issues/10054#issuecomment-2025127965

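```python
import polars as pl

# Minimal reproduction: an empty list-of-struct column next to an Int32 column
df = pl.DataFrame(
    {"a": [[]], "b": [1]},
    schema={"a": pl.List(pl.Struct([pl.Field("a", pl.Utf8)])), "b": pl.Int32},
)
df.melt()
```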

We need to implement a cast from struct to utf8 in polars-arrow to fix this.

Yes, we got that on the radar. First thing we want to do is add streaming support for `scan_pyarrow_dataset`.

> Currently, we use V2 data pages for Parquet. This comes with [unclear advantages](https://stackoverflow.com/questions/77654784/what-is-the-difference-between-data-page-version-1-0-and-2-0-in-parquet-files) and one significant disadvantage. Unlike V1 data pages, V2 data pages do not compress definition and...

> That makes sense. I'd like to revisit this later, but I think I'll need to learn some more Rust before tackling that problem. Ok, I think there is value...

Thank you for the PR @thalassemia

We had to revert this because of https://github.com/pola-rs/polars/issues/16109. I do think these changes were interesting, so if you or anyone else has time to find the cause, that'd be appreciated.

We could use that, but then you first need to come up with an order-preserving encoding/decoding for decimals. Then we can dispatch to the row-encoding for those types.
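
To make "order-preserving" concrete, here is a minimal sketch in plain Python. It is not the Polars row-encoding. It assumes a decimal is stored as a signed unscaled integer of fixed byte width and that all values in a column share the same scale. The names `encode_decimal` and `decode_decimal` are hypothetical. The idea is to add a bias (equivalent to flipping the sign bit) and write the result big-endian, so that comparing the raw bytes gives the same order as comparing the numbers.

```python
def encode_decimal(unscaled: int, width: int = 16) -> bytes:
    """Encode a signed unscaled decimal value so that bytes sort like numbers.

    Assumption: every value uses the same scale and fits in `width` bytes.
    """
    bias = 1 << (width * 8 - 1)  # same effect as flipping the sign bit
    if not -bias <= unscaled < bias:
        raise OverflowError("value does not fit in the chosen width")
    # Big-endian unsigned bytes compare lexicographically in numeric order.
    return (unscaled + bias).to_bytes(width, "big")


def decode_decimal(encoded: bytes) -> int:
    """Invert encode_decimal: remove the bias to recover the signed value."""
    bias = 1 << (len(encoded) * 8 - 1)
    return int.from_bytes(encoded, "big") - bias


# Unscaled values for -123.45, -0.01, 0.00, 0.01 and 999.99 at scale 2.
values = [99999, -1, 0, -12345, 1]
by_bytes = sorted(values, key=encode_decimal)
round_trip = [decode_decimal(encode_decimal(v)) for v in values]
```

Sorting by the encoded bytes should match sorting by value, and decoding should return the original integers. A real implementation would also need to handle nulls and descending order, which this sketch leaves out.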