Gert Hulselmans comments

Results: 446 comments of Gert Hulselmans

Additional lints for the Python code base

Fixes some: https://github.com/pola-rs/polars/pull/4486

Add `coerce_float` option to pl.from_arrow

If you need it, you can cast to float64 on the pyarrow table with pyarrow before converting it to a polars DataFrame. It might get supported natively in the future....

Are there plans to support delta reader/writer?

I think that is very unlikely. As far as I can see, Delta Lake does not use the Arrow format but requires Spark. You can use to read...

perf(python,rust): Use faster compression backend for parquet

Keep in mind that there is also some kind of regression in zlib-ng related to gzipped CSV files (or at least in the Rust binding): https://github.com/rust-lang/libz-sys/issues/104

Add streaming converters

Batched CSV reader was implemented in: https://github.com/pola-rs/polars/pull/5212

Search Engine Optimization for docs

https://blog.readthedocs.com/seo-for-technical-docs/ > Using a robots.txt and sitemap.xml file can help you control how search engines crawl your docs. For example, you could tell search engines to ignore unsupported versions of...
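To illustrate the robots.txt idea from the quote, here is a rough sketch using Python's standard-library `urllib.robotparser`. The domain `docs.example.org`, the version paths, and the sitemap location are all hypothetical. It only shows how a crawler that honours robots.txt would interpret such rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: hide an old docs version, point to the sitemap.
robots_txt = """\
User-agent: *
Disallow: /en/0.1/
Allow: /en/latest/
Sitemap: https://docs.example.org/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# A well-behaved crawler would skip the old version and index the latest one.
old_allowed = parser.can_fetch("*", "https://docs.example.org/en/0.1/index.html")
latest_allowed = parser.can_fetch("*", "https://docs.example.org/en/latest/index.html")

# Sitemap URLs declared in the file (available in Python 3.8+).
sitemaps = parser.site_maps()
```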

Polars panics when Azure Storage URI is used

You also have this crate, but I am not sure it supports everything needed: https://docs.rs/object_store/latest/object_store/

Python: deprecate most indexing operations.

I think other cases can indeed be removed. Slicing operations were greatly improved in https://github.com/pola-rs/polars/pull/3904

CSV: build categoricals directly

Here you have an example file in the same format: ``` curl https://temp.aertslab.org/.tsv/atac_fragments.head40000000.tsv.gz | zcat | tail -n +52 > fragments.head40000000.tsv ``` The test above was with local string cache....

CSV: build categoricals directly

It looks like it might indeed create the categorical upon reading. Creating a dictionary with pyarrow from the pl.Utf8 columns takes close to 9 seconds. ``` In [92]: %time chrom_pa_dict...