Gert Hulselmans
Fixes some: https://github.com/pola-rs/polars/pull/4486
If you need it, you can cast to float64 on the pyarrow table with pyarrow before converting it to a polars DataFrame. It might get supported natively in the future....
I think it is very unlikely. As far as I can see, Delta Lake does not use the Arrow format, but requires Spark. You can use to read...
Keep in mind that there is also some kind of regression in zlib-ng related to gzipped CSV files (or at least in the Rust bindings). https://github.com/rust-lang/libz-sys/issues/104
Batched CSV reader was implemented in: https://github.com/pola-rs/polars/pull/5212
https://blog.readthedocs.com/seo-for-technical-docs/ > Using a robots.txt and sitemap.xml file can help you control how search engines crawl your docs. For example, you could tell search engines to ignore unsupported versions of...
You also have this crate, but I am not sure it supports everything needed: https://docs.rs/object_store/latest/object_store/
I think other cases can indeed be removed. Slicing operations were greatly improved in https://github.com/pola-rs/polars/pull/3904
Here you have an example file in the same format:
```
curl https://temp.aertslab.org/.tsv/atac_fragments.head40000000.tsv.gz | zcat | tail -n +52 > fragments.head40000000.tsv
```
The test above was with local string cache....
It looks like it might indeed create the categorical upon reading. Creating a dictionary with pyarrow from the pl.Utf8 columns takes close to 9 seconds. ``` In [92]: %time chrom_pa_dict...