Gert Hulselmans

446 comments by Gert Hulselmans

In the meantime, you can use `parquet-fromcsv` to convert compressed CSV/TSV files to Parquet and use `pl.scan_parquet` on them: https://github.com/pola-rs/polars/issues/9283#issuecomment-1594512827
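For example (a minimal sketch — the file names and flags are illustrative; check `parquet-fromcsv --help` for the options your version supports):

```python
import polars as pl

# Convert the compressed CSV/TSV file to Parquet once, outside Python, e.g.:
#   parquet-fromcsv --schema data.schema --input-file data.csv --output-file data.parquet
# (`parquet-fromcsv` ships with the arrow-rs parquet crate.)

# Then scan the Parquet file lazily, so predicate/projection pushdown still applies:
lf = pl.scan_parquet("data.parquet")
df = lf.head(10).collect()
```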


You could try the `cassarrow` package to get PyArrow Tables from Cassandra: https://pypi.org/project/cassarrow/ It looks like it supports dates.
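A minimal sketch of what that could look like, based on the `cassarrow` README (the contact point, keyspace, and table are illustrative, and the exact API may differ per version):

```python
import cassarrow
import polars as pl
from cassandra.cluster import Cluster

cluster = Cluster(["127.0.0.1"])          # illustrative contact point
session = cluster.connect("my_keyspace")  # illustrative keyspace

# Patch the session so result sets are deserialized straight into Arrow.
with cassarrow.install_cassarrow(session) as cassarrow_session:
    result_set = cassarrow_session.execute("SELECT * FROM my_table")  # illustrative table
    arrow_table = cassarrow.result_set_to_table(result_set)

# Convert the PyArrow Table to a polars DataFrame.
df = pl.from_arrow(arrow_table)
```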

You can catch panic exceptions specifically, if you really need that option. The original snippet was truncated; a minimal sketch (the panicking operation is a placeholder, and on older polars versions the exception class is `pyo3_runtime.PanicException` rather than `polars.exceptions.PanicException`):

```python
import polars as pl

try:
    # <truncated in the original: an operation on pl.DataFrame({"col1": ...})
    # that panics in the underlying Rust code>
    ...
except pl.exceptions.PanicException as exc:
    print(f"Caught Rust panic: {exc}")
```

It does if you provide your fill value with the same type (float instead of int); otherwise the series gets cast to the supertype `float64`.

```python
# Filling with a float literal keeps the float dtype; fill_null(0) (an int)
# would cast the series to the supertype float64.
df2 = df.fill_null(value=0.0)
df2.dtypes
```

The following Rust crate could eventually be used to implement it: https://docs.rs/unicode-normalization/latest/unicode_normalization/

It is easy to implement yourself with an `apply` (which is basically what pandas does):

```python
import unicodedata

import polars as pl

a = pl.Series("a", ["\u00C70123456", "\u00C70123456", "\u00C70123456"])

# Normalize each string with Python's unicodedata; the normalization form in
# the original snippet was truncated — "NFKD" here is illustrative.
a.apply(lambda s: unicodedata.normalize("NFKD", s))
```

You can chain multiple expressions:

```python
df.select([
    pl.col("names").unique().count().sum().alias("sum_unique"),
])
```

We probably need a robots.txt file to tell search engines to reindex the pages, e.g. every 3 days. The cached page is from 3 November.

You can also ask for a recrawl. Creating a sitemap might be useful too: https://developers.google.com/search/docs/crawling-indexing/ask-google-to-recrawl
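A minimal sketch of a sitemap entry that hints at the desired recrawl interval (the URL is illustrative, and `changefreq` is only a hint to crawlers, not a guarantee; the sitemap location can also be advertised via a `Sitemap:` line in robots.txt):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://pola-rs.github.io/polars/index.html</loc>
    <changefreq>daily</changefreq>
  </url>
</urlset>
```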