DrMaphuse
Thanks for the thorough response! I totally understand that it doesn't make sense if it adds too much complexity. The suggested workaround is neat, but it relies on schema inference,...
I can't reproduce the problem easily myself now, because we switched to another filesystem for other reasons, and that made the problem go away. I suspect, however, that it had...
> `is_unique` should only have one answer, if a column value is unique

How can I replicate the `subset=` argument of `unique()` using `is_first`, i.e. evaluate uniqueness across multiple columns?...
In my head, it would make sense to add `pl.is_first()`, analogous to `pl.sum()`. An implementation of `is_first` for the `pl.List` dtype could also give us a potential solution.
@ritchie46 Thanks, that is perfect. The original motivation for opening this issue was my feeling that implicitly omitting values based on the values of other columns is not ideal....
There was a join / select with a missing column somewhere in the middle of the script and it appears that this caused a chain of subsequent errors to pile...
@ritchie46 I actually managed to produce a minimal example, see updated OP. The issue does not appear to be related to the parquet file as originally thought.
I can confirm that this limit makes incremental updates unfeasible for larger datasets. I am trying to insert a delta of about 1GB every day, and uploading that in chunks...