Orson Peters

Results 265 comments of Orson Peters

@daviskirk Categoricals are currently the only thing that is simply broken on the new streaming engine (to my knowledge). Almost all unsupported things automatically fall back to the eager engine...

@gdementen Note that the new streaming engine as it is today can already run a lot of queries on datasets (much) larger than memory. The queries that aren't out of...

@vultix Maybe eventually, but not anytime soon.

@velochy They should be fixed right now - internally we added a workaround that forces all categoricals to go through the global key map. If you find any bugs please...

@velochy For the time being you have to use `collect(new_streaming=True)`, not `streaming=True`. Also, `explain` is not yet implemented for `new_streaming`; currently you have to set the environment variable `POLARS_VISUALIZE_PHYSICAL_PLAN="somefile.dot"` and...

@velochy These questions don't belong in this tracking issue, sorry. I'd suggest asking in the Discord server to see if anyone could look with you.

> Understood. Should I delete them to clean the thread?

No need, I marked them as off-topic.

> Will `unpivot` be supported as a streaming op?

`unpivot` is planned, yes,...

This proposal has a **huge** problem which `nan_counts` does not have: NaNs in the dataset can poison the statistics. If the dataset contains both a signed positive NaN and a...

There is another issue with this proposal in my opinion: it adds semantics to the sign bit of `NaN`s. This is incredibly dangerous: not all data systems (e.g. Polars, but...
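For readers unfamiliar with the background of the two NaN comments above, here is a minimal Python sketch of the general floating-point facts they rely on. It does not reproduce the proposal itself, which is not shown in these excerpts. The helper `f64_bits` and the variable names are made up for this illustration. It shows that a NaN carries a sign bit that has no numeric meaning, and that a naive min computed with ordinary comparisons depends on where the NaN appears in the data.

```python
import math
import struct


def f64_bits(x: float) -> int:
    """Return the raw 64-bit IEEE 754 pattern of a Python float (hypothetical helper)."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


# Build two quiet NaNs that differ only in the sign bit.
pos_nan = struct.unpack("<d", struct.pack("<Q", 0x7FF8000000000000))[0]
neg_nan = struct.unpack("<d", struct.pack("<Q", 0xFFF8000000000000))[0]

# Both values are NaN; only the top bit of the encoding differs.
print(math.isnan(pos_nan), math.isnan(neg_nan))
print(f64_bits(pos_nan) >> 63, f64_bits(neg_nan) >> 63)

# Every ordered comparison with NaN is False, so a naive running
# minimum keeps whatever value it happened to see first.
nan_first = min([pos_nan, 1.0])   # stays NaN: `1.0 < nan` is False
nan_last = min([1.0, pos_nan])    # stays 1.0: `nan < 1.0` is False
print(nan_first, nan_last)
```

The same data in a different order yields a different "minimum", which is one way NaNs can distort min/max statistics. Any scheme that instead reads meaning into the sign bit relies on every producer preserving that bit, which is the concern raised in the comment above.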

> Are we willing to special case float to make filtering in the presence of NaNs more efficient, or do we go with a more streamlined implementation without special fields...