r-polars icon indicating copy to clipboard operation
r-polars copied to clipboard

Remove `GroupBy` (and others) in favor of `LazyGroupBy`?

Open etiennebacher opened this issue 1 year ago • 2 comments

While reading the Rust source code, I noticed that aggregate calculation with GroupBy has been deprecated for years (pola-rs/polars#4946). Perhaps at some point, given the current complexity, we may as well remove GroupBy and use only LazyGroupBy. https://docs.rs/polars/0.36.2/polars/prelude/struct.GroupBy.html

Originally posted by @eitsupi in https://github.com/pola-rs/r-polars/issues/694#issuecomment-1890540599

etiennebacher avatar Jan 13 '24 16:01 etiennebacher

My point was that it may be better to remove the eager group by in favor of lazy API, as the additional classes make it quite confusing for beginners. py-polars has three classes to reproduce the group by in the eager API It has three classes to reproduce group by in the eager API, but two (rolling and dynamic) are not shown in the documentation.......

eitsupi avatar Jan 13 '24 16:01 eitsupi

I think we're documenting this the right way and that the python-polars API docs should be updated. In their DataFrame reference page, they should add the categories DynamicGroupBy and RollingGroupBy, just like they have a category GroupBy.

Those are not identical: one can apply .first() on a GroupBy object, but not on a RollingGroupBy object for instance.

In any case, I'd prefer this to be clarified in the python API before spending more time on this here.

etiennebacher avatar Jan 13 '24 16:01 etiennebacher

This does not seem to be a problem since this is the same implementation approach as Python Polars.

eitsupi avatar Jun 14 '24 14:06 eitsupi