Support full-featured groupby function
Currently `df.groupby()` in Mars only implements `as_index`. `sort` has been added as a parameter but is not implemented yet, and `level`, which is also useful, is not implemented at all. What's more, when a GroupBy object is generated, the keys and grouped data frames are serialized instead of the GroupBy object itself, which may not be a universal solution for the different groupby scenarios.
`sort_index` (#1037) may be useful when implementing this.
Subtasks:
- [x] Fix serialization issue of GroupBy objects (#1136)
- [x] Support groupby-getitem (#1136)
- [x] Support level option (#1136)
- [x] Support `group_keys` (#1136)
- [ ] Support `observed`
- [x] Support `sort` (#2959)
- [x] Support args of cumulative functions in groupby (and integrate `df.sum(level=x, **kwargs)`)
- [x] Support grouping by series (#1181)
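
For reference, here is a rough sketch of the pandas behaviour that several of these options presumably need to match. It uses plain pandas rather than Mars, the data is made up for illustration, and it does not cover `group_keys` or the arguments of cumulative functions.

```python
import pandas as pd

# Made-up data for illustration only.
df = pd.DataFrame(
    {"key": ["b", "a", "b", "a"], "idx": ["x", "y", "x", "y"], "val": [1, 2, 3, 4]}
)

# as_index=False keeps the group key as a regular column instead of the index.
flat = df[["key", "val"]].groupby("key", as_index=False).sum()

# sort=False keeps groups in order of first appearance instead of sorting keys.
unsorted = df.groupby("key", sort=False)["val"].sum()

# level groups by an index level (selected by name here) instead of a column.
by_level = df.set_index("idx").groupby(level="idx")["val"].sum()

# Grouping by a separate Series that shares the frame's index.
labels = pd.Series(["p", "p", "p", "q"], index=df.index)
by_series = df["val"].groupby(labels).sum()

# observed only matters for categorical keys: observed=True drops categories
# that never occur in the data, observed=False keeps them as empty groups.
df_cat = df.assign(cat=pd.Categorical(df["key"], categories=["a", "b", "c"]))
observed_only = df_cat.groupby("cat", observed=True)["val"].sum()
all_categories = df_cat.groupby("cat", observed=False)["val"].sum()

print(flat)
print(unsorted)
print(by_level)
print(by_series)
print(observed_only)
print(all_categories)
```

With `sort=False` the result is ordered `b`, `a` (order of first appearance), and with `observed=False` the unused category `c` shows up as an extra, empty group.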