Support full featured groupby function

Open wjsi opened this issue 5 years ago • 0 comments

Currently, df.groupby() in Mars only implements as_index. The sort argument has been added to the signature but is not implemented yet, and level, which is also useful, is not implemented at all. What's more, when a GroupBy object is generated, the keys and grouped data frames are serialized instead of the GroupBy object itself, which may not be a universal solution for the different groupby scenarios.

sort_index (#1037) may be useful when implementing this.

Subtasks:

  • [x] Fix serialization issue of GroupBy objects (#1136)
  • [x] Support groupby-getitem (#1136)
  • [x] Support level option (#1136)
  • [x] Support group_keys (#1136)
  • [ ] Support observed
  • [x] Support sort (#2959)
  • [x] Support args of cumulative functions in groupby (and integrate df.sum(level=x, **kwargs))
  • [x] Support grouping by series (#1181)
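
For reference, here is a minimal sketch of what the options listed above do in plain pandas, which is where these argument names come from. It does not use Mars itself; whether Mars matches each behaviour exactly is an assumption, and the sample frame and variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame({"key": ["b", "a", "b", "a"], "val": [1, 2, 3, 4]})

# as_index=False keeps the grouping column as a regular column.
flat = df.groupby("key", as_index=False)["val"].sum()

# sort=False keeps groups in order of first appearance instead of sorting keys.
unsorted_sum = df.groupby("key", sort=False)["val"].sum()

# level groups by an index level rather than by a column.
by_level = df.set_index("key").groupby(level=0)["val"].sum()

# Grouping by a Series that is aligned with the frame's index.
labels = pd.Series(["x", "y", "x", "y"], index=df.index)
by_series = df.groupby(labels)["val"].sum()

# Cumulative functions return one value per row, computed within each group.
running = df.groupby("key")["val"].cumsum()

# observed only matters for categorical keys: observed=True drops
# categories that never occur in the data ("c" here).
df_cat = df.assign(cat=pd.Categorical(df["key"], categories=["a", "b", "c"]))
seen_only = df_cat.groupby("cat", observed=True)["val"].sum()
all_cats = df_cat.groupby("cat", observed=False)["val"].sum()

print(flat)
print(unsorted_sum)
print(seen_only)
print(all_cats)
```

group_keys is left out of the sketch because its effect on apply() differs between pandas versions.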

wjsi, Mar 05 '20 13:03