mars
mars copied to clipboard
Using aggregation instead of transform to perform `df.groupby().nunique()`
Now, df.groupby().nunique()
would be delegated to transform
to perform execution, it will be a shuffle operation which is very time consuming, we can delegate it to aggregation
which is way more optimized.