mars
mars copied to clipboard
Using aggregation instead of transform to perform `df.groupby().nunique()`
Now, df.groupby().nunique() would be delegated to transform to perform execution, it will be a shuffle operation which is very time consuming, we can delegate it to aggregation which is way more optimized.