Daft
[FEAT] User-defined global expressions
User-defined global expressions are, like typical UDFs, Python functions that users can use as expressions. The difference is that a global expression produces a value with cardinality 1: a single result for the whole input rather than one value per row.
There should be two user-defined global expression (UDGF?) types:
- [ ] One function that takes in entire columns. This requires coalescing all data into a single partition.
- [ ] Two functions: one that takes an accumulator plus a row or partition and returns a new accumulator, and another that converts the final accumulator into the result. This can efficiently express reduce operations that are commutative and associative.
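
To make the two proposed shapes concrete, here is a rough sketch in plain Python. It does not use any Daft API: `run_whole_column`, `run_accumulating`, and the representation of partitions as lists of lists are all hypothetical, chosen only to illustrate the difference between the two variants using a mean as the example.

```python
from typing import Callable, List, Tuple, TypeVar

A = TypeVar("A")  # accumulator type
R = TypeVar("R")  # result type

# Hypothetical stand-in for a partitioned column: one list per partition.
Partitions = List[List[float]]


def run_whole_column(fn: Callable[[List[float]], R], partitions: Partitions) -> R:
    """Variant 1: coalesce every partition into one column, then call fn once."""
    column = [value for part in partitions for value in part]
    return fn(column)


def run_accumulating(
    step: Callable[[A, List[float]], A],
    finalize: Callable[[A], R],
    initial: A,
    partitions: Partitions,
) -> R:
    """Variant 2: fold partitions into an accumulator, then finalize it."""
    acc = initial
    for part in partitions:
        acc = step(acc, part)
    return finalize(acc)


# Example: a mean written both ways.
def mean_whole(column: List[float]) -> float:
    return sum(column) / len(column)


def mean_step(acc: Tuple[float, int], part: List[float]) -> Tuple[float, int]:
    total, count = acc
    return (total + sum(part), count + len(part))


def mean_finalize(acc: Tuple[float, int]) -> float:
    total, count = acc
    return total / count


partitions = [[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]]
whole_mean = run_whole_column(mean_whole, partitions)
acc_mean = run_accumulating(mean_step, mean_finalize, (0.0, 0), partitions)
# Partition order is arbitrary here because the reduction is commutative.
acc_mean_reversed = run_accumulating(
    mean_step, mean_finalize, (0.0, 0), list(reversed(partitions))
)
```

In this sketch the accumulating variant never needs the whole column in one place, which is why it would avoid the coalescing step that the first variant requires.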