[FEAT] User-defined global expressions

Open kevinzwang opened this issue 11 months ago • 0 comments

User-defined global expressions, like typical UDFs, are Python functions that users can use as expressions. The difference is that a global expression produces a single value (cardinality 1) rather than one value per input row.

There should be two user-defined global expression (UDGF?) types:

  • [ ] One function that takes in entire columns. This requires coalescing all data into a single partition.
  • [ ] Two functions: one that takes an accumulator plus a row or partition and returns a new accumulator, and another that converts the final accumulator into the output value. This can efficiently express reduce operations that are commutative and associative.
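
To make the two proposed shapes concrete, here is a minimal sketch in plain Python. It does not use any Daft API: partitions are modeled as a list of lists, and every name (`run_whole_column`, `run_accumulator`, `step`, `finalize`, the explicit `initial` accumulator) is hypothetical and chosen only for illustration. Both variants compute the mean of a column.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-in for a partitioned column: a list of partitions.
partitions: List[List[int]] = [[1, 2, 3], [4, 5], [6]]


# Shape 1: a single function that receives the entire column.
def whole_column_mean(column: List[int]) -> float:
    return sum(column) / len(column)


def run_whole_column(fn: Callable[[List[int]], float],
                     parts: List[List[int]]) -> float:
    # All data must first be coalesced into one partition.
    coalesced = [value for part in parts for value in part]
    return fn(coalesced)


# Shape 2: an accumulator step plus a finalizer.
Acc = Tuple[int, int]  # (running total, running count)


def step(acc: Acc, value: int) -> Acc:
    total, count = acc
    return (total + value, count + 1)


def finalize(acc: Acc) -> float:
    total, count = acc
    return total / count


def run_accumulator(step_fn: Callable[[Acc, int], Acc],
                    finalize_fn: Callable[[Acc], float],
                    initial: Acc,
                    parts: List[List[int]]) -> float:
    # Rows are folded in one at a time, so no coalescing is needed.
    # The initial accumulator is an assumption; the proposal does not
    # say how it would be supplied.
    acc = initial
    for part in parts:
        for value in part:
            acc = step_fn(acc, value)
    return finalize_fn(acc)


mean_whole = run_whole_column(whole_column_mean, partitions)
mean_acc = run_accumulator(step, finalize, (0, 0), partitions)
```

Both calls return the same mean. The accumulator form only ever holds the small `(total, count)` state rather than the full column, which is why it suits commutative and associative reductions.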

kevinzwang, Mar 06 '24 08:03