feat: Simplifying UDF definition

Open Prakhar314 opened this issue 1 year ago • 0 comments

Problem

  • Declaring UDFs is currently not simple: it requires a lot of boilerplate code and obtuse input and output data type declarations (example).
  • Snowflake has a simple interface for UDFs (example) that we could use as a model.

Solution

  • Modify load_function_class_from_file to handle regular python functions.
  • If the loaded object is a Python function instead of a class extending AbstractFunction, we now create a UserDefinedFunction, which is an instance of AbstractFunction, to wrap the given Python function.
  • Update several other files to distinguish between a Python class and a Python instance in the output of the modified load_function_class_from_file.
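
The wrapping idea can be sketched roughly as follows. This is not the actual EvaDB code: `AbstractFunctionStandIn`, `UserDefinedFunctionSketch`, `forward`, and `wrap_if_plain_function` are hypothetical names used only for illustration, and the real AbstractFunction interface is likely richer than this stand-in.

```python
import inspect
from abc import ABC, abstractmethod


class AbstractFunctionStandIn(ABC):
    """Hypothetical stand-in for EvaDB's AbstractFunction (assumption)."""

    @abstractmethod
    def forward(self, rows):
        ...


class UserDefinedFunctionSketch(AbstractFunctionStandIn):
    """Hypothetical wrapper that adapts a plain Python function."""

    def __init__(self, func):
        self._func = func
        self.name = func.__name__

    def forward(self, rows):
        # Apply the wrapped function to one row at a time.
        return [self._func(row) for row in rows]


def wrap_if_plain_function(obj):
    """Return a function class unchanged; wrap a plain function in an instance."""
    if inspect.isclass(obj) and issubclass(obj, AbstractFunctionStandIn):
        return obj
    if inspect.isfunction(obj):
        return UserDefinedFunctionSketch(obj)
    raise TypeError(f"unsupported function implementation: {obj!r}")


def mod5(id: int) -> int:
    return id % 5


wrapped = wrap_if_plain_function(mod5)
result = wrapped.forward([3, 7, 10])
```

In this sketch the loader returns either a class (which still has to be instantiated) or a ready-made instance, which is why the callers need to tell the two cases apart.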

Integration Test

See the new integration test for an example use case. We load UDFs from a file with the following content:

def mod5(id: int) -> int:
    return id % 5

def isEven(id: int) -> bool:
    return id % 2 == 0
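
As an aside, the type annotations on such functions can be read at load time with the standard library, which is one way the explicit input and output type declarations could be avoided. Whether the implementation actually does this is an assumption; `describe_signature` is a hypothetical helper.

```python
from typing import get_type_hints


def isEven(id: int) -> bool:
    return id % 2 == 0


def describe_signature(func):
    """Split a function's annotations into input types and an output type."""
    hints = get_type_hints(func)
    # The return annotation is stored under the key "return".
    output_type = hints.pop("return", None)
    return hints, output_type


input_types, output_type = describe_signature(isEven)
```

Here `input_types` maps each parameter name to its annotated type, and `output_type` holds the annotated return type, or `None` if the function has no return annotation.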

There is no change in the EvaQL interface. This means that the following queries work:

CREATE FUNCTION mod5 IMPL '<file_location>';
CREATE FUNCTION isEven IMPL '<file_location>';
SELECT mod5(Id) from ....

Todo

  • Creating a UserDefinedFunction wrapper is a hack to make the minimum functionality work within the current implementation. We need to explore other options like creating a separate executor for simple functions.
  • We assume that the functions take one row of the input at a time.
  • This implementation is probably not as expressive as extending the AbstractFunction class.

Prakhar314 • Nov 25 '23 06:11