evadb
feat: Simplifying UDF definition
Problem
- Declaring UDFs is currently cumbersome: it requires a lot of boilerplate code and obscure input and output data type declarations (example).
- Snowflake has a simple interface for UDFs (example) that we could use as a model.
Solution
- Modify `load_function_class_from_file` to handle regular Python functions.
- Now, if the required instance is a Python function instead of a class extending `AbstractFunction`, we create a `UserDefinedFunction`, which is an instance of `AbstractFunction`, to wrap around the given Python function.
- Changes in several other files to distinguish between a Python class and a Python instance in the output of the modified `load_function_class_from_file`.
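The wrapping step described above might look roughly like the sketch below. It is not the code in this PR: the `AbstractFunction` shown here is a minimal stand-in with an assumed `forward` method, and `wrap_if_plain_function` is a hypothetical helper name used only for illustration, not the real `load_function_class_from_file`.

```python
import inspect
from abc import ABC, abstractmethod


class AbstractFunction(ABC):
    """Minimal stand-in for the real base class (assumed interface)."""

    @abstractmethod
    def forward(self, *args):
        ...


class UserDefinedFunction(AbstractFunction):
    """Wraps a plain Python function so it can be used like a class-based UDF."""

    def __init__(self, func):
        self._func = func
        self.name = func.__name__

    def forward(self, *args):
        # Delegate directly to the wrapped function.
        return self._func(*args)


def wrap_if_plain_function(obj):
    """Hypothetical helper: return classes unchanged, wrap plain functions."""
    if inspect.isclass(obj) and issubclass(obj, AbstractFunction):
        return obj  # a class; the caller instantiates it later
    if inspect.isfunction(obj):
        return UserDefinedFunction(obj)  # already an instance
    raise TypeError(f"unsupported UDF object: {obj!r}")


def mod5(id: int) -> int:
    return id % 5


wrapped = wrap_if_plain_function(mod5)
result = wrapped.forward(12)
```

Because the helper can return either a class or an instance, callers have to check which one they received, which is why the other files needed changes.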
Integration Test
See the new integration test for an example use case. We load UDFs from a file with the following content:
    def mod5(id: int) -> int:
        return id % 5

    def isEven(id: int) -> bool:
        return id % 2 == 0
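Since these functions carry type annotations, one plausible way to recover input and output types without extra boilerplate is to read the annotations with the standard library. This is an assumption for illustration only; the PR description does not say how types are derived, and `describe_io` is a hypothetical helper.

```python
import inspect
from typing import get_type_hints


def mod5(id: int) -> int:
    return id % 5


def isEven(id: int) -> bool:
    return id % 2 == 0


def describe_io(func):
    """Hypothetical helper: return (input types by name, output type)."""
    hints = get_type_hints(func)  # fresh dict, safe to mutate
    output_type = hints.pop("return", None)
    params = inspect.signature(func).parameters
    input_types = {name: hints.get(name) for name in params}
    return input_types, output_type


mod5_io = describe_io(mod5)
is_even_io = describe_io(isEven)
```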
There is no change in the EvaQL interface. This means that the following queries work:
CREATE FUNCTION mod5 IMPL '<file_location>';
CREATE FUNCTION isEven IMPL '<file_location>';
SELECT mod5(Id) from ....
Todo
- Creating a `UserDefinedFunction` wrapper is a hack to make the minimum functionality work within the current implementation. We need to explore other options, like creating a separate executor for simple functions.
- We assume that the functions take one row of the input at a time.
- This implementation is probably not as expressive as extending the `AbstractFunction` class.
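The row-at-a-time assumption can be illustrated with a small sketch. The batch representation (a list of dicts) and the `apply_row_wise` helper are assumptions made for this example and do not reflect how batches are represented internally.

```python
def mod5(id: int) -> int:
    return id % 5


def apply_row_wise(func, rows, column):
    """Hypothetical helper: call a scalar function once per input row."""
    return [func(row[column]) for row in rows]


batch = [{"Id": 3}, {"Id": 7}, {"Id": 10}]
outputs = apply_row_wise(mod5, batch, "Id")
```

A function that needs to see a whole batch at once (for example, to normalize against a column mean) would not fit this calling pattern, which is one way the simple form is less expressive than a class-based UDF.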