mimir



Replace UDFs/UDAs with Spark's Catalog

Open okennedy opened this issue 5 years ago • 0 comments

At present, user-defined functions (UDFs) and user-defined aggregates (UDAs) can be defined either in Mimir-land or in Spark-land. Moreover:

Spark's UDA/UDF catalog implementation is virtually identical to Mimir's
There's a mountain of libraries that already support Spark
Function and aggregate management accounts for a non-trivial 1k lines of code (or more) in Mimir.

I propose that we defer to Spark's catalog to cut out a ton of redundant code from Mimir. This would require the following changes:

RAToSpark: Could now directly use the Spark catalog to instantiate functions (see the new MimirSQL for a few examples of how this might work)
Typechecker: Would need to use Spark's catalog to check types. This could get a little awkward, since Spark's and Mimir's type systems differ. Would probably require RAToSQL to handle some translations.
Eval / EvalInline: Would now talk to Spark for function execution
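
As a rough illustration of the three changes above, here is a minimal sketch that uses only Spark's public API. It is not Mimir code: the object `SparkCatalogSketch`, its helpers `isDefined`, `returnType` and `eval`, and the example function `plus_one` are all hypothetical names, and the local session is an assumption made for the sake of a runnable example. The idea is that a function is registered once in Spark's registry, the type checker asks Spark's analyzer for the result type instead of consulting a separate Mimir registry, and evaluation is handed to Spark.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.DataType

// Hypothetical sketch, not part of Mimir: every name below is illustrative.
object SparkCatalogSketch {
  // Assumption: a throwaway local session; Mimir would reuse its own session.
  val spark: SparkSession = SparkSession.builder()
    .master("local[1]")
    .appName("spark-catalog-sketch")
    .getOrCreate()

  // Register the function once, in Spark's function registry.
  spark.udf.register("plus_one", (x: Int) => x + 1)

  // Catalog lookup: is a function with this name known to Spark?
  def isDefined(name: String): Boolean =
    spark.catalog.functionExists(name)

  // Type checking: let Spark's analyzer resolve the expression and read the
  // result type from the analyzed schema. No rows are computed for this.
  def returnType(sqlExpr: String): DataType =
    spark.sql(s"SELECT $sqlExpr AS result").schema("result").dataType

  // Evaluation: delegate execution of the expression to Spark.
  def eval(sqlExpr: String): Any =
    spark.sql(s"SELECT $sqlExpr AS result").head().get(0)
}
```

With this in place, `SparkCatalogSketch.returnType("plus_one(41)")` yields a Spark `DataType`, which is where the translation between Spark's and Mimir's type systems mentioned above would have to happen.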

Dec 19 '19 22:12 okennedy

Labels: backend, compiler, eventually
