bug(duckdb, UDF): using the same UDF in two file-based connections errors

Open NickCrews opened this issue 1 year ago • 0 comments

What happened?

This error happens with MotherDuck and with file-based connections. In-memory (:memory:) connections don't show the error.

from pathlib import Path
import ibis


@ibis.udf.scalar.python
def myfunc(x: int) -> int:
    return x + 1


def f(url):
    y = myfunc(ibis.literal(1))
    ibis.duckdb.connect(url).execute(y)
    ibis.duckdb.connect(url).execute(y)


p = Path("test.duckdb")
p.unlink(missing_ok=True)

f(p)  # error
# f("motherduck:") # error
# f(":memory:")  # no error

I am running into this in a Solara application, which does hot code reloading. Every time I edit a page, my get_db() function is re-executed, I get a new connection, and this error appears. I think I can work around it, but it would be great if it were fixed.
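One possible shape for such a workaround is to hand out a single connection per URL instead of reconnecting each time. This is only a sketch, and it rests on an assumption not confirmed above: that the error comes from registering the UDF on a second connection to the same database, so reusing the first connection would avoid it. The `_CONNECTIONS` dict and the `connect` parameter are hypothetical names introduced for illustration. The cache would also have to live in a module that the hot reloader does not re-execute.

```python
from typing import Any, Callable, Dict

# Hypothetical process-wide cache: one connection object per URL.
# It must live in a module that is NOT re-executed on hot reload,
# otherwise the dict is reset and a new connection is created anyway.
_CONNECTIONS: Dict[str, Any] = {}


def get_db(url: str, connect: Callable[[str], Any]) -> Any:
    """Return the cached connection for ``url``, creating it on first use.

    ``connect`` is whatever factory opens the connection, for example
    ``ibis.duckdb.connect``; it is passed in so this sketch stays
    independent of any particular backend.
    """
    key = str(url)
    if key not in _CONNECTIONS:
        _CONNECTIONS[key] = connect(key)
    return _CONNECTIONS[key]
```

With this in place, the application would call something like get_db("test.duckdb", ibis.duckdb.connect) wherever it currently reconnects, so repeated calls return the same backend object.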

What version of ibis are you using?

main

What backend(s) are you using, if any?

duckdb

Relevant log output

---------------------------------------------------------------------------
CatalogException                          Traceback (most recent call last)
Cell In[1], line 19
     16 p = Path("test.duckdb")
     17 p.unlink(missing_ok=True)
---> 19 f(p)  # error
     20 # f("motherduck:") # error
     21 # f(":memory:")  # no error

Cell In[1], line 13
     11 y = myfunc(ibis.literal(1))
     12 ibis.duckdb.connect(url).execute(y)
---> 13 ibis.duckdb.connect(url).execute(y)

File ~/code/ibis/ibis/backends/duckdb/__init__.py:1347, in Backend.execute(self, expr, params, limit, **_)
   1344 import pandas as pd
   1345 import pyarrow.types as pat
-> 1347 table = self._to_duckdb_relation(expr, params=params, limit=limit).arrow()
   1349 df = pd.DataFrame(
   1350     {
   1351         name: (
   (...)
   1363     }
   1364 )
   1365 df = DuckDBPandasData.convert_table(df, expr.as_table().schema())

File ~/code/ibis/ibis/backends/duckdb/__init__.py:1278, in Backend._to_duckdb_relation(self, expr, params, limit)
   1262 def _to_duckdb_relation(
   1263     self,
   1264     expr: ir.Expr,
   (...)
   1267     limit: int | str | None = None,
   1268 ):
   1269     """Preprocess the expr, and return a ``duckdb.DuckDBPyRelation`` object.
   1270 
   1271     When retrieving in-memory results, it's faster to use `duckdb_con.sql`
   (...)
   1276     `duckdb_con.execute` everywhere else.
   1277     """
-> 1278     self._run_pre_execute_hooks(expr)
   1279     table_expr = expr.as_table()
   1280     sql = self.compile(table_expr, limit=limit, params=params)

File ~/code/ibis/ibis/backends/duckdb/__init__.py:1260, in Backend._run_pre_execute_hooks(self, expr)
   1257 if expr.op().find((ops.GeoSpatialUnOp, ops.GeoSpatialBinOp)):
   1258     self.load_extension("spatial")
-> 1260 super()._run_pre_execute_hooks(expr)

File ~/code/ibis/ibis/backends/__init__.py:1029, in BaseBackend._run_pre_execute_hooks(self, expr)
   1027 """Backend-specific hooks to run before an expression is executed."""
   1028 self._define_udf_translation_rules(expr)
-> 1029 self._register_udfs(expr)
   1030 self._register_in_memory_tables(expr)

File ~/code/ibis/ibis/backends/duckdb/__init__.py:1534, in Backend._register_udfs(self, expr)
   1532 registration_func = compile_func(udf_node)
   1533 if registration_func is not None:
-> 1534     registration_func(con)

File ~/code/ibis/ibis/backends/duckdb/__init__.py:1547, in Backend._compile_udf.<locals>.register_udf(con)
   1546 def register_udf(con):
-> 1547     return con.create_function(
   1548         name,
   1549         func,
   1550         input_types,
   1551         output_type,
   1552         type=_UDF_INPUT_TYPE_MAPPING[udf_node.__input_type__],
   1553     )

CatalogException: Catalog Error: Scalar Function with name "myfunc_0" already exists!

Code of Conduct

  • [X] I agree to follow this project's Code of Conduct

NickCrews, Apr 10 '24 20:04