ibis
bug(duckdb, UDF): using the same UDF in two file-based connections errors
What happened?
This error happens when the same UDF is executed on two connections to the same MotherDuck or file-based database. :memory: connections don't show the error.
from pathlib import Path

import ibis


@ibis.udf.scalar.python
def myfunc(x: int) -> int:
    return x + 1

def f(url):
    y = myfunc(ibis.literal(1))
    ibis.duckdb.connect(url).execute(y)
    ibis.duckdb.connect(url).execute(y)


p = Path("test.duckdb")
p.unlink(missing_ok=True)

# f(p) # error
# f("motherduck:") # error
# f(":memory:") # no error
I am running into this in a Solara application, which does hot code reloading: every time I edit a page, my get_db() function is re-executed, I get a new connection, and this error appears. I think I can work around it, but it would be great if it were fixed.
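For reference, one possible workaround (a rough sketch, not something ibis provides) is to reuse a single connection per URL instead of reconnecting on every call. The `_CONNECTIONS` dict and the `connect` parameter below are hypothetical names introduced for illustration, and this only helps if the cache lives in a module that the hot reloader does not re-execute.

```python
from typing import Any, Callable, Dict

# Hypothetical module-level cache (not part of ibis): one connection per URL,
# so repeated get_db() calls reuse it instead of opening a second connection.
_CONNECTIONS: Dict[str, Any] = {}


def get_db(url: str, connect: Callable[[str], Any]) -> Any:
    """Return the cached connection for `url`, creating it on first use.

    `connect` is whatever opens a connection, e.g. ibis.duckdb.connect.
    """
    key = str(url)  # normalize Path objects and strings to the same key
    if key not in _CONNECTIONS:
        _CONNECTIONS[key] = connect(key)
    return _CONNECTIONS[key]
```

Usage would be something like `con = get_db("test.duckdb", ibis.duckdb.connect)`. Since the UDF is then only ever registered on one connection per database, this should sidestep the duplicate registration, but it does not address the underlying bug.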
What version of ibis are you using?
main
What backend(s) are you using, if any?
duckdb
Relevant log output
---------------------------------------------------------------------------
CatalogException Traceback (most recent call last)
Cell In[1], line 19
16 p = Path("test.duckdb")
17 p.unlink(missing_ok=True)
---> 19 f(p) # error
20 # f("motherduck:") # error
21 # f(":memory:") # no error
Cell In[1], line 13
11 y = myfunc(ibis.literal(1))
12 ibis.duckdb.connect(url).execute(y)
---> 13 ibis.duckdb.connect(url).execute(y)
File ~/code/ibis/ibis/backends/duckdb/__init__.py:1347, in Backend.execute(self, expr, params, limit, **_)
1344 import pandas as pd
1345 import pyarrow.types as pat
-> 1347 table = self._to_duckdb_relation(expr, params=params, limit=limit).arrow()
1349 df = pd.DataFrame(
1350 {
1351 name: (
(...)
1363 }
1364 )
1365 df = DuckDBPandasData.convert_table(df, expr.as_table().schema())
File ~/code/ibis/ibis/backends/duckdb/__init__.py:1278, in Backend._to_duckdb_relation(self, expr, params, limit)
1262 def _to_duckdb_relation(
1263 self,
1264 expr: ir.Expr,
(...)
1267 limit: int | str | None = None,
1268 ):
1269 """Preprocess the expr, and return a ``duckdb.DuckDBPyRelation`` object.
1270
1271 When retrieving in-memory results, it's faster to use `duckdb_con.sql`
(...)
1276 `duckdb_con.execute` everywhere else.
1277 """
-> 1278 self._run_pre_execute_hooks(expr)
1279 table_expr = expr.as_table()
1280 sql = self.compile(table_expr, limit=limit, params=params)
File ~/code/ibis/ibis/backends/duckdb/__init__.py:1260, in Backend._run_pre_execute_hooks(self, expr)
1257 if expr.op().find((ops.GeoSpatialUnOp, ops.GeoSpatialBinOp)):
1258 self.load_extension("spatial")
-> 1260 super()._run_pre_execute_hooks(expr)
File ~/code/ibis/ibis/backends/__init__.py:1029, in BaseBackend._run_pre_execute_hooks(self, expr)
1027 """Backend-specific hooks to run before an expression is executed."""
1028 self._define_udf_translation_rules(expr)
-> 1029 self._register_udfs(expr)
1030 self._register_in_memory_tables(expr)
File ~/code/ibis/ibis/backends/duckdb/__init__.py:1534, in Backend._register_udfs(self, expr)
1532 registration_func = compile_func(udf_node)
1533 if registration_func is not None:
-> 1534 registration_func(con)
File ~/code/ibis/ibis/backends/duckdb/__init__.py:1547, in Backend._compile_udf.<locals>.register_udf(con)
1546 def register_udf(con):
-> 1547 return con.create_function(
1548 name,
1549 func,
1550 input_types,
1551 output_type,
1552 type=_UDF_INPUT_TYPE_MAPPING[udf_node.__input_type__],
1553 )
CatalogException: Catalog Error: Scalar Function with name "myfunc_0" already exists!
Code of Conduct
- [X] I agree to follow this project's Code of Conduct