delta-rs
Add DataCatalog support
Environment
Delta-rs version: 0.13.0
Binding: Python (deltalake-0.13.0-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl)
Environment:
- OS: Linux/amd64
- Other:
Bug
When trying to use either flavor of DataCatalog a ValueError is thrown.
What happened:
❯ python3
Python 3.11.4 (main, Jun 28 2023, 19:51:46) [GCC] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from deltalake import DeltaTable, DataCatalog
>>> dt = DeltaTable.from_data_catalog(DataCatalog.AWS, 'db', 'table')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/tyler/source/github/noviconnect/venv/lib64/python3.11/site-packages/deltalake/table.py", line 287, in from_data_catalog
    table_uri = RawDeltaTable.get_table_uri_from_data_catalog(
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: Catalog 'glue' not available.
>>> dt = DeltaTable.from_data_catalog(DataCatalog.UNITY, 'db', 'table')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/tyler/source/github/noviconnect/venv/lib64/python3.11/site-packages/deltalake/table.py", line 287, in from_data_catalog
    table_uri = RawDeltaTable.get_table_uri_from_data_catalog(
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: Catalog 'unity' not available.
>>>
What you expected to happen:
C'mon son.
How to reproduce it:
More details:
take
I was unable to reproduce the glue error, but I am on a Mac. I'll keep poking around, but if you build with native-tls, this could be the reason (along with a few other places where the glue feature is used on its own): https://github.com/delta-io/delta-rs/blob/dd6b45362a14c0f127b32c4b81afc15d17f710d5/crates/deltalake-core/src/data_catalog/mod.rs#L141
As for the unity error, I suspect it is a misleading error caused by the line linked here. It should just return the original error, since that one has the right information. I'll change it: https://github.com/delta-io/delta-rs/blob/dd6b45362a14c0f127b32c4b81afc15d17f710d5/python/src/lib.rs#L136
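To illustrate the point about returning the original error, here is a minimal Python sketch. It is not the delta-rs code (the real logic lives in the Rust bindings linked above); `CatalogError`, `get_catalog`, `lookup_masking`, `lookup_propagating`, and the failure reasons are all hypothetical names and values chosen for the example.

```python
class CatalogError(Exception):
    """Hypothetical stand-in for an error raised while building a catalog."""


def get_catalog(name, enabled):
    # Hypothetical lookup that fails with a specific, useful reason.
    if name not in enabled:
        raise CatalogError(f"support for '{name}' is not compiled into this build")
    if name == "unity":
        # Invented failure reason, for illustration only.
        raise CatalogError("workspace URL is not configured")
    return name


def lookup_masking(name, enabled):
    # Replaces every failure with one generic message, hiding the cause.
    try:
        return get_catalog(name, enabled)
    except CatalogError:
        raise ValueError(f"Catalog '{name}' not available.") from None


def lookup_propagating(name, enabled):
    # Keeps the original reason in the message and chains the exception.
    try:
        return get_catalog(name, enabled)
    except CatalogError as err:
        raise ValueError(f"Catalog '{name}' error: {err}") from err
```

With the masking variant, a missing build feature and a configuration problem produce the same message, so a reader of the traceback cannot tell them apart. The propagating variant surfaces the underlying reason.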
@r3stl355 I have a feeling that this error might still exist in main, albeit with better error messages. I think the problem is that the Linux wheels don't have the glue feature enabled.
I'll have a look; I need to build myself a Linux box first. Are you building with any specific settings or just using the standard build, @rtyler?
Hey, I need to understand the problem better here. I tried this in a Docker container and on an Ubuntu 22.04 VM on AWS, using both a build from source and a released version (deltalake-0.13.0-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl), and I get something like the following, which is a meaningful error:
Traceback (most recent call last):
  File "/home/ubuntu/delta-rs/python/issue_1860.py", line 4, in <module>
    dt = DeltaTable.from_data_catalog(DataCatalog.AWS, 'db', 'table')
  File "/home/ubuntu/delta-rs/python/deltalake/table.py", line 287, in from_data_catalog
    table_uri = RawDeltaTable.get_table_uri_from_data_catalog(
OSError: Catalog glue error: Entity Not Found
@rtyler - what am I missing? I think the "Entity Not Found" error I am getting is coming from Glue, no?
Just confirmed, this is a Glue error from rusoto: https://github.com/delta-io/delta-rs/blob/fa6c5139033a06274dc829e0cf4053f72b0a9887/crates/deltalake-core/src/data_catalog/mod.rs#L62
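The two failure modes seen in this thread can be told apart by exception type and message. The helper below is a rough sketch under that assumption: `describe_catalog_failure` is a hypothetical name, and the matching relies only on the two messages quoted above, which may change between releases.

```python
def describe_catalog_failure(exc):
    """Give a rough diagnosis for an exception raised by from_data_catalog.

    Assumes the message formats quoted in this thread; not an official API.
    """
    message = str(exc)
    if isinstance(exc, ValueError) and message.endswith("not available."):
        # The catalog could not be constructed. Per the discussion, this may
        # mean the feature is missing from the build, or it may mask another error.
        return "catalog unavailable: " + message
    if isinstance(exc, OSError) and message.startswith("Catalog glue error"):
        # The Glue catalog was reached and the service reported a problem,
        # e.g. the database or table does not exist.
        return "glue responded: " + message
    return "unrecognized: " + message
```

It could be called from an `except (ValueError, OSError) as exc:` block wrapped around `DeltaTable.from_data_catalog(DataCatalog.AWS, 'db', 'table')`.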
Reopening this, since we likely want to re-add that once catalogs are working again.