Can't read a Delta table from Azure Unity Catalog

Open MigQ2 opened this issue 1 year ago • 8 comments

Environment

  • OS: Linux
  • Python 3.10.10
  • deltalake==0.10.2
  • Cloud provider: Azure Databricks

Bug

What happened:

I am trying to replicate this example from the documentation to read a Delta Table from Databricks Unity Catalog:

from deltalake import DataCatalog, DeltaTable
catalog_name = 'main'
schema_name = 'db_schema'
table_name = 'db_table'
data_catalog = DataCatalog.UNITY
dt = DeltaTable.from_data_catalog(
    data_catalog=data_catalog,
    data_catalog_id=catalog_name,
    database_name=schema_name,
    table_name=table_name,
)

but I get the following error:

OSError: Generic MicrosoftAzure error: Error performing token request: response error "request error", after 10 retries: error sending request for url (http://<SOME-IP-ADDRESS>/metadata/identity/oauth2/token?api-version=2019-08-01&resource=https%3A%2F%2Fstorage.azure.com): error trying to connect: tcp connect error: Connection refused (os error 111)
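
To see what the failing request was asking for, the URL in the error can be decoded with the Python standard library. This is only a sketch: `host.invalid` is a placeholder for the redacted IP address. The path looks like a managed-identity token endpoint, which would suggest the storage client had no explicit credentials and fell back to requesting a token for Azure Storage. That reading is an assumption, not something the error message states.

```python
from urllib.parse import parse_qs, urlsplit

# URL copied from the error above; host.invalid is a placeholder
# for the redacted IP address.
failing_url = (
    "http://host.invalid/metadata/identity/oauth2/token"
    "?api-version=2019-08-01&resource=https%3A%2F%2Fstorage.azure.com"
)

parts = urlsplit(failing_url)
params = parse_qs(parts.query)  # percent-decodes the query values

print(parts.path)          # path of the token endpoint being called
print(params["resource"])  # resource the token was requested for
```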

Stacktrace:

/home/vscode/.local/lib/python3.10/site-packages/deltalake/table.py:285 in from_data_catalog

    282             database_name=database_name,
    283             table_name=table_name,
    284         )
  ❱ 285         return cls(
    286             table_uri=table_uri, version=version, log_buffer_size=log_buffer_size
    287         )
    288

/home/vscode/.local/lib/python3.10/site-packages/deltalake/table.py:246 in __init__

    243
    244         """
    245         self._storage_options = storage_options
  ❱ 246         self._table = RawDeltaTable(
    247             str(table_uri),
    248             version=version,
    249             storage_options=storage_options,

What you expected to happen:

I expected to be able to read the Delta table.

More details:

  • I can read from the storage account where the data is located using other libraries in the same Python interpreter, so I don't think it's a firewall problem
  • The same host and token work perfectly fine in the same interpreter to read data from the same Unity Catalog table using databricks-connect, so the URL and token are valid
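
Since the host and token work elsewhere, one possible workaround is to resolve the table's storage location separately and open it with explicit storage credentials. The sketch below is untested against this setup and rests on several assumptions: the helper names (`unity_table_url`, `fetch_storage_location`, `open_table`) are hypothetical, the REST path `/api/2.1/unity-catalog/tables/{full_name}` and its `storage_location` response field are assumed from the Databricks Unity Catalog API, and the contents of `storage_options` depend on how the storage account is secured.

```python
import json
import urllib.request


def unity_table_url(workspace_url: str, catalog: str, schema: str, table: str) -> str:
    """Build the (assumed) Unity Catalog REST URL for a three-level table name."""
    full_name = f"{catalog}.{schema}.{table}"
    return f"{workspace_url.rstrip('/')}/api/2.1/unity-catalog/tables/{full_name}"


def fetch_storage_location(
    workspace_url: str, token: str, catalog: str, schema: str, table: str
) -> str:
    """Ask the workspace where the table's files live (performs a network call)."""
    request = urllib.request.Request(
        unity_table_url(workspace_url, catalog, schema, table),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as response:
        # Assumption: the response JSON carries a "storage_location" field.
        return json.load(response)["storage_location"]


def open_table(
    workspace_url: str,
    token: str,
    catalog: str,
    schema: str,
    table: str,
    storage_options: dict,
):
    """Open the table by URI, passing storage credentials explicitly."""
    # Imported here so the helpers above work without deltalake installed.
    from deltalake import DeltaTable

    location = fetch_storage_location(workspace_url, token, catalog, schema, table)
    # storage_options might hold, for example, an account name plus a key
    # or SAS token; the right keys depend on the storage setup.
    return DeltaTable(location, storage_options=storage_options)
```

If `open_table` succeeds with the same host and token that already work for databricks-connect, that would suggest the failure is limited to how storage credentials are resolved rather than to the catalog lookup itself.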

MigQ2 • Sep 14 '23 23:09