
When table is created with name, register it in data catalog

Open wjones127 opened this issue 2 years ago • 3 comments

Description

#587 allows providing a name for a Delta table. However, the name won't immediately work for users, since we don't register the table anywhere. We should provide a way to automatically register tables in a catalog when they are created. Other changes, like schema changes or table deletions, may also require an update to the catalog.

Use Case

Related Issue(s)

wjones127 avatar Apr 17 '22 00:04 wjones127

@wjones127 can you give me feedback on https://github.com/edmondo1984/delta-rs/commit/b524f66322a388fecb9670a95676329fe9df567c as a starting point? I am a little confused, since I found that the call to the catalog was actually within the Python folder and not something shared by all bindings.

edmondop avatar Jul 03 '22 00:07 edmondop

I wonder if it would be better if we made table registration a separate call. Something like:

catalog = DataCatalog(service=DataCatalogService.Glue, id="my-catalog-arn")
table = write_deltalake(data, "path/to/table", name="db.table_name")
catalog.register(table)
table = DeltaTable.from_data_catalog(catalog, "db.table_name")

Then you could have DataCatalog in Python map more closely to the DataCatalog trait and its concrete implementations.
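
To make the shape of that explicit-registration flow concrete, here is a rough, self-contained sketch. Everything in it is hypothetical: `InMemoryCatalog` and `TableRef` are stand-ins invented for illustration, not part of deltalake, and a real implementation would call out to Glue (or another catalog service) instead of filling a dict. It only shows registration as a step separate from writing the table.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class DataCatalogService(Enum):
    # Only one service here; the name mirrors the proposal above.
    GLUE = "glue"


@dataclass
class TableRef:
    """Hypothetical stand-in for a written Delta table: a name plus a location."""
    name: str
    uri: str


@dataclass
class InMemoryCatalog:
    """Hypothetical catalog that maps "db.table" names to storage locations."""
    service: DataCatalogService
    id: str
    tables: Dict[str, str] = field(default_factory=dict)

    def register(self, table: TableRef) -> None:
        # Registration is its own call, so writers that never need a
        # catalog do not pay for it.
        if "." not in table.name:
            raise ValueError(f"expected 'db.table' name, got {table.name!r}")
        self.tables[table.name] = table.uri

    def lookup(self, name: str) -> str:
        # Resolve a registered name back to the table location.
        return self.tables[name]


catalog = InMemoryCatalog(service=DataCatalogService.GLUE, id="my-catalog-arn")
table = TableRef(name="db.table_name", uri="path/to/table")
catalog.register(table)
print(catalog.lookup("db.table_name"))
```

Schema changes or table deletion would need matching `update` / `unregister` calls on the catalog; those are left out of this sketch.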

wjones127 avatar Jul 09 '22 01:07 wjones127

+1 on @wjones127's explicit table registration proposal :) The separation of concerns here also avoids unnecessary overhead in table creation when users don't need to register the table in the catalog.

houqp avatar Jul 13 '22 04:07 houqp

I can look into this for the Databricks Unity Catalog. It's one of the use cases I need to touch at work in a couple of months.

@houqp @wjones127 is this still a desired feature?

ion-elgreco avatar Oct 08 '23 18:10 ion-elgreco