delta-rs
When table is created with name, register it in data catalog
Description
#587 allows providing a name for a Delta table. However, the name won't immediately work for users since we don't register the table anywhere. We should provide a way to automatically register tables in a catalog when they are created. Other changes, such as schema changes or table deletion, may also require an update to the catalog.
Use Case
Related Issue(s)
@wjones127 can you give me feedback on https://github.com/edmondo1984/delta-rs/commit/b524f66322a388fecb9670a95676329fe9df567c as a starting point? I am a little confused, since I found that the call to the catalog is actually within the Python folder and not something shared by all bindings.
I wonder if it would be better if we made table registration a separate call. Something like:
catalog = DataCatalog(service=DataCatalogService.Glue, id = "my-catalog-arn")
table = write_deltalake(data, "path/to/table", name="db.table_name")
catalog.register(table)
table = DeltaTable.from_data_catalog(catalog, "db.table_name")
Then you could have DataCatalog in Python map more closely to the DataCatalog trait and its concrete implementations.
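To make the proposal above concrete, here is a minimal, self-contained sketch of the "write first, register explicitly" flow. It does not use deltalake at all: `InMemoryCatalog`, `Table`, `write_table`, and `register_table` are hypothetical names invented for illustration, and the in-memory dict only stands in for a real service such as Glue.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Protocol


class DataCatalog(Protocol):
    """Hypothetical catalog interface for this sketch; not the deltalake API."""

    def register(self, name: str, location: str) -> None: ...

    def get_table_location(self, name: str) -> str: ...


@dataclass
class InMemoryCatalog:
    """Toy stand-in for a real catalog service (assumption: name -> location map)."""

    tables: Dict[str, str] = field(default_factory=dict)

    def register(self, name: str, location: str) -> None:
        # Assumes names follow the "db.table_name" form used in the proposal.
        if "." not in name:
            raise ValueError(f"expected 'db.table_name', got {name!r}")
        self.tables[name] = location

    def get_table_location(self, name: str) -> str:
        return self.tables[name]


@dataclass
class Table:
    """Placeholder for a written table: a location plus an optional name."""

    location: str
    name: Optional[str] = None


def write_table(location: str, name: Optional[str] = None) -> Table:
    # Writing never touches a catalog, so unregistered tables pay no overhead.
    return Table(location=location, name=name)


def register_table(catalog: DataCatalog, table: Table) -> None:
    # Registration is a separate, explicit step that requires a table name.
    if table.name is None:
        raise ValueError("table has no name to register")
    catalog.register(table.name, table.location)


catalog = InMemoryCatalog()
table = write_table("path/to/table", name="db.table_name")
register_table(catalog, table)
location = catalog.get_table_location("db.table_name")
```

Keeping registration out of the write path means the catalog interface can stay small, and each backend only has to implement the two lookup and register methods.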
+1 on @wjones127's explicit table registration proposal :) The separation of concerns here also avoids unnecessary overhead in table creation when users don't need to register the table in a catalog.
I can look into this for the Databricks Unity Catalog. It's one of the use cases I need to touch at work in a couple of months.
@houqp @wjones127 is this still a desired feature?