iceberg
Action: support Spark 3 and custom catalogs
In Spark 2.4, the action uses `spark.read().format("iceberg").load(table)` to load the table as a dataset. But it can only load tables from the default Spark catalog (`spark_catalog`), and the table is identified as `db.tablename`.
If I want to use the action on tables that live in another custom Iceberg catalog, it throws an exception; see #1652.
When you deal with DataFrameWriterV1 you may want to "forget about" catalog, and also want to "double-check about" how the "path" is interpreted for the format (data source).
Btw, what's the use case for touching tables in another catalog?
I thought you could still load it directly from the path (ignoring the other catalogs?)
I think I understand the point now,
We want to invoke an action on a table whose identifier cannot be read using the current logic in BaseAction to generate metadata tables.
@RussellSpitzer
I'm working on the SQL extension (remove orphan files) and want to use the Spark action to implement it. But the current actions only support Spark 2.
Yep we'll have to make some modifications. I think probably a first step is just only making the extensions work for the default catalog. But otherwise we need to start extending some of the methods in the base class so they properly handle multi-part identifiers and we determine how to load up metadata tables properly for tables in other catalogs.
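To make "properly handle multi-part identifiers" concrete, here is a minimal sketch of the kind of parsing the base class would need. `TableRef`, `parse`, and `metadataTableName` are hypothetical names for illustration, not Iceberg or Spark APIs. The sketch assumes the first part is a catalog only when it matches a catalog name configured in the session, and that a metadata table is addressed by appending its type (for example `files`) to the table name. It does not handle quoted or backticked parts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper (not part of Iceberg or Spark): splits a dotted
// table name into an optional catalog, a namespace, and a table name.
final class TableRef {
  final String catalog;          // null means "use the default catalog"
  final List<String> namespace;  // zero or more namespace levels
  final String name;             // the table name itself

  private TableRef(String catalog, List<String> namespace, String name) {
    this.catalog = catalog;
    this.namespace = namespace;
    this.name = name;
  }

  // knownCatalogs is an assumed input: the catalog names configured
  // in the current session.
  static TableRef parse(String identifier, List<String> knownCatalogs) {
    if (identifier == null || identifier.isEmpty()) {
      throw new IllegalArgumentException("Empty table identifier");
    }
    List<String> parts = Arrays.asList(identifier.split("\\."));
    if (parts.isEmpty()) {
      throw new IllegalArgumentException("Invalid table identifier: " + identifier);
    }
    String catalog = null;
    // Treat the first part as a catalog only if it names a known catalog
    // and at least one more part remains for the table name.
    if (parts.size() > 1 && knownCatalogs.contains(parts.get(0))) {
      catalog = parts.get(0);
      parts = parts.subList(1, parts.size());
    }
    List<String> namespace = Collections.unmodifiableList(
        new ArrayList<>(parts.subList(0, parts.size() - 1)));
    String name = parts.get(parts.size() - 1);
    return new TableRef(catalog, namespace, name);
  }

  // Rebuilds the dotted name, including the catalog when one was given.
  String fullName() {
    List<String> all = new ArrayList<>();
    if (catalog != null) {
      all.add(catalog);
    }
    all.addAll(namespace);
    all.add(name);
    return String.join(".", all);
  }

  // Assumption: the metadata table name is the table name plus a type suffix.
  String metadataTableName(String type) {
    return fullName() + "." + type;
  }
}
```

With `my_catalog` as the only known catalog, `TableRef.parse("my_catalog.db.events", ...)` keeps the catalog prefix when building the metadata table name, while `db.events` falls back to the default catalog, which matches the current behaviour described above.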
Are there any pull requests for these modifications?
Not yet, but obviously we will need to make some :)
Is PR #1525 about supporting Spark 3 actions? @RussellSpitzer
This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.
This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'