iceberg
iceberg copied to clipboard
Nessie: Use unique path for different table with same name
In Nessie catalog, Each branch can have a new table with the same name as a table in another branch. When this happens, Nessie is using the same table path two different tables (when same warehouse location is configured).
So, use the table UUID in the table path to avoid two different tables sharing the same path.
Steps:
docker run -p 19120:19120 projectnessie/nessie
/Users/ajantha/Documents/tools/spark-3.0.3-bin-hadoop2.7/bin/spark-sql --packages org.slf4j:slf4j-api:1.7.32 --jars /Users/ajantha/Downloads/iceberg-spark-runtime-3.0_2.12-0.14.0-SNAPSHOT.jar,/Users/ajantha/Downloads/nessie-spark-extensions-0.28.0.jar \
--conf spark.sql.catalog.nessie=org.apache.iceberg.spark.SparkCatalog\
--conf spark.sql.catalog.nessie.warehouse=$PWD/nessie2 \
--conf spark.sql.catalog.nessie.catalog-impl=org.apache.iceberg.nessie.NessieCatalog \
--conf spark.sql.catalog.nessie.uri=http://localhost:19120/api/v1 \
--conf spark.sql.catalog.nessie.ref=main \
--conf spark.sql.catalog.nessie.auth-type=NONE \
--conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions,org.projectnessie.spark.extensions.NessieSparkSessionExtensions
create branch dev in nessie from main;
use reference dev in nessie;
create table nessie.db1.t1(id int) using iceberg;
insert into nessie.db1.t1 select 42;
use reference main in nessie;
create table nessie.db1.t1(place string, name string) using iceberg;
insert into nessie.db1.t1 select 'x','y';
-- Now the path $warehouse/db1/t1
has data and metadata files from both the tables.
Why this needs to be fixed ?
remove_orphan
files does a list files (BaseDeleteOrphanFilesSparkAction#buildActualFileDF()
-> listDirsRecursively
) and try to remove the files that are not reachable from metadata. So, it will clean up other table in this case.
Not sure if there are any other Iceberg operations that do list files. If it is there. It will get impacted too.
cc: @nastra, @snazy, @RussellSpitzer, @rdblue
ping...
@snazy , @nastra : Addressed the comments. Please have a look at it again.
@RussellSpitzer could you review/merge this one please?
@ajantha-bhat, do you think Nessie could move to the REST catalog API? That API fixes several problems, like the one here. The table location should be something that the catalog endpoint can override, rather than requiring the client to produce the correct one.
@ajantha-bhat, do you think Nessie could move to the REST catalog API? That API fixes several problems, like the one here. The table location should be something that the catalog endpoint can override, rather than requiring the client to produce the correct one.
I am not sure at the moment. I believe REST catalog is a client for some SERVER that implements the endpoints for these REST catalog calls. Nessie already has its own end points which are not same as what REST catalog expects. I think it is too late to change Nessie endpoints as it will break the compatibility. I might be wrong here. Please correct me If I am mistaken.
I rebased and resolved the conflict.
I'm a bit out-of-date of the REST catalog impl., but could it have a common base class with the NessieCatalog perhaps?.. just an idea (for a future change)
@rdblue , @RussellSpitzer : As there was no response, I assumed people are more inclined toward modifying the default warehouse location. I have modified it as the same. Please help in merging this issue fix.
I'm a bit out-of-date of the REST catalog impl., but could it have a common base class with the NessieCatalog perhaps?.. just an idea (for a future change)
I think what makes the most sense is to eventually use the same protocol, especially since the REST protocol solves a number of these challenges (like delegating location defaults to the service). It also uses a diff-based approach that can avoid a lot more conflicts. I'm not sure how easy it would be for Nessie to update to be compatible with that client API, though. And I imagine it isn't a high priority just to avoid commit conflicts and enable metadata caching.
@rdblue: I have squashed the commits and rebased because of the spotless apply.
@rdblue , @RussellSpitzer : ping...
ping @rdblue
@gaborkaszab: I would like to get this one in 1.1 release.
@RussellSpitzer , @nastra : I have reverted the base test class changes and handled the nit.
@RussellSpitzer, @rdblue: Can we please merge this PR if it is ok?
@RussellSpitzer: Thanks for the review and approval.
I think we can merge this as there is already a consensus for the fix (from everyone). The confusion and discussions were mainly about base test case refactoring. This can be done in follow-up too if there is any concern.
@gaborkaszab: Can we please add 1.1.0 milestone
label for this?