
[Bug]: GC causing "org.apache.iceberg.exceptions.NotFoundException: File does not exist"

Open dorsegal opened this issue 1 year ago • 14 comments

What happened

After I ran GC I started to get a "file does not exist" error. It looks like the file was deleted from storage, but its reference was not removed from the table metadata.

https://github.com/apache/iceberg/issues/8338

How to reproduce it

  1. Create a table
  2. Add data
  3. Run GC with the nessie-gc command
  4. Try to read all data from the table again (a sketch of these steps is below)
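
A minimal sketch of these steps, assuming a Spark session configured with a Nessie catalog named nessie and a placeholder table nessie.db.tbl (names are illustrative, not from my actual setup):

  spark.sql("CREATE TABLE nessie.db.tbl (id BIGINT, data STRING) USING iceberg")
  spark.sql("INSERT INTO nessie.db.tbl VALUES (1, 'a'), (2, 'b')")
  # run GC from the command line, e.g. the invocation used later in this thread
  # (plus whatever connection/warehouse options your setup needs):
  #   java -jar nessie-gc-0.99.0.jar gc
  spark.sql("SELECT * FROM nessie.db.tbl").show()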

Nessie server type (docker/uber-jar/built from source) and version

kubernetes 0.99.0

Client type (Ex: UI/Spark/pynessie ...) and version

Spark

Additional information

No response

dorsegal avatar Oct 13 '24 22:10 dorsegal

Maybe this is related as well https://github.com/projectnessie/nessie/issues/8263

dorsegal avatar Oct 13 '24 22:10 dorsegal

@dorsegal can you provide more details, or better a reproducer?

snazy avatar Oct 14 '24 07:10 snazy

My setup uses the Kafka Iceberg connector: create a table and use the Kafka Iceberg sink to write data into it. After several commits, run GC and then try to rewrite files using Spark.

spark.sql( "CALL nessie.system.rewrite_data_files(table => 'table', options => map('partial-progress.enabled','true', 'max-concurrent-file-group-rewrites', '30'))" ).show()

I can provide more logs if needed just don't know which one. From GC logs I see that it deleted some files.

dorsegal avatar Oct 14 '24 08:10 dorsegal

What I meant is a full reproducer mentioning every step starting from scratch, so that someone can get to the same behavior on a "clean"/empty environment.

snazy avatar Oct 14 '24 09:10 snazy

java -jar nessie-gc-0.99.0.jar gc  

failed:

Caused by: java.lang.RuntimeException: Failed to get manifest files for ICEBERG_TABLE robot_dev.robot_data, content-ID fc122060-bf21-44d3-b776-fbecb2d23715 at commit 47a35e9867b0408c65feb09d7140a29d198354edf3a1aa0dc7cc09d192b07c27 via s3://ice-lake/robot_dev/robot_data/metadata/00000-6b65b94d-6370-4c43-9baa-b40ed0770c5d.metadata.json

After expiring snapshots in Spark SQL:

CALL nessie.system.expire_snapshots('nessie.robot_dev.robot_data', TIMESTAMP '2024-10-15 00:00:00.000', 1)

The snapshot count was reduced and the manifest files were deleted, but the Nessie metadata may not have been synced with those snapshot changes.
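
To check which metadata files and snapshots the table still references after the expiry, Iceberg's metadata tables can be queried from Spark (table name taken from the error above; assumes the table is reachable through the nessie catalog):

  spark.sql("SELECT * FROM nessie.robot_dev.robot_data.metadata_log_entries").show()
  spark.sql("SELECT committed_at, snapshot_id, manifest_list FROM nessie.robot_dev.robot_data.snapshots").show()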

@snazy

yunlou11 avatar Oct 18 '24 02:10 yunlou11

@yunlou11 thanks for the information. But what are all the necessary steps to get to that error message? Aka, what was all done before running GC?

snazy avatar Nov 04 '24 17:11 snazy

@yunlou11 thanks for the information. But what are all the necessary steps to get to that error message? Aka, what was all done before running GC?

Sorry, it may be an Iceberg error, not a Nessie one:

spark.sql("call demo.system.remove_orphan_files(table => 'xxx').show()")

iceberg issue: apache/iceberg#7914

When I fixed the Iceberg remove_orphan_files S3 exception by following the suggestion "Use SupportsPrefixOperations for Remove OrphanFile Procedure", I got the nessie-gc failure, but it is a bug in the fixed Iceberg code, not in Nessie.
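
For reference, remove_orphan_files also accepts an older_than cutoff so that files still being written are not deleted; a rough example (catalog/table names and the timestamp are placeholders):

  spark.sql("CALL demo.system.remove_orphan_files(table => 'xxx', older_than => TIMESTAMP '2024-11-01 00:00:00')").show()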

yunlou11 avatar Nov 07 '24 02:11 yunlou11

Was this ever resolved? @dorsegal how did you fix the missing file reference?

justas200 avatar Dec 05 '24 09:12 justas200

@justas200 I didn't manage to fix it. I had to recreate the Nessie catalog, which was painful. I am too afraid to test it again as I don't want to lose data.

I didn't have the time to create a test environment to reproduce it and write it up here. But when it happened to me, this is what I did: create an Iceberg table using Spark, insert data into the Iceberg table, run GC with the GC command, then use Spark to read ALL the data.

dorsegal avatar Dec 05 '24 11:12 dorsegal

Hi, any update on this? It seems to happen quite frequently in the following scenario:

  • Kafka to Iceberg (with Kafka connector)
  • Periodically call to nessie.system.rewrite_data_files(...)
  • Then call nessie-gc

lome avatar Apr 03 '25 16:04 lome

Hi, any update on this? It seems to happen quite frequently in the following scenario:

  • Kafka to Iceberg (with Kafka connector)
  • Periodically call to nessie.system.rewrite_data_files(...)
  • Then call nessie-gc

I have the same setup. Never used the GC ever since...

dorsegal avatar Apr 03 '25 17:04 dorsegal

@dorsegal FYI, it looks like adding the max-file-modification parameter to the GC call may solve your problem as well. From what I was able to understand, files that were being written while GC was running were marked as deletable even though they were not.

--max-file-modification="$(date -d '1 hours ago' +"%Y-%m-%dT%H:%M:%SZ")"
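
Combined with the invocation shown earlier in this thread, the full call looks roughly like this (the Nessie URI and the one-hour cutoff are placeholders; other connection/warehouse options depend on your setup):

  java -jar nessie-gc-0.99.0.jar gc \
    --uri http://nessie:19120/api/v2 \
    --max-file-modification="$(date -d '1 hour ago' +"%Y-%m-%dT%H:%M:%SZ")"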

see also https://github.com/projectnessie/nessie/issues/10046

lome avatar Apr 08 '25 14:04 lome

In case anyone comes across this issue and has lost actual Parquet (or ORC, etc.) files: what I did was create a table with the same name and schema in the dev environment (another schema should work too) and insert a single row with all null/whatever values (important that they don't match anything in the table you're fixing).

Then, each time the query failed, I copied that Parquet file from the dev environment to the path the error was complaining about (e.g. s3a://iceberg/logs/events_b7bb2c4b-c131-4745-b5cd-a7a71d305846/data/0000/0101/0011/10100000/ts_day=2025-05-16/kafka_partition=4/00001-1747379360621-4f80847e-ee2d-47b5-afc6-cd220fd1d36a-00002.parquet) until it didn't fail anymore. Then I ran a delete on the null/whatever values I had used. This allowed me to get past the query failures, albeit with data loss.

Avoiding data loss would have been slightly more complex but possible: launch a new Kafka Connect consumer group for the topic that points at a temporary table, then use that temporary table to overwrite the partitions where there was data loss. It seems like that would have been necessary had I lost metadata files instead.
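
A rough sketch of that workaround (catalog, schema, table, and column names are placeholders; the actual data-file path comes from the error message):

  # 1. Create a placeholder table with the same schema and insert one row that matches nothing real
  spark.sql("CREATE TABLE nessie.dev.events_fix (id BIGINT, data STRING) USING iceberg")
  spark.sql("INSERT INTO nessie.dev.events_fix VALUES (NULL, NULL)")
  # 2. Copy the placeholder data file over each missing path the failing query complains about, e.g.:
  #    aws s3 cp s3://dev-bucket/.../placeholder-file.parquet s3a://iceberg/logs/.../missing-file.parquet
  # 3. Once reads succeed again, delete the placeholder values from the table being repaired
  spark.sql("DELETE FROM nessie.logs.events WHERE id IS NULL")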

agrubb86 avatar May 19 '25 17:05 agrubb86