
Cannot commit, found new delete for replaced data file

Open zhushanwei opened this issue 2 years ago • 4 comments

Query engine

iceberg-flink-runtime-1.14-0.14.0.jar. Help me, thanks.

Question

org.apache.iceberg.exceptions.ValidationException: Cannot commit, found new delete for replaced data file: GenericDataFile{content=data, file_path=hdfs://dev-001:8020/iceberg/flink_hive_iceberg/flink_hive_db.db/test_repository_1/data/news_postdate=2022-07-31/00002-0-8b3590ea-a593-4734-b84a-a6084a426b95-00093.parquet, file_format=PARQUET, spec_id=0, partition=PartitionData{news_postdate=2022-07-31}, record_count=106, file_size_in_bytes=110049, column_sizes=null, value_counts=null, null_value_counts=null, nan_value_counts=null, lower_bounds=null, upper_bounds=null, key_metadata=null, split_offsets=[4], equality_ids=null, sort_order_id=null}
    at org.apache.iceberg.exceptions.ValidationException.check(ValidationException.java:50)
    at org.apache.iceberg.MergingSnapshotProducer.validateNoNewDeletesForDataFiles(MergingSnapshotProducer.java:418)
    at org.apache.iceberg.MergingSnapshotProducer.validateNoNewDeletesForDataFiles(MergingSnapshotProducer.java:367)
    at org.apache.iceberg.BaseRewriteFiles.validate(BaseRewriteFiles.java:108)
    at org.apache.iceberg.SnapshotProducer.apply(SnapshotProducer.java:175)
    at org.apache.iceberg.SnapshotProducer.lambda$commit$2(SnapshotProducer.java:296)
    at org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:404)
    at org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:214)
    at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:198)
    at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:190)
    at org.apache.iceberg.SnapshotProducer.commit(SnapshotProducer.java:295)
    at org.apache.iceberg.actions.BaseSnapshotUpdateAction.commit(BaseSnapshotUpdateAction.java:41)
    at org.apache.iceberg.actions.BaseRewriteDataFilesAction.doReplace(BaseRewriteDataFilesAction.java:298)
    at org.apache.iceberg.actions.BaseRewriteDataFilesAction.replaceDataFiles(BaseRewriteDataFilesAction.java:277)
    at org.apache.iceberg.actions.BaseRewriteDataFilesAction.execute(BaseRewriteDataFilesAction.java:252)

zhushanwei avatar Aug 01 '22 10:08 zhushanwei

I am using upsert mode.

zhushanwei avatar Aug 01 '22 10:08 zhushanwei

The reason is that Iceberg found delete files associated with the data files you want to rewrite. Can you show me your case?
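If it helps to check that on your table, here is a quick sketch of my own (it assumes `table` is the Iceberg Table loaded via your TableLoader) that lists the delete files Iceberg associates with each data file:

// Sketch only: walk the current scan plan and print any delete files attached to each data file.
import java.io.IOException;
import org.apache.iceberg.FileScanTask;
import org.apache.iceberg.Table;
import org.apache.iceberg.io.CloseableIterable;

try (CloseableIterable<FileScanTask> tasks = table.newScan().planFiles()) {
  for (FileScanTask task : tasks) {
    if (!task.deletes().isEmpty()) {
      System.out.println("data file: " + task.file().path());
      task.deletes().forEach(delete -> System.out.println("  delete file: " + delete.path()));
    }
  }
} catch (IOException e) {
  throw new RuntimeException(e);
}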

hzluting avatar Aug 02 '22 07:08 hzluting

The reason is that Iceberg found delete files associated with the data files you want to rewrite. Can you show me your case?

// writer data
Configuration conf = new Configuration();
Map<String, String> properties = new HashMap<>();
CatalogLoader catalogLoader = CatalogLoader.hive(Constants.CATALOG_NAME, conf, properties);
TableIdentifier tableIdentifier = TableIdentifier.of(Constants.DATABASE_NAME, Constants.TABLE_NAME);
TableLoader tableLoader = TableLoader.fromCatalog(catalogLoader, tableIdentifier);
FlinkSink.forRowData(input)
    .writeParallelism(parallelism)
    .tableLoader(tableLoader)
    .upsert(true)
    .overwrite(false)
    .append();

// rewriteDataFiles
StreamExecutionEnvironment env = FlinkEnvironment.getEnvironment(parallelism);
tableLoader.open();
Table table = tableLoader.loadTable();

Actions.forTable(env, table)
    .rewriteDataFiles()
    .maxParallelism(parallelism)
    .targetSizeInBytes(256 * 1024 * 1024)
    .filter(Expressions.equal("news_postdate", newsPostdate))
    .execute();

Is there any other way to solve this problem? Thanks

zhushanwei avatar Aug 02 '22 10:08 zhushanwei

I also hit this problem in the same case. It is not "some delete files associated with the data file" that causes it. Add a log statement at the tail of https://github.com/apache/iceberg/blob/5a15efc070ab59eeda6343998aa065c0c9892c5c/core/src/main/java/org/apache/iceberg/DeleteFileIndex.java#L151 to print the data file path, the delete file path, and the lower and upper bounds. You will see that the bounds recorded for the file path are not the complete path but are truncated to 16 characters, which can produce false positives when Iceberg decides whether a delete file references a data file. From the source at https://github.com/apache/iceberg/blob/5a15efc070ab59eeda6343998aa065c0c9892c5c/core/src/main/java/org/apache/iceberg/MetricsConfig.java#L52 you can see that DEFAULT_WRITE_METRICS_MODE_DEFAULT is truncate(16), so the file_path bounds were truncated when the files were written, which leads to the misjudgment when the rewrite commits. To resolve this, set a table property like this when creating the table:

alter table iceberg_table set tblproperties (
  'write.metadata.metrics.default'='full'
);
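If you would rather set this from the Java side instead of SQL, here is a rough sketch of my own (it assumes `table` is the Iceberg Table loaded via TableLoader, as in the snippets above; TableProperties.DEFAULT_WRITE_METRICS_MODE is the constant for 'write.metadata.metrics.default'):

// Sketch only: switch the default write metrics mode to full via the table API.
import org.apache.iceberg.Table;
import org.apache.iceberg.TableProperties;

table.updateProperties()
    .set(TableProperties.DEFAULT_WRITE_METRICS_MODE, "full")
    .commit();

Either way, the change only applies to files written after the property is set; metrics already recorded for existing data and delete files stay truncated.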

Shane-Yu avatar Aug 12 '22 08:08 Shane-Yu

I also hit this problem in the same case. It is not "some delete files associated with the data file" that causes it. Add a log statement at the tail of

https://github.com/apache/iceberg/blob/5a15efc070ab59eeda6343998aa065c0c9892c5c/core/src/main/java/org/apache/iceberg/DeleteFileIndex.java#L151

to print the data file path, the delete file path, and the lower and upper bounds. You will see that the bounds recorded for the file path are not the complete path but are truncated to 16 characters, which can produce false positives when Iceberg decides whether a delete file references a data file. From the source at https://github.com/apache/iceberg/blob/5a15efc070ab59eeda6343998aa065c0c9892c5c/core/src/main/java/org/apache/iceberg/MetricsConfig.java#L52

you can see that DEFAULT_WRITE_METRICS_MODE_DEFAULT is truncate(16), so the file_path bounds were truncated when the files were written, which leads to the misjudgment when the rewrite commits. To resolve this, set a table property like this when creating the table: alter table iceberg_table set tblproperties ( 'write.etadata.metrics.default'='full' );

Yes, you are right! I'm wondering why I can't reproduce this problem. But the property key is write.metadata.metrics.default

hzluting avatar Aug 12 '22 09:08 hzluting

Is this problem solved? I hit the same error.

chenwyi2 avatar Sep 22 '22 09:09 chenwyi2

Is this problem solved? I hit the same error.

create table properties:

'write.distribution-mode'='hash',
'commit.manifest.min-count-to-merge'='2',
'format-version'='2',
'write.upsert.enable'='true',
'write.metadata.metrics.default'='full',
'write.metadata.delete-after-commit.enabled'='true',
'write.metadata.previous-versions-max'='1'

writer data:

FlinkSink.forRowData(input)
    .writeParallelism(parallelism)
    .tableLoader(tableLoader)
    .overwrite(false)
    .append();

I used these configurations, and the result is normal.
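To double-check what is actually in effect on the table, here is a small sketch of my own (again assuming the Table loaded via TableLoader from the snippets above):

// Sketch only: print the write-related table properties discussed in this thread.
import org.apache.iceberg.Table;

for (String key : new String[] {
    "write.upsert.enable",
    "write.metadata.metrics.default",
    "write.distribution-mode"}) {
  System.out.println(key + " = " + table.properties().get(key));
}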

zhushanwei avatar Oct 17 '22 10:10 zhushanwei

@RussellSpitzer @stevenzwu This looks like a bug. Would it be possible to fix it by defaulting the metrics mode of the file_path column to full?

lintingbin avatar Nov 01 '22 07:11 lintingbin

@Shane-Yu I added 'write.metadata.metrics.default'='full' to the table and printed the log message for the upper bound: "upper java.nio.HeapByteBuffer[pos=0 lim=16 cap=16], fromByteBuffer qbfs://online010". "pos=0 lim=16 cap=16" means it is still truncated to 16 bytes, right? So it doesn't work?
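For reference, this is roughly how that bound can be decoded (a sketch of my own; `bound` stands for a ByteBuffer taken from a delete file's upperBounds() map, not a variable from the code above):

// Sketch only: decode a file_path bound; a 16-byte buffer means it is still the truncate(16) prefix.
import java.nio.ByteBuffer;
import org.apache.iceberg.types.Conversions;
import org.apache.iceberg.types.Types;

int len = bound.remaining();
CharSequence decoded = Conversions.fromByteBuffer(Types.StringType.get(), bound);
System.out.println("bound bytes=" + len + ", value=" + decoded);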

chenwyi2 avatar Nov 15 '22 10:11 chenwyi2

'write.distribution-mode'='hash', 'commit.manifest.min-count-to-merge'='2', 'format-version'='2', 'write.upsert.enable'='true', 'write.metadata.metrics.default'='full', 'write.metadata.delete-after-commit.enabled'='true', 'write.metadata.previous-versions-max'='1'

Your configuration 'write.upsert.enable'='true' is wrong, so your job is not actually running in upsert mode. Do you have another solution? Anyway, thanks!

skywalker2256 avatar Apr 04 '23 11:04 skywalker2256