Referencing Spark/Iceberg's `SizeBasedFileRewritePlanner`, the number of output files is determined intelligently:

| Condition | Decision |
| ----------- | ----------- |
| Remainder > minFileSize (default 0.75 * targetSize) | ... |
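A minimal sketch of that size-based decision, under my reading of the planner (the method name, `minFileSizeRatio` parameter, and exact folding behavior are assumptions for illustration, not the actual Iceberg API):

```java
public class OutputFilePlanner {
    /**
     * Hypothetical sketch of a SizeBasedFileRewritePlanner-style decision:
     * split the total input size into targetSize-sized outputs, then decide
     * whether the remainder deserves its own file.
     */
    public static long numOutputFiles(long inputSize, long targetSize, double minFileSizeRatio) {
        long minFileSize = (long) (targetSize * minFileSizeRatio); // default ratio 0.75
        long fullFiles = inputSize / targetSize;
        long remainder = inputSize % targetSize;
        if (remainder == 0) {
            return Math.max(fullFiles, 1);
        }
        // If the remainder is large enough to be a reasonable file on its own,
        // write it as an extra output file; otherwise fold it into the others
        // rather than emitting a tiny trailing file.
        return remainder > minFileSize ? fullFiles + 1 : Math.max(fullFiles, 1);
    }
}
```

The point of the `minFileSize` threshold is exactly what this issue is about: a small remainder should never become its own undersized file, or compaction will keep finding it again.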
**Root Cause of the Problem**

In `IcebergRewriteExecutor.targetSize()`, when the total size of the input files is greater than or equal to `targetSize`, it returns `targetSize` (instead of `Long.MAX_VALUE`). This causes:...
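A minimal sketch of the corrected sizing decision as I read it (the method signature, the `mergeToSingleFile` flag, and the surrounding class are assumptions for illustration, not the actual executor code):

```java
public class RewriteTargetSize {
    /**
     * Hypothetical sketch of the fix: when the planner has already decided
     * that the selected input files should be merged into a single output
     * file, the writer's roll-over size must be unbounded (Long.MAX_VALUE).
     * Returning targetSize in that case makes the writer roll at the target
     * boundary and emit a tiny trailing file, which the next compaction run
     * picks up again, producing an endless merge loop.
     */
    public static long targetSize(long targetSize, boolean mergeToSingleFile) {
        if (mergeToSingleFile) {
            return Long.MAX_VALUE; // never roll; write exactly one output file
        }
        return targetSize;
    }
}
```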
> I have some questions: if the issue occurred with the segment files, why is the input file size less than 1 MB?
> Also, if the segment doesn't...
> [@wardlican](https://github.com/wardlican) I couldn't reproduce this scenario. I think the main issue is related to the data; the inaccurate calculation of the writer's rolling (roll-over) size is causing this phenomenon. >...
> > [@wardlican](https://github.com/wardlican) I couldn't reproduce this scenario. I think the main issue is related to the data; the inaccurate calculation of the writer's rolling (roll-over) size is causing this phenomenon....
This is a rather serious problem: it can trap the table in an endless compaction loop and cause continuous growth of its metadata.
Please check the changes here.