
[Improvement]: Optimize the commit process of commit snapshots for UnKeyedTable optimizing

Open hameizi opened this issue 2 years ago • 2 comments

Search before asking

  • [X] I have searched in the issues and found no similar issues.

What would you like to be improved?

Before Iceberg 1.3.x, the RewriteFiles interface had no methods for specifying a sequence number while also adding and removing delete files. As a result, UnKeyedTableCommit had to complete a table rewrite through a transaction plus an additional snapshot for removing delete files. Iceberg 1.3.x provides a more flexible interface, so the rewrite can be completed in a single commit using the following approach:

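// Rewrite data files and delete files in a single commit (Iceberg 1.3.x or later).
RewriteFiles dataFileRewrite = icebergTable.newRewrite();
// Data files: add the rewritten files (collection name assumed) and remove the replaced ones.
addDataFiles.forEach(dataFileRewrite::addFile);
removedDataFiles.forEach(dataFileRewrite::deleteFile);
// Delete files: add the new ones and remove the obsolete ones.
addDeleteFiles.forEach(dataFileRewrite::addFile);
removedDeleteFiles.forEach(dataFileRewrite::deleteFile);
// Data sequence number to use for the data files added by this rewrite.
dataFileRewrite.dataSequenceNumber(sequenceNumber);
dataFileRewrite.commit();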
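
To illustrate why the sequence number matters for the rewrite, here is a minimal stand-alone sketch. It does not use any Iceberg classes: `SequenceNumberSketch` and `equalityDeleteApplies` are hypothetical names, and the check is a simplified form of the Iceberg spec rule that an equality delete applies only to data files with a strictly smaller data sequence number (partition matching is ignored here).

```java
class SequenceNumberSketch {

    // Simplified rule from the Iceberg table spec: an equality delete applies
    // to a data file only if the data file's data sequence number is strictly
    // smaller than the delete file's. Partition checks are omitted.
    static boolean equalityDeleteApplies(long dataFileSeq, long deleteFileSeq) {
        return dataFileSeq < deleteFileSeq;
    }

    public static void main(String[] args) {
        long originalDataSeq = 3L;   // data file written at sequence number 3
        long concurrentDelete = 4L;  // equality delete committed while optimizing
        long rewriteCommitSeq = 5L;  // sequence number of the rewrite commit itself

        // Before the rewrite, the delete applies to the original data file.
        System.out.println("original file: "
            + equalityDeleteApplies(originalDataSeq, concurrentDelete));

        // If the rewritten file took the commit's new sequence number,
        // the concurrent delete would silently stop applying to it.
        System.out.println("rewritten, new sequence number: "
            + equalityDeleteApplies(rewriteCommitSeq, concurrentDelete));

        // Keeping the original data sequence number preserves the delete.
        System.out.println("rewritten, kept sequence number: "
            + equalityDeleteApplies(originalDataSeq, concurrentDelete));
    }
}
```

Under these assumptions, carrying over the original sequence number is what keeps deletes committed during optimizing effective against the rewritten files.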

How should we improve?

No response

Are you willing to submit PR?

  • [ ] Yes I am willing to submit a PR!

Subtasks

No response

Code of Conduct

hameizi, Aug 10 '23 12:08

cc @861752346

hameizi, Aug 10 '23 12:08

The issue has not been resolved, right?

tcodehuber, Dec 13 '23 03:12

This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in the next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.

github-actions[bot], Aug 21 '24 00:08

This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'.

github-actions[bot], Sep 04 '24 00:09