
[Improvement]: Splitting Commits in Large-Scale Self-Optimizing Processes to Reduce Writing Conflicts

Open • cxxiii opened this issue 5 months ago • 1 comment

Search before asking

  • [x] I have searched in the issues and found no similar issues.

What would you like to be improved?

If a self-optimizing process compacts a large number of files, committing the compaction results can take a long time, and a long-running commit is more likely to conflict with real-time writing tasks. In such scenarios, the commit of a large-scale self-optimizing process should be split into several smaller commits so that no single commit runs for too long.
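The idea of splitting one large commit into several smaller ones can be sketched roughly as follows. This is only an illustration under stated assumptions, not the design from the AIP document and not Amoro's API: `split_into_batches`, `commit_in_batches`, `max_per_commit`, and `commit_fn` are hypothetical names, and the sketch assumes each batch of compaction results can be committed independently of the others.

```python
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def split_into_batches(items: Sequence[T], max_per_commit: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each holding at most `max_per_commit` entries."""
    if max_per_commit <= 0:
        raise ValueError("max_per_commit must be positive")
    for start in range(0, len(items), max_per_commit):
        yield list(items[start:start + max_per_commit])


def commit_in_batches(
    results: Sequence[T],
    max_per_commit: int,
    commit_fn: Callable[[List[T]], None],
) -> int:
    """Commit `results` in several small commits instead of one large commit.

    `commit_fn` is a hypothetical callback standing in for whatever actually
    commits one batch to the table. Returns the number of commits made.
    """
    commits = 0
    for batch in split_into_batches(results, max_per_commit):
        commit_fn(batch)  # each call is one short, independent commit
        commits += 1
    return commits
```

For example, calling `commit_in_batches(results, 3, commit_fn)` with seven results would invoke `commit_fn` three times, with batches of three, three, and one result. Smaller batches shorten each commit, at the cost of more commits overall; a real implementation would also need to decide how to group results (for instance per partition) and what to do when one batch fails after earlier ones have succeeded.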

How should we improve?

The proposed changes are detailed in the AIP document.

Are you willing to submit PR?

  • [x] Yes I am willing to submit a PR!

Subtasks

No response

Code of Conduct

cxxiii • Jul 23 '25 06:07

A partial-commit feature similar to the one in the Spark procedure could be used here. We also plan to implement this feature.

wardlican • Aug 22 '25 07:08