[Feature] pypaimon supports overwrite writes.

Open klboke opened this issue 4 months ago • 1 comment

Search before asking

  • [x] I searched in the issues and found nothing similar.

Motivation

In offline model inference and table-writing scenarios, it is common to run multiple offline evaluation experiments against the same table. Overwrite writes need to be supported so that only the results of the most recent run are retained. Currently the code does not support this; the overwrite method simply raises an error. For example:

    def overwrite(self, partition, commit_messages: List[CommitMessage], commit_identifier: int):
        """Commit the given commit messages in overwrite mode."""
        raise RuntimeError("overwrite unsupported yet")

Solution

No response

Anything else?

No response

Are you willing to submit a PR?

  • [ ] I'm willing to submit a PR!

klboke • Sep 01 '25 00:09

There's an issue with the functionality. For example, when I write 4 records in overwrite mode, I expect the physical row count to be 4, but the actual physical row count equals the sum of the rows from all of my previous submissions, e.g.:

[Two images omitted]
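To make the expected behavior concrete, here is a minimal sketch of the semantics I have in mind. It is a pure-Python toy model and does not use pypaimon at all: `ToyTable`, `run_experiment`, and the partition value `dt=2025-09-10` are hypothetical names made up for illustration. It only assumes that an overwrite should replace the existing rows of every partition it touches, while an append keeps accumulating rows.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical row shape for this sketch: (partition key, value).
Row = Tuple[str, int]


class ToyTable:
    """In-memory stand-in for a partitioned table. Not pypaimon code."""

    def __init__(self) -> None:
        self._partitions: Dict[str, List[Row]] = defaultdict(list)

    def append(self, rows: List[Row]) -> None:
        # Append mode: new rows are added on top of whatever is already there.
        for row in rows:
            self._partitions[row[0]].append(row)

    def overwrite(self, rows: List[Row]) -> None:
        # Overwrite mode (assumed semantics): clear every partition touched
        # by this write, then write the new rows.
        for partition in {row[0] for row in rows}:
            self._partitions[partition] = []
        self.append(rows)

    def row_count(self) -> int:
        return sum(len(rows) for rows in self._partitions.values())


def run_experiment(overwrite: bool, runs: int = 3) -> int:
    """Write the same 4-row batch `runs` times and return the final row count."""
    table = ToyTable()
    batch: List[Row] = [("dt=2025-09-10", i) for i in range(4)]
    for _ in range(runs):
        if overwrite:
            table.overwrite(batch)
        else:
            table.append(batch)
    return table.row_count()


if __name__ == "__main__":
    print("overwrite:", run_experiment(overwrite=True))   # 4 rows expected
    print("append:   ", run_experiment(overwrite=False))  # 12 rows (3 runs x 4)
```

With three runs of a 4-row batch, the overwrite path ends at 4 rows, which is what I expect. The append path ends at 12, which matches the kind of accumulated count I am actually seeing.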

klboke avatar Sep 10 '25 01:09 klboke