paimon icon indicating copy to clipboard operation
paimon copied to clipboard

[Bug] After the flink job written to Paimon modifies `sink.parallelism`, the job will not be able to recover from the checkpoint

Open huyuanfeng2018 opened this issue 10 months ago • 3 comments

Search before asking

  • [X] I searched in the issues and found nothing similar.

Paimon version

master

Compute Engine

flink

Minimal reproduce step

  1. First set the parallelism degree of 1 to write Paimon

  2. Stop passing the current task

  3. Modify the parallelism to be greater than 1

  4. Restore from the last checkpoint

What doesn't meet your expectations?

The job should be able to resume normally from the last checkpoint or savepoint, even if I change the parallelism of the sink

Anything else?

No response

Are you willing to submit a PR?

  • [ ] I'm willing to submit a PR!

huyuanfeng2018 avatar Apr 18 '24 07:04 huyuanfeng2018

        SingleOutputStreamOperator<?> committed =
                written.rebalance().transform(
                                GLOBAL_COMMITTER_NAME,
                                new MultiTableCommittableTypeInfo(),
                                new CommitterOperator<>(
                                        streamingCheckpointEnabled,
                                        commitUser,
                                        createCommitterFactory(),
                                        createCommittableStateManager()))
                        .setParallelism(1)
                        .setMaxParallelism(1);

This can be avoided by adding rebalance before the commit operator, because the parallelism of commit is always 1. When the write parallelism is 1, flink will chain the two operators. When the parallelism of writing increases, will be split

huyuanfeng2018 avatar Apr 18 '24 07:04 huyuanfeng2018

@huyuanfeng2018 this is a good issue, but a chain committer can reduce resource cost.

maybe we can have an option to control this.

JingsongLi avatar Apr 30 '24 11:04 JingsongLi

@huyuanfeng2018 this is a good issue, but a chain committer can reduce resource cost.

maybe we can have an option to control this.

+1. Agree

huyuanfeng2018 avatar May 06 '24 02:05 huyuanfeng2018