[Bug] lastOptimizedCompaction in UniversalCompaction doesn't work as expected
Search before asking
- [x] I searched in the issues and found nothing similar.
Paimon version
master
Compute Engine
Flink 1.18
Minimal reproduce step
- Create a Paimon table with an optimization interval and enable debug logging:
CREATE TABLE IF NOT EXISTS table_a (
task_id varchar,
`value` bigint,
metric_name varchar,
PRIMARY KEY (`task_id`) NOT ENFORCED
) WITH (
'merge-engine' = 'partial-update',
'changelog-producer' = 'lookup',
'bucket' = '2',
'compaction.optimization-interval' = '24 h'
);
insert into table_a
select task_id,`value`,metric_name from kafka_table_b;
- Produce records to Kafka in the following pattern:
  - Produce records when the first checkpoint is triggered.
  - Do not produce any records for the next several checkpoints.
  - Produce records to Kafka again.
- Logs like the following then appear:
DEBUG org.apache.paimon.operation.AbstractFileStoreWrite [] - Closing writer for partition org.apache.paimon.data.BinaryRow@9c67b85d, bucket 32. Writer's last modified identifier is 3300, while current commit identifier is 3302.
DEBUG org.apache.paimon.operation.AbstractFileStoreWrite [] - Creating writer for partition org.apache.paimon.data.BinaryRow@9c67b85d, bucket 32
DEBUG org.apache.paimon.mergetree.compact.UniversalCompaction [] - Universal compaction due to optimized compaction interval
What doesn't meet your expectations?
Whenever a MergeTreeWriter is recreated, the new UniversalCompaction instance starts with lastOptimizedCompaction set to null, so a full compaction is triggered again even if one already ran within the configured interval.
I think lastOptimizedCompaction needs to be persisted or derived from the file list, instead of being reset to null every time the UniversalCompaction instance is recreated.
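To make the behavior concrete, here is a minimal sketch of how I understand the interval check. It is a simplified model, not Paimon's actual code: the class `IntervalTrigger`, its constructor, and the `pick` method are hypothetical names chosen for illustration, and I am assuming the real check treats a null `lastOptimizedCompaction` as "compaction is due".

```java
import java.time.Duration;

/** Hypothetical, simplified model of an interval-based full-compaction trigger. */
final class IntervalTrigger {
    private final long intervalMillis;
    // null means this instance has no record of a previous optimized compaction.
    private Long lastOptimizedCompaction;

    IntervalTrigger(Duration interval, Long restoredLastCompaction) {
        this.intervalMillis = interval.toMillis();
        this.lastOptimizedCompaction = restoredLastCompaction;
    }

    /** Returns true (and records the time) when an optimized compaction is due. */
    boolean pick(long nowMillis) {
        if (lastOptimizedCompaction == null
                || nowMillis - lastOptimizedCompaction > intervalMillis) {
            lastOptimizedCompaction = nowMillis;
            return true;
        }
        return false;
    }

    Long lastOptimizedCompaction() {
        return lastOptimizedCompaction;
    }

    public static void main(String[] args) {
        Duration interval = Duration.ofHours(24);
        long t0 = 0L;
        long t1 = Duration.ofMinutes(10).toMillis();

        // First writer: no history, so the first pick compacts.
        IntervalTrigger writer = new IntervalTrigger(interval, null);
        System.out.println(writer.pick(t0)); // true

        // Writer closed while idle, then recreated without state: compacts
        // again only 10 minutes later, although the interval is 24 h.
        IntervalTrigger recreated = new IntervalTrigger(interval, null);
        System.out.println(recreated.pick(t1)); // true

        // Recreated with the previous timestamp restored: no extra compaction.
        IntervalTrigger restored =
                new IntervalTrigger(interval, writer.lastOptimizedCompaction());
        System.out.println(restored.pick(t1)); // false
    }
}
```

The third case is what I would expect: if the timestamp were restored from somewhere (persisted state or the file list), the recreated writer would wait for the full interval before compacting again.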
Anything else?
No response
Are you willing to submit a PR?
- [x] I'm willing to submit a PR!