[flink] Small changelog files can now be compacted into big files
Purpose
Currently, changelog files are not compacted. If Flink's checkpoint interval is short (for example, 30 seconds) and the number of buckets is large, each snapshot may produce many small changelog files. Too many small files can put a heavy burden on the distributed storage cluster.
This PR introduces a new feature to compact small changelog files into large ones.
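To make the idea concrete, here is a minimal, self-contained sketch of merging several small files into one large file while keeping an (offset, length) index so that each original file can still be read back. It is only an illustration under assumptions: the class `ChangelogCompactionSketch`, its `Slice` and `Compacted` types, and the in-memory layout are all hypothetical and are not the file format or code introduced by this PR.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: this is NOT Paimon's actual compacted changelog format.
final class ChangelogCompactionSketch {

    /** Location of one original small file inside the compacted payload (hypothetical). */
    static final class Slice {
        final String originalName;
        final int offset;
        final int length;

        Slice(String originalName, int offset, int length) {
            this.originalName = originalName;
            this.offset = offset;
            this.length = length;
        }
    }

    /** One large payload plus the index needed to find each original file. */
    static final class Compacted {
        final byte[] data;
        final List<Slice> index;

        Compacted(byte[] data, List<Slice> index) {
            this.data = data;
            this.index = index;
        }
    }

    /** Concatenates the small files in order and records where each one starts. */
    static Compacted compact(List<String> names, List<byte[]> contents) {
        if (names.size() != contents.size()) {
            throw new IllegalArgumentException("names and contents must have the same size");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        List<Slice> index = new ArrayList<>();
        for (int i = 0; i < names.size(); i++) {
            byte[] bytes = contents.get(i);
            // out.size() is the number of bytes written so far, i.e. this file's offset.
            index.add(new Slice(names.get(i), out.size(), bytes.length));
            out.write(bytes, 0, bytes.length);
        }
        return new Compacted(out.toByteArray(), index);
    }

    /** Reads one original file back out of the compacted payload. */
    static byte[] read(Compacted compacted, String originalName) {
        for (Slice s : compacted.index) {
            if (s.originalName.equals(originalName)) {
                return Arrays.copyOfRange(compacted.data, s.offset, s.offset + s.length);
            }
        }
        throw new IllegalArgumentException("unknown file: " + originalName);
    }
}
```

In this sketch the storage system sees one object instead of many, while readers can still locate an individual changelog through the index. How the real implementation groups files and stores this information is defined by the PR's code, not by this example.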
Tests
Covered by IT cases.
API and Format
Introduces a special file format for compacted changelogs.
Documentation
Documentation is also added.