
[flink] Small changelog files can now be compacted into big files

Open tsreaper opened this issue 5 months ago • 2 comments

Purpose

Currently, changelog files are not compacted. If Flink's checkpoint interval is short (for example, 30 seconds) and the number of buckets is large, each snapshot may produce many small changelog files. A large number of small files puts a burden on the distributed storage cluster.
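To give a rough sense of scale, here is a back-of-the-envelope estimate. It assumes every bucket writes exactly one changelog file per checkpoint, which is a simplification for illustration only. The bucket count of 128 and the helper name `changelog_files_per_day` are hypothetical and do not come from this PR.

```python
def changelog_files_per_day(checkpoint_interval_s: int,
                            buckets: int,
                            files_per_bucket: int = 1) -> int:
    """Estimate how many changelog files accumulate in one day.

    Assumption (not from the PR): each bucket emits `files_per_bucket`
    changelog files at every checkpoint.
    """
    checkpoints_per_day = 24 * 60 * 60 // checkpoint_interval_s
    return checkpoints_per_day * buckets * files_per_bucket


# 30-second checkpoints and a hypothetical 128 buckets
print(changelog_files_per_day(30, 128))  # 368640
```

Under these assumptions a single table would add several hundred thousand small files per day, which is the kind of load on the storage cluster described above.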

This PR adds a feature that compacts small changelog files into larger ones.

Tests

IT cases.

API and Format

Introduces a special file format for compacted changelogs.
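This description does not spell out the format itself. Purely as an illustration of the general idea, the sketch below concatenates several small per-bucket payloads into one blob and keeps an index of (bucket, offset, length) so each original payload can still be read back. The `Segment`, `compact`, and `read_bucket` names and the layout are hypothetical and should not be taken as the format this PR implements.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Segment:
    """Location of one original small file inside the compacted blob."""
    bucket: int
    offset: int
    length: int


def compact(small_files: Dict[int, bytes]) -> Tuple[bytes, List[Segment]]:
    """Concatenate per-bucket payloads and record where each one lands."""
    out = bytearray()
    index: List[Segment] = []
    for bucket in sorted(small_files):
        data = small_files[bucket]
        index.append(Segment(bucket, len(out), len(data)))
        out += data
    return bytes(out), index


def read_bucket(compacted: bytes, index: List[Segment], bucket: int) -> bytes:
    """Recover the payload of a single bucket from the compacted blob."""
    for seg in index:
        if seg.bucket == bucket:
            return compacted[seg.offset:seg.offset + seg.length]
    raise KeyError(bucket)


blob, idx = compact({1: b"bb", 0: b"a"})
print(blob)                       # b'abb'
print(read_bucket(blob, idx, 1))  # b'bb'
```

The point of the sketch is only that readers can still address the changes of one bucket after many small files are merged into one. How the actual format stores that mapping is defined by the PR's code and documentation.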

Documentation

Documentation is also added.

tsreaper • Sep 25 '24 07:09