[core] Introduce level0FileCount for partitions table
Purpose
Linked issue: close #xxx
Tests
PartitionsTableTest#testLevel0FileCountValue
API and Format
Documentation
Thanks @MonsterChenzhuo for the contribution.
But what is the use case for level0file?
In scenarios where a table has deletion vectors (DelVector) enabled, users can quickly determine whether data has been written and whether compaction has completed by checking the level-0 file count, especially when no data is found for the current partition.
With $files, however, the results are not intuitive: users usually have to aggregate the rows themselves to interpret them.
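For reference, the kind of aggregation this refers to might look like the following. This is only a sketch: it assumes the $files system table exposes `partition` and `level` columns, reuses the table name T from this thread, and the output column name level0_file_count is made up for illustration. Identifier quoting may need adjusting for your engine.

```sql
-- Sketch: count level-0 files per partition from the $files system table.
-- SUM(CASE ...) is used instead of a WHERE filter so that partitions with
-- zero level-0 files still appear in the result with a count of 0.
SELECT `partition`,
       SUM(CASE WHEN level = 0 THEN 1 ELSE 0 END) AS level0_file_count
FROM T$files
GROUP BY `partition`;
```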
But doesn't this depend on each bucket? We would need to know maxLevel0FilesInBucket and avgLevel0FilesInBucket, so maybe it is better to just leave them in metrics.
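To make the per-bucket concern concrete, a rough equivalent of those two values could be derived from $files as sketched below. It assumes $files has `partition`, `bucket` and `level` columns; the output column names only mirror the metric names mentioned above and are not an existing API.

```sql
-- Sketch: per-partition max and average number of level-0 files per bucket.
SELECT `partition`,
       MAX(level0_files) AS max_level0_files_in_bucket,
       -- Cast before averaging so integer AVG does not truncate.
       AVG(CAST(level0_files AS DOUBLE)) AS avg_level0_files_in_bucket
FROM (
  -- Inner query: number of level-0 files in each (partition, bucket).
  SELECT `partition`,
         bucket,
         SUM(CASE WHEN level = 0 THEN 1 ELSE 0 END) AS level0_files
  FROM T$files
  GROUP BY `partition`, bucket
) AS per_bucket
GROUP BY `partition`;
```

A single per-partition total hides skew: a partition with 10 level-0 files all in one bucket looks the same as one with 10 files spread over 10 buckets.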
For real-time writes to a Paimon table, we use real-time compaction and collect metrics to monitor maxLevel0FilesInBucket and avgLevel0FilesInBucket. However, for scenarios with infrequent updates (such as T+1) that need high throughput and low resource consumption, we use offline compaction, and monitoring the number of L0 files through metrics is less convenient there than using system tables.
The operational path is as follows:
Check the system table to see whether any L0 data remains in the partition:
SELECT * FROM default.T$partitions;
If any remains, use the SQL stored procedure to run compaction:
CALL sys.compact(table => 'default.T');
This seems like a rather specific use case; let's wait for future requirements.