[Feature] Support periodically refreshing existing dataFileMetas in CompactManager
Search before asking
- [x] I searched in the issues and found nothing similar.
Motivation
It is common for users to run a streaming job and a batch job that update different columns of the same Paimon table. To avoid conflicts, the user can configure either the batch job or the streaming job as write-only.
Either way, the job that both writes and compacts never sees the new files generated by the write-only job. This leads to insufficient compaction in both MergeTreeCompactManager and BucketedAppendCompactManager.
This could be addressed by supporting a periodic refresh of dataFileMetas in CompactManager.
With this, there is no need to introduce an extra dedicated compaction job.
Solution
Use the same approach as refreshFiles in LocalTableQuery.
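To make the idea concrete, here is a minimal sketch of a time-gated refresh. It is not Paimon's actual API: the class `RefreshingFileView`, its methods, and the use of plain file names in place of DataFileMeta are all assumptions made for illustration. The `latestFiles` supplier stands in for whatever scan of the latest snapshot the real implementation would use.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical sketch only; names and types are NOT taken from Paimon. */
final class RefreshingFileView {

    // Assumption: returns the files of this bucket as seen in the latest snapshot.
    private final Supplier<List<String>> latestFiles;
    // Injected clock so the refresh interval can be tested deterministically.
    private final LongSupplier clock;
    private final long refreshIntervalMillis;

    // Files the compaction side currently knows about (stand-in for DataFileMeta).
    private final Set<String> knownFiles = new LinkedHashSet<>();
    private long lastRefreshMillis;

    RefreshingFileView(
            Supplier<List<String>> latestFiles, LongSupplier clock, long refreshIntervalMillis) {
        this.latestFiles = latestFiles;
        this.clock = clock;
        this.refreshIntervalMillis = refreshIntervalMillis;
        this.lastRefreshMillis = clock.getAsLong();
    }

    /** Registers a file produced by this job's own writer. */
    void addLocalFile(String fileName) {
        knownFiles.add(fileName);
    }

    /**
     * Intended to be called before picking compaction candidates.
     * Returns the number of files discovered that this job did not write itself.
     */
    int maybeRefresh() {
        long now = clock.getAsLong();
        if (now - lastRefreshMillis < refreshIntervalMillis) {
            return 0; // interval not elapsed yet, skip the scan
        }
        lastRefreshMillis = now;
        int added = 0;
        for (String file : latestFiles.get()) {
            if (knownFiles.add(file)) {
                added++;
            }
        }
        return added;
    }

    /** Read-only view of the files currently visible to compaction. */
    Set<String> files() {
        return Collections.unmodifiableSet(knownFiles);
    }
}
```

The sketch only adds files, since the other job in this scenario is write-only and never removes any. A real implementation would also need to handle files that have been compacted away and to respect level or sequence ordering.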
Anything else?
No response
Are you willing to submit a PR?
- [x] I'm willing to submit a PR!
Hi, can I try it?
@xiedeyantu thx for volunteering, I'm already working on this.
OK, thanks for the reply. Are there any easy issues I could try?
Maybe you can try this. https://github.com/apache/paimon/issues/4244