lakeFS
KV Graveler: Staging Area compaction
The Graveler KV implementation introduces a branch staging area composed of a staging token and a list of sealed tokens. The sealed tokens list can grow quite large, which slows down reads from the staging area. To prevent this performance degradation, we suggest performing compaction as a background operation.
Mechanism
- Define a threshold for the size of a branch's sealed tokens list
- The compaction mechanism is triggered either explicitly as a post-operation step that runs in the background, or by a background service
- The compaction mechanism will take the branch staging area and build it into a single token, which will replace the current staging area
- Compaction of the branch staging area will be an atomic operation that either succeeds completely or fails without side effects
- The operation can be retried at any given time