Treat docs with recovery_source as deletes in merges

Open dnhatn opened this issue 9 months ago • 0 comments

Currently, we do not account for the number of documents whose 'recovery_source' is ready to be dropped when selecting merge specifications. As a result, a single large segment containing such documents is considered fully merged, even though Elasticsearch should trigger a merge to remove them.

With this PR, we're adjusting the merge policy to treat documents with 'recovery_source' ready to drop as deletions when determining merge specifications. Essentially, documents with 'recovery_source' are now treated as soft-deleted by the Elasticsearch merge policy.
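The idea can be illustrated with a minimal, self-contained sketch. It does not use the real Lucene or Elasticsearch merge policy classes: `SegmentStats`, `effectiveDeletes`, `reclaimableRatio`, `selectForMerge`, and the threshold parameter are all hypothetical names chosen for illustration. The sketch also assumes, as a simplification, that soft-deleted documents and documents with a droppable 'recovery_source' can simply be summed, capped at the segment size.

```java
import java.util.ArrayList;
import java.util.List;

final class RecoverySourceMergeSketch {

    /** Hypothetical per-segment counters; not a Lucene or Elasticsearch type. */
    static final class SegmentStats {
        final String name;
        final int maxDoc;                      // total documents in the segment
        final int softDeletedDocs;             // documents already soft-deleted
        final int droppableRecoverySourceDocs; // documents whose recovery_source can be dropped

        SegmentStats(String name, int maxDoc, int softDeletedDocs, int droppableRecoverySourceDocs) {
            this.name = name;
            this.maxDoc = maxDoc;
            this.softDeletedDocs = softDeletedDocs;
            this.droppableRecoverySourceDocs = droppableRecoverySourceDocs;
        }

        /** Count droppable recovery_source documents as if they were deletes. */
        int effectiveDeletes() {
            // Assumption: the two counters are summed and capped at maxDoc,
            // since a document may fall into both groups.
            return Math.min(maxDoc, softDeletedDocs + droppableRecoverySourceDocs);
        }

        /** Fraction of the segment that a merge could reclaim. */
        double reclaimableRatio() {
            return maxDoc == 0 ? 0.0 : (double) effectiveDeletes() / maxDoc;
        }
    }

    /** Return the names of segments whose reclaimable ratio reaches the (hypothetical) threshold. */
    static List<String> selectForMerge(List<SegmentStats> segments, double minReclaimableRatio) {
        List<String> selected = new ArrayList<>();
        for (SegmentStats segment : segments) {
            if (segment.reclaimableRatio() >= minReclaimableRatio) {
                selected.add(segment.name);
            }
        }
        return selected;
    }
}
```

In this sketch, a large segment with no soft-deletes but many droppable 'recovery_source' documents is no longer seen as fully merged: its effective delete count is non-zero, so it can become a merge candidate.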

We will need a follow-up to trigger merges when the retention leases advance far enough to drop soft-deleted documents and 'recovery_source'.

dnhatn · Apr 27 '24 03:04