elasticsearch
Treat docs with recovery_source as deletes in merges
Currently, we do not account for the number of documents whose 'recovery_source' is ready to drop when selecting merge specifications. As a result, a single large segment containing such documents is considered fully merged, even though Elasticsearch should trigger a merge to remove the 'recovery_source' from them.
With this PR, we're adjusting the merge policy to treat documents whose 'recovery_source' is ready to drop as deletions when determining merge specifications. Essentially, those documents are now counted as soft-deleted by the Elasticsearch merge policy.
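
To illustrate the accounting change, here is a minimal, self-contained sketch. It is not the actual Elasticsearch or Lucene code: the `Doc` class, the `numDeletesToMerge` helper, and the `countRecoverySource` flag are hypothetical names chosen for this example, and the sketch assumes each document simply carries a flag saying whether its 'recovery_source' is ready to drop.

```java
import java.util.List;

public class RecoverySourceDeletesSketch {

    // Hypothetical, simplified view of one document in a segment.
    static final class Doc {
        final boolean softDeleted;
        final boolean recoverySourceReadyToDrop;

        Doc(boolean softDeleted, boolean recoverySourceReadyToDrop) {
            this.softDeleted = softDeleted;
            this.recoverySourceReadyToDrop = recoverySourceReadyToDrop;
        }
    }

    // Number of documents the merge policy would count as deletes.
    // With countRecoverySource == false only soft-deleted docs count (old behavior);
    // with true, docs whose recovery_source is ready to drop count as well.
    // A doc matching both conditions is counted once.
    static int numDeletesToMerge(List<Doc> segment, boolean countRecoverySource) {
        int deletes = 0;
        for (Doc doc : segment) {
            if (doc.softDeleted || (countRecoverySource && doc.recoverySourceReadyToDrop)) {
                deletes++;
            }
        }
        return deletes;
    }

    public static void main(String[] args) {
        List<Doc> segment = List.of(
            new Doc(false, true),   // live doc, recovery_source ready to drop
            new Doc(false, true),   // live doc, recovery_source ready to drop
            new Doc(true, true),    // soft-deleted doc
            new Doc(false, false)   // live doc, nothing to drop
        );
        System.out.println(numDeletesToMerge(segment, false)); // prints 1
        System.out.println(numDeletesToMerge(segment, true));  // prints 3
    }
}
```

In this toy segment the old accounting sees one delete out of four documents, while the new accounting sees three, which is the kind of difference that lets a merge policy select an otherwise "fully merged" segment for merging.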
We will need a follow-up to trigger merges when the retention leases advance enough to drop soft-deletes and recovery_source.