
[Transform] transform can fail without retry when source index gets closed by ILM unfollow

Open valeriy42 opened this issue 2 years ago • 1 comments

Affected version: 7.12-, 8.0-
Fixed in:

Problem Description

A continuous transform can fail if it runs a search at the same moment that an ILM action, or anything else, closes a source index. This should not happen if the transform is configured with a wildcard pattern for the source index, e.g. filebeat-*.

In one specific case, an ILM policy temporarily closes the index as part of the unfollow-the-follower action.

A typical log entry for this bug contains something like this:

task encountered irrecoverable failure: org.elasticsearch.cluster.block.ClusterBlockException: index [ ... ] blocked by: [FORBIDDEN/4/index closed];

Mitigation

A: Restart the transform. This can be automated by monitoring _stats and restarting the transform using _stop?force=true followed by _start.

B: Don't use the ILM unfollow action until a fix is available.

C: Unreleased: Starting with 8.5, set the transform to unattended mode via its settings. This lets the transform retry even for this failure class.
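
Mitigations A and C could be scripted roughly as follows. This is a minimal sketch, not an official tool: it assumes an unsecured cluster at http://localhost:9200 (the `ES_URL` value is an assumption), uses only the Python standard library, and the helper names `es_request`, `restart_if_failed` and `enable_unattended` are hypothetical. It relies on the transform `_stats`, `_stop?force=true`, `_start` and `_update` endpoints, and on `_stats` reporting a `state` of `failed` for a failed transform.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: local cluster without authentication


def es_request(method, path, body=None):
    """Send one JSON request to Elasticsearch and return the decoded response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(
        ES_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def restart_if_failed(transform_id, request=es_request):
    """Mitigation A: force-stop, then start, a transform that _stats reports as failed."""
    stats = request("GET", f"/_transform/{transform_id}/_stats")
    for transform in stats.get("transforms", []):
        if transform.get("id") == transform_id and transform.get("state") == "failed":
            request("POST", f"/_transform/{transform_id}/_stop?force=true")
            request("POST", f"/_transform/{transform_id}/_start")
            return True
    return False  # transform missing or not in the failed state: nothing done


def enable_unattended(transform_id, request=es_request):
    """Mitigation C (8.5 and later): switch the transform to unattended mode."""
    body = {"settings": {"unattended": True}}
    return request("POST", f"/_transform/{transform_id}/_update", body)
```

Calling `restart_if_failed("my-transform")` periodically, for example from a cron job, would approximate the monitoring described in A; the transform id here is a placeholder. The `request` parameter exists only so the HTTP layer can be swapped out, e.g. for a client that adds authentication.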

Solution

If the source of a transform is configured with a wildcard, the transform should not treat an index closed exception as an irrecoverable failure.

Backport

The fix needs to be backported to the 7.17 series.

valeriy42, Sep 05 '22 12:09

Pinging @elastic/ml-core (Team:ML)

elasticsearchmachine, Sep 05 '22 12:09