Default retries for ISM Actions

Open dbbaughe opened this issue 3 years ago • 0 comments

We have noticed that users are sometimes not fully aware that there are options like retries for when an action fails, and error notifications to be told about the failure. Usually that is not the worst thing in the world, but for actions like rollover it matters: if the action enters a failed state and neither option is configured, an index can very quickly grow too large and cause cluster instability.

One potential solution, at least for transient errors, is to have ISM apply a default retry configuration instead of none; users would then have to explicitly set the retry count to 0 if they do not want an action retried. This would protect against transient failures for actions where not executing could have consequences.
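To make the default-retry idea concrete, here is a minimal Python sketch of the behavior, not the plugin's implementation. It operates on the policy body as a plain dict and uses the `retry` fields ISM actions accept (`count`, `backoff`, `delay`). The helper name `apply_default_retries` and the default values are assumptions for illustration; the issue does not propose specific numbers.

```python
import copy

# Hypothetical defaults for illustration; the issue does not specify values.
DEFAULT_RETRY = {"count": 3, "backoff": "exponential", "delay": "1m"}


def apply_default_retries(policy):
    """Return a copy of the policy where every action without a "retry"
    block receives DEFAULT_RETRY. Actions that already carry a "retry"
    block (including an explicit count of 0) are left untouched."""
    result = copy.deepcopy(policy)
    for state in result.get("states", []):
        for action in state.get("actions", []):
            # setdefault only fills in the block when the user gave none
            action.setdefault("retry", dict(DEFAULT_RETRY))
    return result


policy = {
    "description": "example policy",
    "default_state": "hot",
    "states": [
        {
            "name": "hot",
            "actions": [
                # No retry block: would pick up the default
                {"rollover": {"min_size": "50gb"}},
                # Explicit opt-out: the user does not want retries
                {"retry": {"count": 0}, "read_only": {}},
            ],
            "transitions": [],
        }
    ],
}

with_defaults = apply_default_retries(policy)
```

With this approach an omitted `retry` block means "use the default", while `"count": 0` remains an explicit opt-out, which matches the behavior described above.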

Another solution (or both can be done) is simply a better UI that shows what is allowed. Right now the JSON editor is not very user friendly, so it is probably hard for people to realize they can set these extra options. If we had an updated UI with explicit areas for configuring retries and error notifications, help text explaining why they should be used, and a warning when they are not configured for actions like rollover, that could also solve the issue. For example, while configuring a rollover action in the UI, we could show a yellow warning box that says something like "If retries are not configured for rollover, it's possible for it to fail and for the index to grow too large", along with a warning about the lack of error notifications.
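The check behind such a warning box could look roughly like the following Python sketch. It is only an illustration of the logic, assuming the policy body is available as a dict with ISM's `states`, `actions`, `retry`, and `error_notification` keys. The function name `policy_warnings`, the `RISKY_ACTIONS` set, and the message wording are hypothetical; the issue names only rollover as an example.

```python
# Hypothetical set of actions where a silent failure is risky;
# the issue mentions rollover, other entries would be a design decision.
RISKY_ACTIONS = {"rollover"}


def policy_warnings(policy):
    """Collect human-readable warnings for a policy dict.

    Flags a missing policy-level error notification and any risky
    action that has no "retry" block at all."""
    warnings = []
    if "error_notification" not in policy:
        warnings.append(
            "No error notification is configured; failures may go unnoticed."
        )
    for state in policy.get("states", []):
        for action in state.get("actions", []):
            # Names of risky actions present in this action object
            risky = action.keys() & RISKY_ACTIONS
            if risky and "retry" not in action:
                name = sorted(risky)[0]
                warnings.append(
                    f"State '{state.get('name')}': {name} has no retries "
                    "configured; if it fails, the index may grow too large."
                )
    return warnings


example = {
    "default_state": "hot",
    "states": [{"name": "hot", "actions": [{"rollover": {"min_size": "50gb"}}]}],
}
for message in policy_warnings(example):
    print(message)
```

An explicit `retry` block with a count of 0 is treated here as a deliberate choice and does not trigger the warning; whether the UI should still warn in that case is an open question.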

Will keep this open to get some other thoughts on it.

dbbaughe, Apr 02 '21 16:04