
[feature]: Exponential backoff for entity controllers

Open ian-buse opened this issue 1 year ago • 0 comments

Is your feature request related to a problem? Please describe.

If an entity controller throws an exception, the entity is not re-queued and is effectively dead until some other event triggers another reconciliation. Beyond catching the exception, the operator SDK has no built-in support for handling errors from entity controllers.

This means that all exception/error handling has to be done within the controller, and there are limitations to this approach, especially if you want an exponential backoff (which is common in Kubernetes land).

Describe the solution you would like

First, I would like to be able to configure the backoff with a few parameters, such as MaxRetries, MinDuration, and MaxDuration. I think this could be done in the OperatorSettings. Maybe people would want this configurable per-entity though? I don't know.
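To make the parameters concrete, here is a rough sketch of how the delay could be derived from them. `BackoffSettings` and `GetDelay` are hypothetical names for illustration only; nothing like this exists in the SDK today, and the doubling factor is just an assumption.

```csharp
using System;

// Hypothetical settings type; NOT part of the operator SDK.
public sealed record BackoffSettings(int MaxRetries, TimeSpan MinDuration, TimeSpan MaxDuration)
{
    // Returns the delay before retry number `attempt` (1-based),
    // or null once MaxRetries has been exceeded.
    public TimeSpan? GetDelay(int attempt)
    {
        if (attempt < 1) throw new ArgumentOutOfRangeException(nameof(attempt));
        if (attempt > MaxRetries) return null;

        // MinDuration * 2^(attempt - 1), capped at MaxDuration.
        // The exponent is clamped so the intermediate value stays finite.
        double factor = Math.Pow(2, Math.Min(attempt - 1, 30));
        double ms = Math.Min(MinDuration.TotalMilliseconds * factor, MaxDuration.TotalMilliseconds);
        return TimeSpan.FromMilliseconds(ms);
    }
}
```

With `MinDuration` of 1 second and `MaxDuration` of 3 seconds, retries 1, 2, and 3 would wait 1, 2, and 3 seconds respectively, and a null result would mean the entity should not be re-queued again.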

As for the controllers themselves, there are a few approaches I thought might work:

  • Use a custom exception that can be caught by the SDK to signal that the resource failed to reconcile.
  • Instead of the controllers returning a Task, they could return a Task<EntityControllerResult> or something like that, in some ways similar to how the v7 operator worked.
  • Maybe the EntityRequeue<TEntity>(TEntity, TimeSpan) delegate could be changed?
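As a rough illustration of the first two ideas, the sketch below shows a custom exception and a result type that the SDK could normalize into a single outcome. Every name here (`ReconcileFailedException`, `EntityControllerResult`, `ReconcileRunner`) is hypothetical and not an existing SDK type; the real controller signature would also take the entity and a cancellation token.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical exception a controller could throw to signal
// that reconciliation failed (first approach).
public sealed class ReconcileFailedException : Exception
{
    public ReconcileFailedException(string message) : base(message) { }
}

// Hypothetical result type a controller could return (second approach).
public sealed record EntityControllerResult(bool Succeeded, TimeSpan? RequeueAfter = null)
{
    public static EntityControllerResult Success() => new(true);
    public static EntityControllerResult Failure(TimeSpan? requeueAfter = null) => new(false, requeueAfter);
}

public static class ReconcileRunner
{
    // Runs one reconciliation and maps both signalling styles to a result,
    // so the caller can decide whether to re-queue the entity with a backoff.
    public static async Task<EntityControllerResult> RunAsync(Func<Task<EntityControllerResult>> reconcile)
    {
        try
        {
            return await reconcile();
        }
        catch (ReconcileFailedException)
        {
            return EntityControllerResult.Failure();
        }
    }
}
```

Either way, the SDK would end up with one place that sees the failure and can apply the configured backoff, instead of each controller implementing its own retry logic.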

Additional Context

No response

ian-buse • Jul 02 '24 23:07