Don't panic in case of failures but rather retry and report health status

Open tillrohrmann opened this issue 1 year ago • 0 comments

Since a node can run multiple Restate components, it is no longer a good idea to panic on errors that occur in one component: a single failing component would drag all the other components down with it. Instead, we want to change the default behavior to retry the failed operation and to report the retrying as part of that component's health status. Based on this health status, the cluster controller can make global decisions about which components to shut down or migrate.
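
To make the proposal concrete, here is a minimal sketch of what "retry and report" could look like for a single component. Everything in it is hypothetical and for illustration only: `Health`, `HealthReporter`, and `run_with_retry` are not existing Restate types or functions, and the bounded attempt count and doubling backoff are assumptions that the issue does not specify.

```rust
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

/// Hypothetical health states of one component (not Restate's actual types).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Health {
    Healthy,
    /// The component is retrying a failed operation.
    Retrying { attempts: u32, last_error: String },
    /// Retries are exhausted; something like a cluster controller must decide.
    Failed { last_error: String },
}

/// Shared handle that a component writes to and an observer reads from.
#[derive(Clone)]
pub struct HealthReporter(Arc<Mutex<Health>>);

impl HealthReporter {
    pub fn new() -> Self {
        Self(Arc::new(Mutex::new(Health::Healthy)))
    }

    pub fn set(&self, health: Health) {
        *self.0.lock().unwrap() = health;
    }

    pub fn get(&self) -> Health {
        self.0.lock().unwrap().clone()
    }
}

/// Runs `op` until it succeeds or `max_attempts` is reached, reporting the
/// retry state through `health` instead of panicking.
/// Assumption: bounded attempts with a delay that doubles after each failure.
pub fn run_with_retry<T, E, F>(
    health: &HealthReporter,
    max_attempts: u32,
    initial_delay: Duration,
    mut op: F,
) -> Result<T, E>
where
    F: FnMut() -> Result<T, E>,
    E: std::fmt::Display,
{
    let mut delay = initial_delay;
    let mut attempts = 0;
    loop {
        match op() {
            Ok(value) => {
                health.set(Health::Healthy);
                return Ok(value);
            }
            Err(err) => {
                attempts += 1;
                if attempts >= max_attempts {
                    // Give up without panicking; the error goes to the caller
                    // and the failure stays visible in the health status.
                    health.set(Health::Failed { last_error: err.to_string() });
                    return Err(err);
                }
                health.set(Health::Retrying { attempts, last_error: err.to_string() });
                thread::sleep(delay);
                delay = delay.saturating_mul(2);
            }
        }
    }
}
```

In this sketch a failing component only changes its own `Health` value, so other components on the same node keep running, and whoever holds a clone of the `HealthReporter` (for example the cluster controller) can read the status and act on it.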

tillrohrmann, Mar 06 '24 10:03