orchestrator icon indicating copy to clipboard operation
orchestrator copied to clipboard

Failure from hooks are not reflected on UI

Open serhatcetinkaya opened this issue 2 years ago • 1 comments

Hello,

Recently we had a scenario where we did a graceful master takeover, all the processes were okay except the PostMasterFailoverProcesses step. The command we had there (which is the last step to update the proxy on our end after takeover) exited with non-zero code. So, on orchestrator UI we saw everything was successful and master was replaced with new one. However since the hook command failed our proxy didn't get updated, the application kept writing to old master and we had to fix it later manually, which is okay for this case on our side.

The annoying thing is we didn't see any alerts or notifications on orchestrator UI during the master takeover operation. If we had seen an error or warning there stating the PostMasterFailoverProcesses failed we would've taken the action faster.

I checked the code a little bit, when we drag a slave in front of the master and trigger a master takeover:

  • after hitting ok in the prompt, it calls an api endpoint: https://github.com/openark/orchestrator/blob/v3.1.4/resources/public/js/cluster.js#L1083
  • then it checks the operationResult.responseJSON.Code from response and if ERROR it shows an alert: https://github.com/openark/orchestrator/blob/v3.1.4/resources/public/js/orchestrator.js#L171-L173
  • if we check the http package, we see that it triggers logic.GracefulMasterTakeover() and checks any error returned from that func: https://github.com/openark/orchestrator/blob/master/go/http/api.go#L3178
  • from the logic package when I checked the function I see that just before returning it runs post processes: https://github.com/openark/orchestrator/blob/master/go/logic/topology_recovery.go#L2224
  • function signature of the executeProcesses() states that it returns error

in this flow the only problem I see is logic.GracefulMasterTakeover() calls executeProcesses() but doesn't check any errors returned. If we have this implemented in the code we would see an alert in the UI when post processes fail and the scenario above can be avoided easily.

If you think it makes sense, I will prepare a PR for this.

Orchestrator version: 3.1.4

serhatcetinkaya avatar Feb 13 '23 12:02 serhatcetinkaya

PR's won't be merged. This project is not maintained at this time. Better to form a FORK and DIY....

voarsh2 avatar Mar 07 '23 13:03 voarsh2