gracefully shut down and transfer leadership before shutting down

Open octu0 opened this issue 3 years ago • 0 comments

what do we want to fix?

  1. to safely remove a server from the cluster for maintenance or when a failure is predicted.
  2. when the server being removed from the cluster is the leader, we want to transfer leadership first and then terminate it.
  3. to shut down the server gracefully with proper signal handling.
  4. the server's IP address is assigned dynamically by DHCP (and can change on restart), so we want the server to leave the cluster when it is terminated.
  5. we want the server to rejoin the same cluster when it is restarted for some reason or comes back from maintenance.

about this modification

  • Context.Done() is now referenced so that the server stops only after all in-flight responses have been returned.
  • to integrate with supervisor systems (e.g. daemontools, Docker), the signals that shut down the server are defined in ServerShutdownSignals. by default, shutdown is performed when syscall.SIGINT or syscall.SIGTERM is received.
  • LeadershipTransferSignals is added to Config so that a leader node can hand leadership over to another node. by default, leadership is transferred when SIGUSR2 is received. this can be used to safely shut down a server for maintenance.
  • a leader node also transfers leadership when the server shuts down.
  • once the server is shutting down it no longer needs heartbeats, so it shuts down only after RemoveServer has been performed.
  • internally, because the RAFT SERVER REMOVE command is used for the leader node, the Context for signal handling and the Context for Redis management are separated.

octu0, Dec 23 '21 13:12