Handle SIGTERM on CouchDB Cluster
I am stress testing a CouchDB cluster running on Kubernetes to validate it for a high-availability environment. The issue I am seeing is that when a CouchDB node (a Kubernetes pod) is removed, requests that are in flight between its removal and its recreation fail with a series of 5xx errors. Digging a little deeper, I am trying to understand whether CouchDB has a mechanism for detecting the SIGTERM signal sent to its main process, so that live connections on the node about to be recreated can be diverted to other nodes in the cluster instead of returning 5xx errors.
Is there a mechanism for handling SIGTERM in a CouchDB cluster? If not, it would be an interesting feature: it would give us more control and help avoid dropped connections when a node is lost.
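
To make the request concrete, here is a minimal sketch of the drain-on-SIGTERM behavior I have in mind. It is plain Python for illustration only, not CouchDB code or an existing CouchDB API; `DrainState`, `try_begin`, `end`, `wait_idle`, and `install` are hypothetical names. The idea is that on SIGTERM the node stops admitting new requests, lets in-flight ones finish, and only then exits.

```python
import signal
import threading


class DrainState:
    """Hypothetical helper: tracks draining status and in-flight requests."""

    def __init__(self):
        self._cond = threading.Condition()
        self._in_flight = 0
        self.draining = False

    def on_sigterm(self, signum=None, frame=None):
        # Only flip a flag here; a signal handler should do as little as possible.
        self.draining = True

    def try_begin(self):
        """Admit a request unless the node is draining. Returns True if admitted."""
        with self._cond:
            if self.draining:
                return False  # caller would reject, so the client can retry elsewhere
            self._in_flight += 1
            return True

    def end(self):
        """Mark one admitted request as finished."""
        with self._cond:
            self._in_flight -= 1
            if self._in_flight == 0:
                self._cond.notify_all()

    def wait_idle(self, timeout):
        """Wait until no requests are in flight. Returns False if the timeout expires."""
        with self._cond:
            return self._cond.wait_for(lambda: self._in_flight == 0, timeout)


def install(state):
    # Must be called from the main thread; registers the SIGTERM handler.
    signal.signal(signal.SIGTERM, state.on_sigterm)
```

In a Kubernetes deployment, I would expect the same flag to also make the pod's readiness check fail, so that the Service stops routing new traffic to the pod while `wait_idle` runs within the pod's termination grace period. That part is an assumption about how it could be wired up, not something I have seen in CouchDB.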