
Handle SIGTERM on CouchDB Cluster

Open fsalazarh opened this issue 2 years ago • 0 comments

I am stress testing a CouchDB cluster running on Kubernetes in a high-availability setup. When a CouchDB cluster node (a Kubernetes pod) is deleted, requests that are in flight during the window between the pod's removal and its re-creation fail with 5xx errors. Digging a little deeper, I am trying to find out whether CouchDB has any mechanism for detecting the SIGTERM signal sent to its main process, so that live connections on the node being re-created can be diverted to other nodes in the cluster instead of failing with 5xx errors.

Is there a mechanism for handling SIGTERM in a CouchDB cluster? If not, it would be a useful feature: it would give us more control and help avoid dropped connections when a node goes away.
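
To make the request concrete, here is a minimal Python sketch of the drain-on-SIGTERM behavior I have in mind. It is not CouchDB code and assumes nothing about CouchDB internals; `DrainState`, its methods, and `install_sigterm_handler` are hypothetical names used only for illustration. The idea is that on SIGTERM the node stops accepting new requests (so a readiness check would fail and traffic would go to other nodes), then waits for in-flight requests to finish before exiting. Kubernetes sends SIGTERM first and SIGKILL only after the pod's termination grace period, so the wait would need to fit inside that period.

```python
import signal
import threading


class DrainState:
    """Tracks readiness and in-flight requests for a hypothetical node wrapper."""

    def __init__(self):
        self._idle = threading.Condition()
        self.ready = True    # what a readiness probe would report
        self.in_flight = 0   # requests currently being served

    def begin_request(self):
        """Return True if the request is accepted, False if the node is draining."""
        with self._idle:
            if not self.ready:
                return False
            self.in_flight += 1
            return True

    def end_request(self):
        """Mark one request as finished and wake any waiter once the node is idle."""
        with self._idle:
            self.in_flight -= 1
            if self.in_flight == 0:
                self._idle.notify_all()

    def start_draining(self, signum=None, frame=None):
        """Signal handler: stop accepting new requests.

        Deliberately lock-free, since the handler may interrupt code in the
        main thread that already holds the lock.
        """
        self.ready = False

    def wait_until_idle(self, timeout):
        """Block until no requests are in flight; return False on timeout."""
        with self._idle:
            return self._idle.wait_for(lambda: self.in_flight == 0, timeout=timeout)


def install_sigterm_handler(state):
    """Register the drain handler for SIGTERM (must be called from the main thread)."""
    signal.signal(signal.SIGTERM, state.start_draining)
```

A process using this would call `install_sigterm_handler(state)` at startup, wrap each request in `begin_request()` / `end_request()`, and call `wait_until_idle(timeout)` before exiting once the signal has arrived.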

fsalazarh · Sep 12 '22 13:09