featurebase icon indicating copy to clipboard operation
featurebase copied to clipboard

NodeLeave gossip event reported when cluster under heavy load

Open tgruben opened this issue 6 years ago • 1 comments

Expected behavior

only report node leave when either node crashes or signals a departure from the cluster

Actual behavior

when cluster performing a long running operation > hours. gossip is determining that a node has died du to lack of response to probes.

Information about your environment (OS/architecture, CPU, RAM, cluster/solo, configuration, etc.)

3 node oci cluster 930 GB ram 144 cores 72 TB nvme storage

tgruben avatar Dec 20 '18 17:12 tgruben

Possible that increasing gossip.suspicion-mult will improve this behavior. Need to test.

jaffee avatar Apr 30 '19 14:04 jaffee