
ZooKeeper may think a node is dead before the node realizes

Open pron opened this issue 13 years ago • 1 comments

If ZK decides a node is dead, that information may not reach the whole cluster at once. The server may consider the node dead and start serving its items to other nodes that have noticed the death, while the node itself still thinks it's alive and continues serving nodes that also think it's alive.

pron avatar Jul 09 '12 23:07 pron

Verifying with ZK that you're alive before serving other nodes' requests is impractical. What we should do is set the ZK connection timeout on the client (node) to be shorter than the timeout ZK uses to determine node death, so that the node itself is more conservative: it will conclude it's dead (and thus stop serving requests) before the other nodes realize it. This is OK, because eventually they all will.
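
A minimal sketch of that idea, not actual Galaxy code: `SelfFencingGuard` and all of its method names are hypothetical, and it deliberately does not touch the ZooKeeper client API. It only shows the timing rule: the node keeps a local timeout that is strictly shorter than the ZK timeout and refuses to serve once it has not heard from ZK for that long.

```java
import java.util.function.LongSupplier;

/**
 * Hypothetical helper (an assumption for illustration, not part of Galaxy or ZooKeeper).
 * The node stops serving once it has not heard from ZK for localTimeoutMillis,
 * which must be strictly shorter than the timeout ZK uses to declare the node dead.
 */
final class SelfFencingGuard {
    private final long localTimeoutMillis;
    private final LongSupplier clockMillis; // monotonic clock, injectable for tests
    private long lastContactMillis;

    SelfFencingGuard(long zkTimeoutMillis, long localTimeoutMillis, LongSupplier clockMillis) {
        if (localTimeoutMillis <= 0 || localTimeoutMillis >= zkTimeoutMillis) {
            throw new IllegalArgumentException(
                "local timeout must be positive and strictly shorter than the ZK timeout");
        }
        this.localTimeoutMillis = localTimeoutMillis;
        this.clockMillis = clockMillis;
        this.lastContactMillis = clockMillis.getAsLong();
    }

    /** Call whenever the node hears back from ZK (for example, on a successful heartbeat). */
    synchronized void onZkContact() {
        lastContactMillis = clockMillis.getAsLong();
    }

    /** True while the node may still consider itself alive and serve other nodes' requests. */
    synchronized boolean mayServe() {
        return clockMillis.getAsLong() - lastContactMillis < localTimeoutMillis;
    }
}
```

A node would construct it with something like `new SelfFencingGuard(10_000, 6_000, () -> System.nanoTime() / 1_000_000)` and check `mayServe()` before answering a request. The 10 s and 6 s values are placeholders; the gap between them has to cover the delay between ZK last hearing from the node and the node last hearing from ZK, which this sketch does not measure.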

pron avatar Jul 09 '12 23:07 pron