elasticsearch
elasticsearch copied to clipboard
TransportMasterNodeAction may use stale cluster state for routing to current master
Different nodes may have different beliefs about the identity of the current elected master, and in particular there can be times when two nodes each think the other is the master. In that situation a TransportMasterNodeAction
will route the request back and forth between the nodes repeatedly until the node with the stale information is brought up to date. The response is then routed repeatedly back and forth between the nodes to unwind the request's path. This is all rather inefficient. It would be much better to include the sender's last-applied term in every MasterNodeRequest
and use this to avoid sending requests "back in time" to the master of a no-longer-current term.
Relates #102348
Pinging @elastic/es-distributed (Team:Distributed)
I've been looking into this some more and my first attempt to reproduce it in a test didn't work, but that was interesting because it led me to understand that this routing loop can only happen if a node becomes master while its locally applied state has a different master. Normally a node will apply a state with no master node first, and this is sufficient to prevent the routing loop.
That gives me an idea for a simpler fix: we should be able to make it so that we always apply the no-master state before publishing the election-winning state. I'm exploring this idea further.
That gives me an idea for a simpler fix
Changed my mind again, I don't think this approach is so simple after all. And I don't like that TransportMasterNodeAction
would be relying on such a delicate property of the cluster coordination subsystem. I've gone back to a preference for fixing this with some explicit checks in TransportMasterNodeAction
itself.
The production code changes to fix this in TransportMasterNodeAction
look to be fairly straightforward too. I'm just struggling to get a test which reproduces the problem.