xraft icon indicating copy to clipboard operation
xraft copied to clipboard

Duplicate vote response can make illegal leader without a quorum

Open Icysandwich opened this issue 2 years ago • 7 comments

I notice that the message RequestVoteResult does not contains a field for distinguishing which node the result is from. Besides, when processing RequestVoteResult, the node only checks the count of votes. It means a node can become leader as long as he receives enough granted votes, rather than is supported by enough nodes.

Considering a scenario with 5 nodes, Node 1 becomes Candidate and request vote for Node 2. Node 2 responses two duplicate RequestVoteResult messages (due to network error or some other faults). Then Node 1 can become Leader even when he only receives one node's support!

I think it is a bug.

Icysandwich avatar Mar 23 '22 03:03 Icysandwich

Server id is not included in response because it is not necessary. When we create a connection from current node to another node, we are sure the response from that node belongs to the the node we are connecting.

If you are talking about the incorrect or duplicated messages, please consider a scenario in what circumstance the node could send duplicated messages. Normally, if you resend something, it means you are waiting for the response and retrying. There is no retrying mechanism from the follower side when sending the RequestVoteResponse, if the message cannot reach the other node, it just is being dropped or ignored. Raft is building on the unreliable and asynchronous-model network to solve the fundamental consensus problem in distributed systems so there is no stable message exchange protocol and Raft is not going to create one.

However, if you really want to maintain correctness between two nodes, you can add some check in the connection handler, which should be transparency to Raft side.

xnnyygn avatar Mar 23 '22 05:03 xnnyygn

Sorry for my misleading expression. I agree that in my scenario Node 2 cannot response twice for only one request. But for me it is still possible that Node 1 can receive duplicate responses even when Node 2 only sends it once due to OS- or network-level error. I find that the candidate keeps all channels open during leader election until it becomes the leader. Thus, it can always receive messages from the channel, also including a duplicate response.

Besides, Raft is required to handle with duplicate message. You can find the related statements in Dr. Ongaro's thesis, section 8.1. Also, the given TLA+ specification defines DuplicateMessage behavior which Raft must cover.

Icysandwich avatar Mar 23 '22 09:03 Icysandwich

Thank you for pointing out the necessity of handling unintentional duplicated messages. A quick thought is to check if the node has voted for a specific term in the connection handler. e.g.

int lastTermVoted = -1;
if(requestVoteResult.term != lastTermVoted) {
  lastTermVoted = requestVoteResult.term;
  // propagate the message
}
// otherwise just ignore it

I'll approve it if you create a pull request for this.

xnnyygn avatar Mar 24 '22 01:03 xnnyygn

I dont think this simple fix can help solve the problem, since lastTermVoted cant still be used to distinguish which node the vote is from.

I've created a PR with a more complex revision. Please check if it is appropriate.

Icysandwich avatar Mar 28 '22 10:03 Icysandwich

I find another scenario to prove why it is important to distinguish where votes from. PLZ check the test case testVoteCanceled in the lastest commit.

Icysandwich avatar Mar 29 '22 07:03 Icysandwich

Thank you.

I checked the testVoteCancelled test. I consider it might not be the case that would happen. Replying two different results means it received at least two messages from the candidate. According to the Raft algorithm, if you have voted for Node X, you shouldn't vote for another Node since you have saved your last choice. This is still true even you restart the node intentionally because the choice is saved before you reply. And if you accidentally receives duplicated requests for voting from the same node, the result must be the same.

The only chance that a node will reply more than one response is that a brain-split happens, two candidates ask the same node for voting. Then the node replies to them respectively. In that case, the messages go to different nodes, not the same one.

xnnyygn avatar Mar 29 '22 08:03 xnnyygn

Yep. The standard Raft will persist important internal states, i.e., votedFor and term, on disk when they are changed, so that a node could remember his choice even after restart. But in some implementations we cannot perform persist operations due to some practical reasons. So nodes can fotget these volatile states in memory when a sudden crash happens.

Icysandwich avatar Mar 29 '22 08:03 Icysandwich