raft
Followers aren't using higher terms from request vote rpcs
I saw these test failures on CI:
01:13:22.264 [error] Process :s5 (#PID<0.559.0>) terminating
** (FunctionClauseError) no function clause matching in Raft.Server.voted_for_someone_else?/2
(raft) lib/raft/server.ex:491: Raft.Server.voted_for_someone_else?(%Raft.RPC.RequestVoteReq{candidate_id: {:s4, :nonode@nohost}, from: {:s4, :nonode@nohost}, last_log_index: 166, last_log_term: 3, term: 4, to: {:s5, :nonode@nohost}}, %Raft.Log.Metadata{term: 3, voted_for: {:s5, :nonode@nohost}})
(raft) lib/raft/server.ex:476: Raft.Server.vote_granted?/3
(raft) lib/raft/server.ex:458: Raft.Server.handle_vote/3
(stdlib) gen_statem.erl:1240: :gen_statem.call_state_function/5
(stdlib) gen_statem.erl:1012: :gen_statem.loop_event/6
(stdlib) proc_lib.erl:247: :proc_lib.init_p_do_apply/3
Initial Call: Raft.Server.init/1
Ancestors: [:s5_sup, Raft.Server.Supervisor, Raft.Supervisor, #PID<0.382.0>]
I haven't been able to reproduce them locally yet, but it looks like we're receiving a RequestVote RPC and not adopting its higher term: in the trace above the request carries term 4 while our metadata is still at term 3. This causes the voted_for_someone_else?/2 call to fail, because its clause matches against the term to enforce that we're always comparing our internal state and the request at the same term. I'm not sure whether we're racing when talking to the log process (which would result in us returning the wrong term) or whether something else is going on.
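To make the suspected failure mode concrete, here is a minimal sketch of adopting the higher term before the vote check runs. It is not the code in lib/raft/server.ex: `VoteSketch`, `Req`, `Meta`, and `adopt_term/2` are hypothetical names, the structs model only the fields visible in the trace, and the log up-to-date check is left out.

```elixir
defmodule VoteSketch do
  # Hypothetical stand-ins for Raft.RPC.RequestVoteReq and Raft.Log.Metadata;
  # only the fields visible in the stack trace are modelled.
  defmodule Req do
    defstruct [:term, :candidate_id]
  end

  defmodule Meta do
    defstruct term: 0, voted_for: nil
  end

  # Assumed helper: a request with a higher term moves us to that term
  # and clears the vote we cast in the older term.
  def adopt_term(%Req{term: req_term}, %Meta{term: our_term} = meta)
      when req_term > our_term do
    %{meta | term: req_term, voted_for: nil}
  end

  def adopt_term(%Req{}, %Meta{} = meta), do: meta

  # Binding `term` in both patterns means this clause only matches when the
  # terms are equal, which is the shape of check the issue describes.
  def voted_for_someone_else?(
        %Req{term: term, candidate_id: candidate},
        %Meta{term: term, voted_for: voted_for}
      ) do
    voted_for != nil and voted_for != candidate
  end

  # Returns {granted?, updated_metadata}. The log comparison is omitted.
  def handle_vote(%Req{} = req, %Meta{} = meta) do
    meta = adopt_term(req, meta)

    # `and` short-circuits, so a stale request (lower term) never reaches
    # the equal-term clause above.
    if req.term == meta.term and not voted_for_someone_else?(req, meta) do
      {true, %{meta | voted_for: req.candidate_id}}
    else
      {false, meta}
    end
  end
end
```

With the values from the trace (request at term 4 from s4, metadata at term 3 with a vote for s5), calling `voted_for_someone_else?/2` directly raises the same FunctionClauseError, while `handle_vote/2` first moves the metadata to term 4 and then grants the vote. If the real server already does something equivalent, the term it reads back from the log process would be the next thing to check.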