barge
barge copied to clipboard
Heartbeat messages causes followers to loop on same AppendEntry
While building an HTTP based demo of barge I noticed what I think might be a bug. Here are the (rough) steps to reproduce.
git pull foldlabs http_demo
mvn clean install
cd barge-jax-rs
java -jar target/barge-jax-rs.jar pack
# this creates a barge.sh script in current directory, not strictly necessary
# run several instances in different terminals
./barge.sh 0
./barge.sh 1
./barge.sh 2
# in a fourth terminal, init everything
./init.sh
./send http://localhost:56789 foo
In the terminal running the followers we can see logs for storing Entry in the journal:
11:59:15.395 [pool-1-thread-1] DEBUG org.robotninjas.barge.log.RaftLog - {http://localhost:56790/ 11 FOLLOWER} - http://localhost:56790/ storing Entry{command=[B@1c198fc5, term=11}
11:59:16.887 [pool-1-thread-1] DEBUG org.robotninjas.barge.log.RaftLog - {http://localhost:56790/ 11 FOLLOWER} - http://localhost:56790/ storing Entry{command=[B@2167179c, term=11}
11:59:18.393 [pool-1-thread-2] DEBUG org.robotninjas.barge.log.RaftLog - {http://localhost:56790/ 11 FOLLOWER} - http://localhost:56790/ storing Entry{command=[B@22a9014d, term=11}
11:59:19.897 [pool-1-thread-2] DEBUG org.robotninjas.barge.log.RaftLog - {http://localhost:56790/ 11 FOLLOWER} - http://localhost:56790/ storing Entry{command=[B@2ff4f32e, term=11}
11:59:21.391 [pool-1-thread-2] DEBUG org.robotninjas.barge.log.RaftLog - {http://localhost:56790/ 11 FOLLOWER} - http://localhost:56790/ storing Entry{command=[B@54074e95, term=11}
11:59:22.888 [pool-1-thread-2] DEBUG org.robotninjas.barge.log.RaftLog - {http://localhost:56790/ 11 FOLLOWER} - http://localhost:56790/ storing Entry{command=[B@302befad, term=11}
11:59:24.386 [pool-1-thread-2] DEBUG org.robotninjas.barge.log.RaftLog - {http://localhost:56790/ 11 FOLLOWER} - http://localhost:56790/ storing Entry{command=[B@2106c1fb, term=11}
11:59:25.888 [pool-1-thread-2] DEBUG org.robotninjas.barge.log.RaftLog - {http://localhost:56790/ 11 FOLLOWER} - http://localhost:56790/ storing Entry{command=[B@78bc5972, term=11}
In the leader:
12:00:22.854 [pool-1-thread-2] DEBUG org.robotninjas.barge.state.Leader - {http://localhost:56791/ 11 LEADER} - Sending heartbeat
12:00:22.855 [pool-1-thread-2] DEBUG o.r.barge.state.ReplicaManager - {http://localhost:56791/ 11 LEADER} - Sending update to http://localhost:56790/ prevLogIndex 2 prevLogTerm 9
12:00:22.863 [pool-1-thread-2] DEBUG o.r.barge.state.ReplicaManager - {http://localhost:56791/ 11 LEADER} - Sending update to http://localhost:56789/ prevLogIndex 2 prevLogTerm 9
12:00:22.893 [pool-1-thread-2] DEBUG o.r.barge.state.ReplicaManager - {http://localhost:56791/ 11 LEADER} - Response from http://localhost:56789/ Status true nextIndex 3, matchIndex 3
12:00:22.893 [pool-1-thread-2] DEBUG o.r.barge.state.ReplicaManager - {http://localhost:56791/ 11 LEADER} - Response from http://localhost:56789/ Status true nextIndex 3, matchIndex 3
12:00:22.893 [pool-1-thread-2] DEBUG org.robotninjas.barge.state.Leader - {http://localhost:56791/ 11 LEADER} - updating commitIndex to 3; sorted is [3, 3, 3]
12:00:22.898 [pool-1-thread-2] DEBUG o.r.barge.state.ReplicaManager - {http://localhost:56791/ 11 LEADER} - Response from http://localhost:56790/ Status true nextIndex 3, matchIndex 3
12:00:22.898 [pool-1-thread-2] DEBUG o.r.barge.state.ReplicaManager - {http://localhost:56791/ 11 LEADER} - Response from http://localhost:56790/ Status true nextIndex 3, matchIndex 3
12:00:22.898 [pool-1-thread-2] DEBUG org.robotninjas.barge.state.Leader - {http://localhost:56791/ 11 LEADER} - updating commitIndex to 3; sorted is [3, 3, 3]
I suspect the error lies in the way we update the leader's indices but I am still investigating and this is not clear to me (yet).