zookeeper
zookeeper copied to clipboard
Fix haveDelivered wrong implementation.
Original implementation and mine
- The original implementation of
QuorumCnxManager::haveDelivered
returns true once it finds a queue is empty, and returns false if all queues aren't empty. - My implementation returns false once it finds a queue not empty, and returns true if all queues are emtpy.
Reasons
There are 2 reasons why I think the implementation of QuorumCnxManager::haveDelivered
is wrong.
1. comment mismatch
The comment of QuorumCnxManager::haveDelivered
says:
/** * Check if all queues are empty, indicating that all messages have been delivered. */
As the comment says, the method should return true if all queues are empty, But the implementation returns true if one of the queues is empty.
2. The methods functionality
In FastLeaderElection::lookForLeader
:
if(manager.haveDelivered()){
sendNotifications();
} else {
manager.connectAll();
}
I believe the code wants to do the following:
- If The manager has sent all messages(but receives nothing), send notifications again.
- Otherwise, if some queue of
queueSendMap
isn't empty, the connection to that server must have broken down or haven't been established.
Therefore, manager.haveDelivered()
should check whether some queue isn't empty.
(Note that manager.connectAll
only tries connecting all servers. It skips the servers that "I"(the server itself) have connected to.)
Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build-maven/1690/
Build result: FAILURE
[...truncated 905.57 KB...][JENKINS] Archiving /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build-maven/pom.xml to org.apache.zookeeper/parent/3.6.0-SNAPSHOT/parent-3.6.0-SNAPSHOT.pom[JENKINS] Archiving /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build-maven/zookeeper-recipes/zookeeper-recipes-lock/pom.xml to org.apache.zookeeper/zookeeper-recipes-lock/3.6.0-SNAPSHOT/zookeeper-recipes-lock-3.6.0-SNAPSHOT.pom[JENKINS] Archiving /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build-maven/zookeeper-contrib/pom.xml to org.apache.zookeeper/zookeeper-contrib/3.6.0-SNAPSHOT/zookeeper-contrib-3.6.0-SNAPSHOT.pom[JENKINS] Archiving /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build-maven/zookeeper-client/pom.xml to org.apache.zookeeper/zookeeper-client/3.6.0-SNAPSHOT/zookeeper-client-3.6.0-SNAPSHOT.pom[JENKINS] Archiving /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build-maven/zookeeper-jute/pom.xml to org.apache.zookeeper/zookeeper-jute/3.6.0-SNAPSHOT/zookeeper-jute-3.6.0-SNAPSHOT.pom[JENKINS] Archiving /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build-maven/zookeeper-jute/target/zookeeper-jute-3.6.0-SNAPSHOT.jar to org.apache.zookeeper/zookeeper-jute/3.6.0-SNAPSHOT/zookeeper-jute-3.6.0-SNAPSHOT.jar[JENKINS] Archiving /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build-maven/zookeeper-jute/target/zookeeper-jute-3.6.0-SNAPSHOT-tests.jar to org.apache.zookeeper/zookeeper-jute/3.6.0-SNAPSHOT/zookeeper-jute-3.6.0-SNAPSHOT-tests.jar[JENKINS] Archiving /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build-maven/zookeeper-jute/target/zookeeper-jute-3.6.0-SNAPSHOT-sources.jar to org.apache.zookeeper/zookeeper-jute/3.6.0-SNAPSHOT/zookeeper-jute-3.6.0-SNAPSHOT-sources.jar[JENKINS] Archiving /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build-maven/zookeeper-jute/target/zookeeper-jute-3.6.0-SNAPSHOT-javadoc.jar to org.apache.zookeeper/zookeeper-jute/3.6.0-SNAPSHOT/zookeeper-jute-3.6.0-SNAPSHOT-javadoc.jar[JENKINS] Archiving /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build-maven/zookeeper-recipes/zookeeper-recipes-queue/pom.xml to org.apache.zookeeper/zookeeper-recipes-queue/3.6.0-SNAPSHOT/zookeeper-recipes-queue-3.6.0-SNAPSHOT.pom[JENKINS] Archiving /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build-maven/zookeeper-docs/pom.xml to org.apache.zookeeper/zookeeper-docs/3.6.0-SNAPSHOT/zookeeper-docs-3.6.0-SNAPSHOT.pom[JENKINS] Archiving /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build-maven/zookeeper-docs/target/zookeeper-docs-3.6.0-SNAPSHOT.jar to org.apache.zookeeper/zookeeper-docs/3.6.0-SNAPSHOT/zookeeper-docs-3.6.0-SNAPSHOT.jar[JENKINS] Archiving /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build-maven/zookeeper-docs/target/zookeeper-docs-3.6.0-SNAPSHOT-tests.jar to org.apache.zookeeper/zookeeper-docs/3.6.0-SNAPSHOT/zookeeper-docs-3.6.0-SNAPSHOT-tests.jar[JENKINS] Archiving /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build-maven/zookeeper-docs/target/zookeeper-docs-3.6.0-SNAPSHOT-sources.jar to org.apache.zookeeper/zookeeper-docs/3.6.0-SNAPSHOT/zookeeper-docs-3.6.0-SNAPSHOT-sources.jar[JENKINS] Archiving /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build-maven/zookeeper-recipes/pom.xml to org.apache.zookeeper/zookeeper-recipes/3.6.0-SNAPSHOT/zookeeper-recipes-3.6.0-SNAPSHOT.pom[JENKINS] Archiving /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build-maven/zookeeper-client/zookeeper-client-c/pom.xml to org.apache.zookeeper/zookeeper-client-c/3.6.0-SNAPSHOT/zookeeper-client-c-3.6.0-SNAPSHOT.pom[JENKINS] Archiving /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build-maven/zookeeper-contrib/zookeeper-contrib-rest/pom.xml to org.apache.zookeeper/zookeeper-contrib-rest/3.6.0-SNAPSHOT/zookeeper-contrib-rest-3.6.0-SNAPSHOT.pom[JENKINS] Archiving /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build-maven/zookeeper-assembly/pom.xml to org.apache.zookeeper/zookeeper-assembly/3.6.0-SNAPSHOT/zookeeper-assembly-3.6.0-SNAPSHOT.pom[JENKINS] Archiving /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build-maven/zookeeper-contrib/zookeeper-contrib-zooinspector/pom.xml to org.apache.zookeeper/zookeeper-contrib-zooinspector/3.6.0-SNAPSHOT/zookeeper-contrib-zooinspector-3.6.0-SNAPSHOT.pomchannel stopped[SpotBugs] Skipping execution of recorder since overall result is 'FAILURE'Setting status of 555a292de47ac1a54f01073e9a0fe5d8f68e7eb1 to FAILURE with url https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build-maven/1690/ and message: 'FAILURE 'Using context: JenkinsMaven
I still think there's sth wrong with the QuorumCnxManager.haveDelivered()
or
if(manager.haveDelivered()){
sendNotifications();
} else {
manager.connectAll();
}
Becase the old implementation won't try to "connect all" as long as the first "sending queue" is empty. But I don't know how to simulate the scene that will triger the problem.
Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build-maven/1691/
Is is possible that there are 4 servers (s1, s2, s3, s4),
- In the beginning, (s2, s3, s4) can establish links with s1, while the network among (s2, s3, s4) is broken and (s2, s3, s4) can't establish links to each other(that is, everyone can only connect to s1).
- Because the links between (s2, s1), (s3, s1) and (s4, s1) are healthy, and the first "sending queue" of s2, s3 or s4 is s1,
manager.haveDelivered()
will always betrue
. So (s2, s3, s4) won't callmanager.connectAll();
.- The first "sending queue" of s1 is towards s2, which is also healthy.
- Now the network among (s2, s3, s4) is healthy again, but is it possible that they won't try to connect to each other any more? And maybe the quorum will not be established?
Established or not?
- After many rounds of notification exchanging, everyone votes for s4, and from the perspective of s1, s4 will get 4 votes, so s1 will be a follower.
- and...? will the quorum be established in the end?
@KimRasak
-
The logic only affects the
queueSendMap
which some queue is empty and others is not empty. In this case, the origin logic enters intosendNotifications()
, your fix enters intoconnectAll()
. Look at the method:sendNotifications()
, it puts the notification intosendqueue
. , thenWorkerSender.run
poll the notification(QuorumCnxManager#toSend
), which also connects to other peers(connectOne(long sid)
) -
Now the network among (s2, s3, s4) is healthy again, but is it possible that they won't try to connect to each other any more? And maybe the quorum will not be established?
I guess not have that possibility. You can use some tool to reproduce some corner case you image(e.g blockade)
@KimRasak Is that a scenario that you mention in your latest comments is something that you've seen happened in a real cluster? I'm asking, because I'd like to get confidence in your change before accepting it. You're about to make changes in leader election, so we have to be very careful about that.
Regarding the failing scenario (I think it would be more realistic with odd number of participants, but doesn't really matter.) can we say that all participants will get the most recent notifications from s1, because they all have a healthy connection to it? In which case having live connection between all nodes might not be a requirement of a successful leader election.