libraft
Duplicate heartbeats may prevent a Raft follower from timing out a Raft leader
On a very poor network that duplicates many packets (specifically, heartbeat packets), a follower can continue to believe that it is in contact with a leader that has actually failed. This happens because a heartbeat packet contains no information that would move time forward, so the follower cannot tell a replayed heartbeat from a fresh one, and every duplicate resets its election timeout. As a result, a leader failure can go undetected, which could prevent the Raft cluster from making progress.
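To make the failure mode concrete, here is a minimal sketch of a follower that resets its election timer on every heartbeat it receives. It is an illustration under stated assumptions, not libraft's actual code: the class `NaiveFollower`, its methods, and the 300 ms timeout are all hypothetical, and time is passed in explicitly so the behaviour is deterministic.

```java
// Illustrative sketch only: class and method names are hypothetical, not libraft's API.
final class NaiveFollower {
    private final long electionTimeoutMs;
    private long currentTerm;
    private long lastHeartbeatMs;

    NaiveFollower(long electionTimeoutMs, long currentTerm, long nowMs) {
        this.electionTimeoutMs = electionTimeoutMs;
        this.currentTerm = currentTerm;
        this.lastHeartbeatMs = nowMs;
    }

    /** Accepts any heartbeat from the current (or a newer) term and resets the timer. */
    void onHeartbeat(long term, long nowMs) {
        if (term < currentTerm) {
            return; // heartbeat from a stale leader: ignore
        }
        currentTerm = term;
        // Nothing in the heartbeat says whether it is fresh or a replay,
        // so a duplicate resets the timer just like a genuine one.
        lastHeartbeatMs = nowMs;
    }

    /** True once a full election timeout has passed without any heartbeat. */
    boolean leaderTimedOut(long nowMs) {
        return nowMs - lastHeartbeatMs >= electionTimeoutMs;
    }

    public static void main(String[] args) {
        NaiveFollower follower = new NaiveFollower(300, 1, 0);
        // Assume the leader crashed at t=0, but the network replays
        // one of its old heartbeats every 100 ms.
        for (long t = 100; t <= 1000; t += 100) {
            follower.onHeartbeat(1, t);
        }
        System.out.println("timed out at t=1100? " + follower.leaderTimedOut(1100));
    }
}
```

As long as replays keep arriving more often than the election timeout, `leaderTimedOut` never returns true, even though the leader has been dead since t=0.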
This is highly unlikely in practice. A network is much more likely to drop packets than to duplicate them. Moreover, a few duplicates are harmless: the problem arises only if the duplicates keep arriving, which is unlikely.
That said, this should be mitigated. One solution would be to use periodic NOOPs instead of heartbeats to verify that the leader is still alive.
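One way the NOOP idea could work is sketched below. It assumes the NOOPs are ordinary log entries with increasing indices, so a replayed NOOP is already in the follower's log and can be recognised as stale. Everything here is hypothetical (`NoopAwareFollower`, `onNoop`, the parameters), and it deliberately omits the previous-entry consistency check and log truncation that a real Raft AppendEntries handler needs.

```java
// Illustrative sketch only: names are hypothetical, not libraft's API.
// Assumes each NOOP is a log entry with a strictly increasing index.
final class NoopAwareFollower {
    private final long electionTimeoutMs;
    private long currentTerm;
    private long lastLogIndex;
    private long lastProgressMs;

    NoopAwareFollower(long electionTimeoutMs, long currentTerm, long lastLogIndex, long nowMs) {
        this.electionTimeoutMs = electionTimeoutMs;
        this.currentTerm = currentTerm;
        this.lastLogIndex = lastLogIndex;
        this.lastProgressMs = nowMs;
    }

    /**
     * Handles a NOOP entry at the given log index.
     * Returns true only if the entry extended the log.
     */
    boolean onNoop(long term, long noopIndex, long nowMs) {
        if (term < currentTerm) {
            return false; // stale leader: ignore
        }
        currentTerm = term;
        if (noopIndex <= lastLogIndex) {
            return false; // duplicate or replay: entry already in the log
        }
        lastLogIndex = noopIndex;
        lastProgressMs = nowMs; // only real log progress resets the timer
        return true;
    }

    /** True once a full election timeout has passed without log progress. */
    boolean leaderTimedOut(long nowMs) {
        return nowMs - lastProgressMs >= electionTimeoutMs;
    }
}
```

With this shape, replaying the NOOP at index 1 any number of times leaves the timer untouched, so the follower still times out the leader one election timeout after the last new entry. The trade-off is that the log grows even when the cluster is idle.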