backend.ai
feat: Implement `Raft` consensus algorithm for distributed managers
This PR is related to lablup/backend.ai#415.
With the Raft algorithm, a cluster cannot make progress once a majority of managers are malfunctioning, because the remaining managers cannot form the quorum required for an election. In our case, however, the cluster should keep performing leader election and log replication even in that condition. Therefore, we should consider a way to use both the quorum and majority methods. (Considering fault tolerance, maybe we can think of a rank-based method.)
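
To make the quorum constraint concrete, here is a minimal sketch of the majority arithmetic that standard Raft relies on. The function names `quorum_size` and `can_elect_leader` are hypothetical and are not part of this PR or of any Raft library; they only illustrate why a cluster stalls once too many managers fail.

```python
def quorum_size(cluster_size: int) -> int:
    """Return the smallest number of managers that forms a strict majority."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be positive")
    return cluster_size // 2 + 1


def can_elect_leader(cluster_size: int, healthy: int) -> bool:
    """Return True if enough healthy managers remain to win a Raft election.

    Assumption: every healthy manager can reach every other healthy manager.
    """
    return healthy >= quorum_size(cluster_size)


# With 5 managers the quorum is 3, so losing 3 managers leaves only 2
# healthy ones and no candidate can collect enough votes.
print(quorum_size(5))           # 3
print(can_elect_leader(5, 3))   # True
print(can_elect_leader(5, 2))   # False
```

Under this rule a 5-manager cluster tolerates at most 2 failures, which is the limitation described above.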
Migrated to #697