NuRaft

An example use case - Multiple Raft clusters within a server cluster

Open yashhema opened this issue 6 years ago • 7 comments

I asked the same question in the cornerstone GitHub repository and wanted to get your input as well: how can we implement the following scenario? Consider 5 servers (S1, S2, S3, S4, S5) and 7 tasks running on these servers. Each task needs 2 replicas. For task T1, I can have S1, S2, S3, where S1 is the leader. For task T2, I can have S2, S1, S3, where S2 is the leader, and so on. If one of the servers goes down, I should be able to bring in one of the other existing servers; for example, if S1 goes down, I should be able to use S4 instead. Any guidelines or pointers would be really appreciated.

yashhema avatar Oct 17 '19 16:10 yashhema

Hi @yashhema

Can you please elaborate more? Here is my understanding; please correct me if I misunderstood:

  • There are 5 servers, which will be shared for running 7 tasks.
  • Each task will pick 3 servers (out of 5) and organize a Raft group. Each group is independent of the others and performs quorum writes (an entry is committed once at least 2 replicas in the group agree on it).
  • A server can be a member of several groups at the same time.

If so, once a node goes down, you can contact the current leader of each affected group, remove the offline node (by calling remove_srv), and then add a new node (by calling add_srv). The new node will automatically sync up with the leader using a snapshot or the log. In the meantime (while the previous node is being removed and the new node is catching up), the existing groups will remain available as long as the remaining 2 nodes are alive.

greensky00 avatar Oct 18 '19 03:10 greensky00
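
The remove-then-add procedure described in the reply above can be sketched as a small planning routine. This is only an in-memory model under stated assumptions, not NuRaft code: `Group`, `has_quorum`, and `replace_failed` are hypothetical names, and the policy of picking the lowest-numbered spare server is an arbitrary choice for illustration. In a real deployment, each membership change would be issued to that group's current leader through remove_srv and add_srv.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical in-memory model of one Raft group's membership.
struct Group {
    std::string task;       // e.g. "T1"
    std::set<int> members;  // server IDs, e.g. {1, 2, 3}
};

// True while a strict majority of the group's members is alive.
bool has_quorum(const Group& g, const std::set<int>& alive) {
    std::size_t up = 0;
    for (int id : g.members) up += alive.count(id);
    return up * 2 > g.members.size();
}

// For every group that contains `failed`, swap it for a spare server.
// The spare is the lowest-numbered alive server not already in the group
// (an assumed policy). Returns the tasks whose membership changed.
std::vector<std::string> replace_failed(std::vector<Group>& groups,
                                        int failed,
                                        const std::set<int>& alive) {
    std::vector<std::string> changed;
    for (Group& g : groups) {
        if (g.members.count(failed) == 0) continue;
        // A membership change has to be committed, so it needs a quorum.
        if (!has_quorum(g, alive)) continue;
        auto spare = std::find_if(alive.begin(), alive.end(), [&](int id) {
            return g.members.count(id) == 0;
        });
        if (spare == alive.end()) continue;  // no spare server available
        g.members.erase(failed);   // step 1: corresponds to remove_srv on the leader
        g.members.insert(*spare);  // step 2: corresponds to add_srv on the leader
        changed.push_back(g.task);
    }
    return changed;
}
```

For example, with groups T1 = {1, 2, 3} and T3 = {1, 4, 5} and servers 2 to 5 alive, calling `replace_failed(groups, 1, alive)` leaves T1 as {2, 3, 4} and T3 as {2, 4, 5}. Groups that do not contain server 1 are left untouched.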

Hello, yes, your understanding is correct. The main idea is that I should be running one executable per server, and within that same executable I can manage multiple Raft groups (using the same port, something like https://github.com/atomix/atomix). Any suggestions are welcome.

yashhema avatar Oct 22 '19 19:10 yashhema

If that's the case, you can remove the problematic node and then add the other one. An automatic online sync-up happens once you add a node to an existing running cluster. You also need to make sure that each Raft cluster (one per task) has its own separate Raft log. Thanks.

greensky00 avatar Oct 26 '19 00:10 greensky00

Hi @greensky00 , thanks for your great work!

Does each Raft cluster need a separate nuraft::raft_server instance? If so, could they use the same port for each Raft cluster?

ZTJiu avatar May 27 '20 08:05 ZTJiu

@ZTJiu Yes, you need separate raft_server instances. And if they are running in the same process, they should use different listening ports.

greensky00 avatar May 27 '20 16:05 greensky00
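
The two constraints from the replies above (a separate Raft log per cluster, and a different listening port per raft_server instance within one process) can be captured in a small layout plan. This is a sketch under stated assumptions rather than NuRaft code: `GroupEndpoint`, `plan_endpoints`, the consecutive port numbering, and the directory layout are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical per-group settings inside one process. Each entry would back
// one raft_server instance with its own port and its own log location.
struct GroupEndpoint {
    std::string task;     // e.g. "T1"
    int port;             // listening port, unique within the process
    std::string log_dir;  // log location, never shared between groups
};

// Assign consecutive ports starting at base_port and one log directory per
// task under data_root (both conventions are assumptions for this sketch).
std::vector<GroupEndpoint> plan_endpoints(const std::vector<std::string>& tasks,
                                          int base_port,
                                          const std::string& data_root) {
    // Precondition: the whole range must consist of valid TCP port numbers.
    assert(base_port > 0 &&
           base_port + static_cast<int>(tasks.size()) <= 65536);
    std::vector<GroupEndpoint> plan;
    for (std::size_t i = 0; i < tasks.size(); ++i) {
        plan.push_back({tasks[i],
                        base_port + static_cast<int>(i),
                        data_root + "/" + tasks[i] + "/log"});
    }
    return plan;
}
```

With tasks T1, T2, T3, a base port of 10000, and a data root of `/data`, the plan assigns ports 10000, 10001, and 10002 and the log directories `/data/T1/log`, `/data/T2/log`, and `/data/T3/log`, so no two groups in the process share a port or a log.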