
Unexpected FinalizeMembershipChange entries in committed entries after finalize

Open haraldng opened this issue 4 years ago • 2 comments

I assume from the documentation that begin_membership_change should be called once by the leader and finalize_membership_change should be called once by all nodes. Please let me know if that is not the intended usage :)

Following the documentation of membership_change and example of five_mem_node I am doing the following:

// in function on_ready()
        if let Some(committed_entries) = ready.committed_entries.take() {
            for entry in &committed_entries {
                if entry.data.is_empty() {
                    // From new elected leaders.
                    continue;
                }
                if let EntryType::EntryConfChange = entry.get_entry_type() {
                    // For conf change messages, make them effective.
                    let mut cc = ConfChange::default();
                    cc.merge_from_bytes(&entry.data).unwrap();
                    let change_type = cc.get_change_type();
                    match &change_type {
                        ConfChangeType::BeginMembershipChange => {
                            let reconfig = cc.get_configuration();
                            let start_index = cc.get_start_index();
                            raft_node
                                .raft
                                .begin_membership_change(&cc)
                                .expect("Failed to begin reconfiguration");

                            assert!(raft_node.raft.is_in_membership_change());
                        }
                        ConfChangeType::FinalizeMembershipChange => {
                            raft_node
                                .raft
                                .finalize_membership_change(&cc)
                                .expect("Failed to finalize reconfiguration");
                        }
                        // Other conf change types are not shown in this excerpt.
                        _ => {}
                    }
                }
            }
        }

The call to finalize_membership_change succeeds the first time, but FinalizeMembershipChange entries then reoccur in the committed entries, which causes the program to panic because next_configuration is None. The entries show up multiple times within a single ready.committed_entries and again later when on_ready() is called the next time.

haraldng, Mar 16 '20 21:03

@haraldng thanks for your report. It's expected that FinalizeMembershipChange occurs multiple times, so you need to handle the error returned by finalize_membership_change yourself. The reason is that FinalizeMembershipChange is appended in commit_apply, so if a peer restarts with such an entry that has not been applied yet, it could append it again.
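As a rough illustration of handling that error instead of calling .expect(), here is a minimal, self-contained sketch. It does not use the raft crate: MockRaft, MockError and apply_finalize are hypothetical stand-ins that only mimic the behavior described in this thread (finalize fails when next_configuration is None), so check which error your raft-rs version actually returns before adapting it.

```rust
/// Hypothetical stand-in for the membership state of a Raft node.
/// This is NOT the raft-rs type; it only models `next_configuration`.
#[derive(Debug, Default)]
struct MockRaft {
    /// `Some` while a joint-consensus change is in progress.
    next_configuration: Option<Vec<u64>>,
    /// Current voter ids.
    voters: Vec<u64>,
}

/// Hypothetical error type; the variant name is an assumption.
#[derive(Debug, PartialEq)]
enum MockError {
    NoPendingMembershipChange,
}

impl MockRaft {
    /// Records the configuration the cluster is transitioning to.
    fn begin_membership_change(&mut self, next: Vec<u64>) {
        self.next_configuration = Some(next);
    }

    /// Switches to the pending configuration, or fails if none is pending.
    fn finalize_membership_change(&mut self) -> Result<(), MockError> {
        match self.next_configuration.take() {
            Some(next) => {
                self.voters = next;
                Ok(())
            }
            None => Err(MockError::NoPendingMembershipChange),
        }
    }
}

/// Applies a committed finalize entry while tolerating duplicates.
/// Returns `true` if this call performed the change and `false` if the
/// entry was a repeat that could be ignored.
fn apply_finalize(raft: &mut MockRaft) -> bool {
    match raft.finalize_membership_change() {
        Ok(()) => true,
        // A repeated finalize entry: nothing is pending, so skip it.
        Err(MockError::NoPendingMembershipChange) => false,
    }
}
```

Calling apply_finalize twice after one begin_membership_change applies the new configuration on the first call and skips the second, rather than panicking. Matching on the specific "nothing pending" error, instead of swallowing every error, keeps genuinely unexpected failures visible.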

BTW, it's better not to use the joint-consensus implementation. For now, using normal configuration changes is fine. Later we will change the implementation to etcd/raft's style. Please take a look at https://github.com/tikv/raft-rs/pull/317. Would you like to contribute to this?

hicqu, Mar 17 '20 04:03

@hicqu Thank you for your reply. If I use the AddNode and RemoveNode variants instead, will they be batched as one configuration change, or handled separately as two configuration changes? How does that affect performance?

haraldng, Mar 17 '20 12:03