MINOR: Document kRaft voter auto join will add a removed voter immediately issue
Document auto join behaviour in upgrade.html
@ppatierno @kevin-wu24 , FYI
Thanks for the PR @TaiJuWu .
This solution has the unfortunate side effect that nodes cannot leave and join the quorum. The user cannot do the following operations:
1. First add voter A. 2. Remove voter A, on purpose or by accident. 3. Add voter A again. This should succeed but it will fail with this PR.
@jsancio , basically, the current solution only block the auto-join add voter request if it's not the 1st time joined the voters. So in your operation example, the step (3) will succeed if it's manual join (with AckWhenCommitted=true in KIP-1186). And if in step (3), it's auto-join (with AckWhenCommitted=false), it will be rejected.
Thanks for the PR @TaiJuWu .
This solution has the unfortunate side effect that nodes cannot leave and join the quorum. The user cannot do the following operations:
- First add voter A.
- Remove voter A, on purpose or by accident.
- Add voter A again. This should succeed but it will fail with this PR.
By the way, our integration test also cover such case, it test following:
- observers can join cluster with auto-join
- it be removed by manual and does not send addVoterRequest twice again
- admin can add this node to the cluster.
@TaiJuWu @jsancio @kevin-wu24 , I found the current solution has one drawback, which is that the snapshot only contains the latest voters at the time the snapshot taken. That means we lost the voters change history before the snapshot. This could cause the issue:
- controller 1, 2, 3 are the voters
- controller 3 is removed, so the current voters are [1, 2]
- controller 3 cannot be auto-joined because the leader controller found 3 has ever been in the voter.
- The leader controller takes a snapshot for metadata log
- The leader controller restarts, and load the snapshot + records after snapshot, which gets the voter history contains only [1, 2]
- controller 3 restarts, addVoterRequest is sent because auto-join is enabled, but this time it will succeed because the leader controller doesn't know controller 3 has ever been in the voters.
My suggestion:
- Keep the same semantic:
new controllers (node id + directory UUID tuples) will automatically join the KRaft voter set once if they have not been a voter before., but add a note in doc to notify users to shutdown the removed controller soon to avoid the node being auto-joined again in some edge cases. - Change the semantic to:
Only do the auto-join when a node is start up first time. The start up first time is defined by: LEO of metadata_log is 0. The use case of auto-join is original designed for new formatted controller node, so it should not be a problem. The drawback of this issue is that the auto-join can only have one shot if the first auto-join failed and restarted. For example: a. controller A is formatted, with auto-join enabled b. controller A startup, addVoterRequest is sent (We can initialize the updateVoter timer as expired state, and reset after response received) c. The leader controller responded with error d. controller A fetches metadata log from the leader controller (metadata log LEO > 0) e. Before controller A retrying the addVoter request, it crashes f. After restart, controller A will not auto send addVoter request anymore due to (metadata log LEO > 0)
I think the suggestion (2) is better since it has no edge cases at all. We just need to make it clear that the auto-join can only happen when a node has empty metadata log. WDYT? (What a complicated issue!)
@showuon Thanks for your sharing.
Do we consider to apply the other solution which is adding a configuration for auto-join interval? We can make it a private at 4.2 and public at 4.3. WDYT?
The benefit of this solution is that users know the node always add to the cluster.
Thanks for the feature @TaiJuWu but I don't think we should change the semantic of the KRaft protocol without a KIP.
I understand that the auto join feature doesn't meet your requirements. Have you considered not using the auto join feature and instead implementing controller addition and removal in the orchestration layer using Admin:addRaftVoter, Admin::removeRaftVoter and Admin::describeMetadataQuorum?
Thanks for the comments @jsancio ! I agree if we can't come up a good solution for this issue, it's also fine to keep what we have so far. But, I think we should at least improve the documentation and mention about this known UX issue. @TaiJuWu , could you help on this? Thanks.
Thanks for the feature @TaiJuWu but I don't think we should change the semantic of the KRaft protocol without a KIP. I understand that the auto join feature doesn't meet your requirements. Have you considered not using the auto join feature and instead implementing controller addition and removal in the orchestration layer using Admin:addRaftVoter, Admin::removeRaftVoter and Admin::describeMetadataQuorum?
Thanks for the comments @jsancio ! I agree if we can't come up a good solution for this issue, it's also fine to keep what we have so far. But, I think we should at least improve the documentation and mention about this known UX issue. @TaiJuWu , could you help on this? Thanks.
Sure. Will update this PR for documentation only.
@showuon @jsancio @kevin-wu24 , I updated this PR to document this ticket only, PTAL.