microcluster
fix: add consistency checks across core_cluster_members, truststore, and dqlite
Problem
Microcluster can enter inconsistent states where core_cluster_members (database), truststore, and dqlite cluster configuration become out of sync during partial failures. This leads to failed operations and difficult recovery scenarios.
Solution
Implements membership consistency validation before critical operations:
- Validates before operations: checks that all three sources match before joins, removals, and token generation.
- Clear error messages: when an inconsistency is detected, shows the differences between the sources so that admins can recover their cluster before the situation gets worse.
Changes Made
- state.go - Core consistency checking logic with CheckMembershipConsistency()
- cluster.go - Added checks before join/remove operations
- tokens.go - Added checks before token generation
- main.sh - Integration test simulating inconsistent state and verifying blocked operations
Testing
./example/test/main.sh membership # Test membership consistency