redpanda-operator
Orchestrate scaling down Redpanda resource
If a user scales down by more than (N/2 + 1) nodes, where N is the replication factor, Redpanda will lose Raft quorum and will be unable to serve any decommission request. The operator should handle this gracefully by scaling down in steps according to the (N/2 + 1) formula and waiting for the old nodes to be fully decommissioned before continuing.
JIRA Link: K8S-197
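To make the request concrete, here is a minimal sketch of how a stepwise scale-down could be planned. It is not the operator's code. `PlanScaleDown` and `ErrNoSafeStep` are hypothetical names, and the sketch assumes N is the current broker count and that a step is safe as long as a majority of the brokers present at the start of that step remains.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNoSafeStep is a hypothetical error: no broker can be removed without
// dropping below a majority of the current broker count.
var ErrNoSafeStep = errors.New("no broker can be removed without losing quorum")

// PlanScaleDown (hypothetical) returns the replica counts to step through
// when going from current to desired brokers. Each step removes at most
// floor((n-1)/2) brokers, where n is the broker count at the start of the
// step, so a majority of n is still present while the removed brokers
// are decommissioned.
func PlanScaleDown(current, desired int) ([]int, error) {
	if desired < 1 || desired > current {
		return nil, fmt.Errorf("invalid scale-down from %d to %d", current, desired)
	}
	var steps []int
	for n := current; n > desired; {
		maxRemovable := (n - 1) / 2
		if maxRemovable == 0 {
			return nil, ErrNoSafeStep
		}
		next := n - maxRemovable
		if next < desired {
			next = desired
		}
		steps = append(steps, next)
		n = next
	}
	return steps, nil
}

func main() {
	steps, err := PlanScaleDown(7, 3)
	fmt.Println(steps, err) // prints "[4 3] <nil>"
}
```

The operator would apply one step, wait for the decommission to finish, then apply the next. Under this rule a two-broker cluster has no safe step, so whether that case should be allowed is a separate decision.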
A few comments:
- Do we need to store the original size of the cluster somewhere? Cluster info can show you which nodes are down, and can be used to keep track of scaling down both the replicas and the number of nodes needed in a quorum, right? I think this is where Redpanda is a bit special, since you can decommission nodes, effectively changing this formula. So, what do we mean by scaling down here? Do we mean literally just scaling, but not necessarily changing the number of nodes commissioned?
- Are we ok with using validatingWebhooks? This would still require knowing how many nodes are commissioned.
Scaling here would be changing the number of active brokers in the redpanda cluster.
IMO we shouldn't need to keep track of the original. We can always measure the active number of brokers with RPK or kubectl queries. Once the Spec has been updated, the operator should only focus on reconciling that. Rollbacks would need a lot more work.
Are we ok with using validatingWebhooks? This would still require knowing how many nodes are commissioned.
I'm a bit split on this one. I like not duplicating logic when possible. We could instead rely on Redpanda to refuse decommissions it cannot safely perform, and push the responsibility back onto the users?
One more thing: quorum is lost if there are fewer than (N+1)/2 nodes, and we can tolerate at most (N-1)/2 failures. This means that if we have (N+1)/2 failures, we have lost quorum and can no longer read or write.
That said, we lose quorum if we replicate down to (N+1)/2 -1 = (N-1)/2 nodes. This is what I will be using.
NIT
That said, we lose quorum if we replicate down to (N+1)/2 -1 = (N-1)/2 nodes.
We can afford to lose floor((N-1)/2) nodes.
Other than that I agree.
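For reference, the arithmetic in the comments above can be written down as a small sketch. `Majority` and `TolerableFailures` are illustrative names, not functions from the operator, and N is assumed to be the number of voters in the Raft group.

```go
package main

import "fmt"

// Majority returns the smallest number of voters that forms a quorum in a
// group of n voters: floor(n/2) + 1. For odd n this equals (n+1)/2.
func Majority(n int) int {
	return n/2 + 1
}

// TolerableFailures returns how many of n voters can be lost while a
// majority is still present: floor((n-1)/2).
func TolerableFailures(n int) int {
	if n <= 0 {
		return 0
	}
	return (n - 1) / 2 // Go integer division truncates, which is floor for n >= 1
}

func main() {
	for _, n := range []int{1, 2, 3, 4, 5} {
		fmt.Printf("n=%d majority=%d tolerable=%d\n", n, Majority(n), TolerableFailures(n))
	}
}
```

With N = 5, a majority is 3 and up to 2 nodes can be lost. With N = 4, a majority is still 3 and only 1 node can be lost, which is why the floor matters for even N.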
Scope has changed a bit: we now also want to maintain quorum of topic partitions.
Addressed in https://github.com/redpanda-data/redpanda-operator/pull/102
For this ticket, what we will do for now is add the quorum validation check we have in the current PR. I will create a new ticket discussing the issue we should actually be fixing, which is scaling down in a controlled fashion. We should probably do this once we move away from flux.
After some testing, I am checking to see if there is a quick and simple win where we prevent scaling below the min replication factor.
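As a rough illustration of the two checks mentioned in this thread (not the code from the PR), a validation function might look like the sketch below. `ValidateReplicas` and its parameters are hypothetical. It assumes the caller already knows the current broker count and the minimum replication factor that must be preserved.

```go
package main

import "fmt"

// ValidateReplicas (hypothetical) rejects a scale-down that would either
// go below the minimum replication factor or remove more brokers at once
// than floor((current-1)/2), the number that can be lost while a majority
// of the current brokers remains.
func ValidateReplicas(current, desired, minReplicationFactor int) error {
	if desired >= current {
		return nil // not a scale-down, nothing to check here
	}
	if desired < minReplicationFactor {
		return fmt.Errorf("replicas %d is below the minimum replication factor %d",
			desired, minReplicationFactor)
	}
	if removed, allowed := current-desired, (current-1)/2; removed > allowed {
		return fmt.Errorf("removing %d of %d brokers exceeds the %d that can be lost while keeping quorum",
			removed, current, allowed)
	}
	return nil
}

func main() {
	fmt.Println(ValidateReplicas(5, 3, 3)) // allowed: removes 2 of 5, stays at RF 3
	fmt.Println(ValidateReplicas(5, 2, 3)) // rejected: below the replication factor
}
```

A check like this could run in a validating webhook or at the start of reconciliation. Either way it only rejects unsafe requests and does not orchestrate the decommission itself.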