redpanda-operator
add validation code to check replicas for quorum loss
Checking if quorum will be lost if replica count changes.
TODO:
- test (kuttl test on a few well known values for replicas.)
https://github.com/orgs/redpanda-data/projects/54/views/8?pane=issue&itemId=56191515
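The validation described above boils down to the standard Raft majority rule: a cluster of `n` brokers keeps quorum as long as at least `floor(n/2) + 1` remain. A minimal sketch of that check, with illustrative names (not the operator's actual identifiers), covering the "well known values" the kuttl TODO mentions:

```go
package main

import "fmt"

// minForQuorum returns the smallest replica count that still preserves a
// Raft majority of the current cluster: floor(current/2) + 1.
func minForQuorum(current int32) int32 {
	return current/2 + 1
}

func main() {
	// A few well-known values, as the kuttl test TODO suggests checking.
	for _, n := range []int32{1, 2, 3, 4, 5} {
		fmt.Printf("cluster of %d: quorum needs %d\n", n, minForQuorum(n))
	}
}
```

So scaling a 5-broker cluster to 3 would pass this check (3 ≥ 3), while scaling to 2 would not.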
Could we instead be relying on Redpanda to apply back pressure to rpk decommission? IIUC, we'd need to be worried about more than just quorum but also replication counts. If you have a cluster of 5 brokers and set your topics to be replicated 5 times, you wouldn't be able to scale in to 4, but this check would still pass. I am assuming that the decommission command will refuse to decommission a node if it would risk losing quorum 😅
Oh interesting, that sounds like another problem, one associated with replication of partitions of topics. My understanding of the problem was losing quorum of the admin API. But this also sounds legit, though I think Raft is also implemented on the partitions, so you could replicate to 5 and scale down to 4 and still have quorum over your partitions. I suggest that if we also want to account for replication of topics, we should treat that as a separate check to avoid conflating the two issues. The latter check probably belongs in the decommission controller though, since there you have visibility over the scaling.
Note that the decommission controller is a clean-up process; it does not stop you from scaling Redpanda, but cleans up after the fact by decommissioning brokers that have already been turned off by the scaling. Here, the best we can do for partitions is try to ensure that quorum can be recovered based on desired partition replicas (the largest of the minimums required for quorum wins), essentially avoiding shooting yourself in the foot.
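The "largest of the min required for quorum wins" rule above can be sketched as follows. This is only an illustration of the idea under discussion, assuming each topic's replication factor is known; names are hypothetical, not the operator's code:

```go
package main

import "fmt"

// minBrokersForPartitionQuorum: for each topic with replication factor rf,
// a partition's Raft group needs floor(rf/2)+1 replicas to hold a majority,
// so the cluster must retain at least that many brokers. The largest such
// minimum across all topics wins.
func minBrokersForPartitionQuorum(replicationFactors []int32) int32 {
	var required int32 = 1
	for _, rf := range replicationFactors {
		if min := rf/2 + 1; min > required {
			required = min
		}
	}
	return required
}

func main() {
	// Topics replicated 3x and 5x: 5/2+1 = 3 brokers must remain, so
	// scaling 5 -> 4 still leaves every partition able to form quorum.
	fmt.Println(minBrokersForPartitionQuorum([]int32{3, 5}))
}
```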
Gotcha! As long as the decommission controller is a clean up process, we'll have to lean on validation. Ideally, we'd be able to run decommission on however many nodes ahead of changing the StatefulSet's number of replicas. My fear with validation is that we'll accidentally lock ourselves out of some operations in a disaster recovery scenario. We can always add in a backdoor in the future if need be though.
I agree with @chrisseto. The decommission process is the right place to get a signal from core Redpanda that scaling down is not possible. In operator v1 logic, the decommission is performed before scaling down, not after. That way we will never break producers that need to satisfy the topic replication factor.
That seems outside of the scope of what was said on the ticket. I can add the logic here if you want and have two places where we take care of quorum issues, no problem. The first part, already here, basically ensures that the admin API does not lose quorum; this was discussed in a private conversation with @RafalKorepta. The second part will try to consider quorum based on partitions of topics; this will be part of the decommission controller. There, the best we can do is not decommission nodes if this would push any partition out of quorum, and re-queue instead, giving the user the opportunity to fix their issue.
there, the best we can do is not decommission nodes if this would push any partition out of quorum, and re-queue instead, giving the user the opportunity to fix their issue.
I'm not sure I understand what you are trying to say. By "here" you mean Redpanda controller?
P.S. Could you add a test for this feature?
"Here" means this PR.
For the P.S., see the TODO.
The tests validate that the Redpanda custom resource is working. I'm afraid that, due to very simple values, the post-install or post-upgrade helm jobs would not give us confidence that the Redpanda cluster itself is working correctly. The Redpanda controller is not verifying the health of the cluster; it only cares about helm chart install/upgrade success.
In the Buildkite CI I saw a few errors:
internal/controller/redpanda/redpanda_controller.go:915:10: err113: do not define dynamic errors, use wrapped static errors instead: "fmt.Errorf(\"could not unmarshal values data to validate helmrelease\")" (goerr113)
return fmt.Errorf("could not unmarshal values data to validate helmrelease")
^
internal/controller/redpanda/redpanda_controller.go:933:10: err113: do not define dynamic errors, use wrapped static errors instead: "fmt.Errorf(\"requested replicas of %d is less than %d neeed to maintain quorum\", requestedReplicas, minForQuorum)" (goerr113)
return fmt.Errorf("requested replicas of %d is less than %d neeed to maintain quorum", requestedReplicas, minForQuorum)
^
internal/controller/redpanda/redpanda_controller.go:41:2: SA1019: "k8s.io/utils/pointer" is deprecated: Use functions in k8s.io/utils/ptr instead: ptr.To to obtain a pointer, ptr.Deref to dereference a pointer, ptr.Equal to compare dereferenced pointers. (staticcheck)
"k8s.io/utils/pointer"
^
https://buildkite.com/redpanda/redpanda-operator/builds/1069#018e8043-2481-46ff-9571-76927c63d83b/242-309
logger.go:42: 15:08:01 | disable-helm-controllers/2-create-redpanda | test step failed 2-create-redpanda
https://buildkite.com/redpanda/redpanda-operator/builds/1069#018e8043-2484-47e9-b811-79da68391839/240-896
The K8S Operator v2 Helm test suite is failing all over the place, as it can not install the Redpanda helm chart.
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.
This PR is stale because it has been open 5 days with no activity. Remove stale label or comment or this will be closed in 5 days.
This PR was closed because it has been stalled for 5 days with no activity.