strimzi-kafka-operator
[Enhancement] Support specifying the maximum number of unavailable pods as a percentage and not only as an absolute value
Describe the bug
podDisruptionBudget maxUnavailable errors when set to a percent.

To Reproduce
Steps to reproduce the behavior: set maxUnavailable on the podDisruptionBudget to a percent instead of a number.

Errors out with:

spec.kafka.template.podDisruptionBudget.maxUnavailable in body must be of type integer: "string"

Expected behavior
Kubernetes supports maxUnavailable as a percent. The operator should not error when deploying with a percent.
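For reference, core Kubernetes models maxUnavailable as an int-or-string value, so a plain PodDisruptionBudget with a percentage is valid upstream. A minimal sketch (hypothetical names; apiVersion policy/v1 on current Kubernetes, older clusters such as 1.15 use policy/v1beta1):

```yaml
# Hypothetical standalone PDB showing that Kubernetes itself
# accepts maxUnavailable as a percentage string.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-cluster-kafka
spec:
  maxUnavailable: "35%"
  selector:
    matchLabels:
      strimzi.io/name: my-cluster-kafka
```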
Environment (please complete the following information):
- Strimzi version: 0.17.0
- Installation method: YAML Spec
- Kubernetes cluster: v1.15.5
- Infrastructure: Corp Private
YAML files and logs
Invalid value: map[string]interface {}{"apiVersion":"kafka.strimzi.io/v1beta1", "kind":"Kafka", "metadata":map[string]interface {}{"creationTimestamp":"2020-05-19T19:29:34Z", "$ spec.kafka.template.podDisruptionBudget.maxUnavailable in body must be of type integer: "string"
I think we currently support only absolute number of pods. I will change this to enhancement to support also percentages.
Thank you!
Triaged on 31.3.2022: It would need to be investigated if the percentage works with custom controllers / Strimzi Pod Sets. Otherwise this can be implemented.
Hi, I would love to look into this. Can I please be assigned to this?
You can look into it. But I guess you would need to start with the investigation if it works with StrimziPodSets.
I have verified that I am able to see the same error with this config:

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
spec:
  kafka:
    version: 3.5.1
    replicas: 4
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
      - name: tls
        port: 9093
        type: internal
        tls: true
    template:
      podDisruptionBudget:
        maxUnavailable: "35%"
    config:
      offsets.topic.replication.factor: 3
      transaction.state.log.replication.factor: 3
      transaction.state.log.min.isr: 2
      default.replication.factor: 3
      min.insync.replicas: 2
      inter.broker.protocol.version: "3.5"
    storage:
      type: ephemeral
  zookeeper:
    replicas: 1
    storage:
      type: ephemeral
  entityOperator:
    topicOperator: {}
    userOperator: {}
```
The Kafka "my-cluster" is invalid: spec.kafka.template.podDisruptionBudget.maxUnavailable: Invalid value: "string": spec.kafka.template.podDisruptionBudget.maxUnavailable in body must be of type integer: "string"
Is this what needed to be investigated?
Not really. You would need to change the type of the field, I guess, and add some validation. The thing that needs to be investigated is whether Kubernetes is able to handle the percentage (e.g. 35%) when it is set in the actual PodDisruptionBudget.
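For a field accepting both an integer and a percentage string, the usual CRD approach is the x-kubernetes-int-or-string extension. A sketch of what the schema for this field could look like (an assumption, not the actual Strimzi CRD; the pattern is illustrative):

```yaml
# Hypothetical CRD schema fragment allowing both 2 and "35%":
maxUnavailable:
  x-kubernetes-int-or-string: true
  anyOf:
    - type: integer
    - type: string
  # Illustrative extra validation restricting strings to percentages:
  pattern: '^(\d+|\d+%)$'
```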
PS: I can try to do the investigation. But not sure when I get to it. So if you want, feel free to pick some other issue until then.
Okay sure. I'll just share some of my other findings with you, in the sequence I performed them with kubectl apply/replace:
- With maxUnavailable: "35%", it errored out and did not create the PodDisruptionBudget.
- With maxUnavailable: 2, it worked and created the PodDisruptionBudget with minAvailable: 2.
- After that, when I tried to edit the config back to maxUnavailable: "35%", it errored out and did not update the PodDisruptionBudget.
Question - I wanted to try and see if the PodDisruptionBudget is respected when I try to delete the pods. I think one way to do it is to drain node(s) and check whether you get any errors, but that works only if you have a multi-node cluster. I am set up with a single-node Kubernetes cluster using Docker Desktop. How do I check whether the PodDisruptionBudget works in such a setting?
I don't think you can do it easily. I would normally do it with a multi-node cluster and node draining.
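One way around the single-node limitation is a local multi-node cluster, for example with kind (an assumption on my part; any multi-node distribution would work). A minimal cluster config for drain testing could look like:

```yaml
# Hypothetical kind-config.yaml: one control-plane plus three
# workers, enough to drain a worker and observe PDB-blocked evictions.
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
  - role: worker
  - role: worker
  - role: worker
```

You would then create the cluster with kind create cluster --config kind-config.yaml, drain a worker with kubectl drain, and check whether evictions that would violate the PDB are refused.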
I investigated if the percentage can be used with StrimziPodSets. It cannot, because it requires the controller resource to have scale subresource and StrimziPodSets do not have it (and cannot support it given the nature of their design).
Without the scale subresource, Kube will complain:
- With Events:

```
0s   Warning   CalculateExpectedPodCountFailed   poddisruptionbudget/my-cluster-kafka   controllermanager   Failed to calculate the number of expected pods: strimzipodsets.core.strimzi.io does not implement the scale subresource   2m4s   4   my-cluster-kafka.1786f3ab7bf3fc3d
```
- In the PDB status:

```yaml
status:
  conditions:
    - lastTransitionTime: "2023-09-21T15:31:56Z"
      message: strimzipodsets.core.strimzi.io does not implement the scale subresource
      observedGeneration: 5
      reason: SyncFailed
      status: "False"
      type: DisruptionAllowed
  currentHealthy: 3
  desiredHealthy: 3
  disruptionsAllowed: 0
  expectedPods: 3
  observedGeneration: 5
```
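For context, the scale subresource the disruption controller asks for is something the controller's CRD would have to declare. A generic sketch of such a declaration (illustrative paths, not Strimzi's CRD, which deliberately does not declare it):

```yaml
# Fragment of a CRD version spec: what a controller resource must
# declare for Kubernetes to compute expected pod counts for a PDB.
subresources:
  scale:
    specReplicasPath: .spec.replicas
    statusReplicasPath: .status.replicas
```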
This issue should be closed as not applicable.