strimzi-kafka-operator

[Enhancement] Support specifying the maximum number of unavailable pods as a percentage and not only as an absolute value

Open dwhipstock opened this issue 4 years ago • 3 comments

Describe the bug: podDisruptionBudget maxUnavailable errors when set to a percentage.

To Reproduce: Steps to reproduce the behavior: Set maxUnavailable on the podDisruptionBudget to a percentage instead of a number.

Errors out with

spec.kafka.template.podDisruptionBudget.maxUnavailable in body must be of type integer: "string"

Expected behavior: Kafka supports maxUnavailable as a percentage. The operator should not error when it is deployed with a percentage.

Environment (please complete the following information):

  • Strimzi version: 0.17.0
  • Installation method: YAML Spec
  • Kubernetes cluster: v1.15.5
  • Infrastructure: Corp Private

YAML files and logs

Invalid value: map[string]interface {}{"apiVersion":"kafka.strimzi.io/v1beta1", "kind":"Kafka", "metadata":map[string]interface {}{"creationTimestamp":"2020-05-19T19:29:34Z", "$ spec.kafka.template.podDisruptionBudget.maxUnavailable in body must be of type integer: "string"

dwhipstock avatar May 29 '20 17:05 dwhipstock

I think we currently support only an absolute number of pods. I will change this to an enhancement to also support percentages.

scholzj avatar May 29 '20 17:05 scholzj

Thank you!

dwhipstock avatar May 29 '20 17:05 dwhipstock

Triaged on 31.3.2022: It would need to be investigated if the percentage works with custom controllers / Strimzi Pod Sets. Otherwise this can be implemented.

scholzj avatar Mar 31 '22 14:03 scholzj

Hi, I would love to look into this. Can I please be assigned to this?

pillai-ashwin avatar Sep 06 '23 05:09 pillai-ashwin

You can look into it. But I guess you would need to start by investigating whether it works with StrimziPodSets.

scholzj avatar Sep 06 '23 17:09 scholzj

I have verified that I see the same error with this config:

apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
spec:
  kafka:
    version: 3.5.1
    replicas: 4
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
      - name: tls
        port: 9093
        type: internal
        tls: true
    template:
      podDisruptionBudget:
        maxUnavailable: "35%"
    config:
      offsets.topic.replication.factor: 3
      transaction.state.log.replication.factor: 3
      transaction.state.log.min.isr: 2
      default.replication.factor: 3
      min.insync.replicas: 2
      inter.broker.protocol.version: "3.5"
    storage:
      type: ephemeral
  zookeeper:
    replicas: 1
    storage:
      type: ephemeral
  entityOperator:
    topicOperator: {}
    userOperator: {}
The Kafka "my-cluster" is invalid: spec.kafka.template.podDisruptionBudget.maxUnavailable: Invalid value: "string": spec.kafka.template.podDisruptionBudget.maxUnavailable in body must be of type integer: "string"

Is this what needed to be investigated?

pillai-ashwin avatar Sep 08 '23 14:09 pillai-ashwin

Not really. I guess you would need to change the type of the field and add some validation. What needs to be investigated is whether Kubernetes is able to handle it when you set it to 35% in the actual PodDisruptionBudget.

scholzj avatar Sep 08 '23 16:09 scholzj

PS: I can try to do the investigation, but I am not sure when I will get to it. So if you want, feel free to pick some other issue until then.

scholzj avatar Sep 08 '23 16:09 scholzj

Okay, sure. I'll share some of my other findings, in the sequence I performed them with kubectl apply/replace:

  • With maxUnavailable: "35%": it errored out and did not create the PodDisruptionBudget.
  • With maxUnavailable: 2: it worked and created the PodDisruptionBudget.
  • With minAvailable: 2: after that, when I tried to edit the config and put maxUnavailable: "35%", it errored out and did not update the PodDisruptionBudget.

Question: I wanted to see whether the PodDisruptionBudget is respected when I try to delete the pods. I think one way to do that is to drain node(s) and check whether you get any errors. This works only if you have a multi-node cluster. I am set up with a single-node Kubernetes cluster using Docker Desktop. How do I check whether the PodDisruptionBudget works in such a setting?

pillai-ashwin avatar Sep 08 '23 17:09 pillai-ashwin

I don't think you can do it easily. I would normally do it with a multi-node cluster and node draining.

scholzj avatar Sep 08 '23 17:09 scholzj

I investigated whether the percentage can be used with StrimziPodSets. It cannot, because it requires the controller resource to have the scale subresource, and StrimziPodSets do not have it (and cannot support it given the nature of their design).

Without the scale subresource, Kube will complain:

  • With Events:
    0s  Warning  CalculateExpectedPodCountFailed  poddisruptionbudget/my-cluster-kafka  controllermanager  Failed to calculate the number of expected pods: strimzipodsets.core.strimzi.io does not implement the scale subresource  2m4s  4  my-cluster-kafka.1786f3ab7bf3fc3d
    
  • In PDB status:
    status:
      conditions:
      - lastTransitionTime: "2023-09-21T15:31:56Z"
        message: strimzipodsets.core.strimzi.io does not implement the scale subresource
        observedGeneration: 5
        reason: SyncFailed
        status: "False"
        type: DisruptionAllowed
      currentHealthy: 3
      desiredHealthy: 3
      disruptionsAllowed: 0
      expectedPods: 3
      observedGeneration: 5
    

This issue should be closed as not applicable.

scholzj avatar Sep 21 '23 15:09 scholzj
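
For reference, the kind of PodDisruptionBudget discussed in this thread might look roughly like the manifest below. This is a minimal sketch, not the manifest used in the investigation: the name my-cluster-kafka is taken from the event output above, while the policy/v1 API version and the selector labels are assumptions about how the operator-generated PDB would be written.

```yaml
# Sketch only: apiVersion and selector labels are assumptions,
# not copied from the manifest actually tested in this thread.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-cluster-kafka
spec:
  # Kubernetes itself accepts either an integer or a percentage string here.
  # A percentage can only be resolved when the pods' controller exposes the
  # scale subresource, which is what fails for StrimziPodSets.
  maxUnavailable: "35%"
  selector:
    matchLabels:
      strimzi.io/cluster: my-cluster       # assumed label
      strimzi.io/name: my-cluster-kafka    # assumed label
```

Kubernetes documents that a percentage maxUnavailable is rounded up to a whole number of pods, so 35% of 4 replicas would allow 2 disruptions where the controller supports scaling. With StrimziPodSets the expected pod count cannot be calculated, which matches the SyncFailed condition and disruptionsAllowed: 0 shown in the status above.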