strimzi-kafka-operator

MM2 PDB maxUnavailable is one when scaling down to zero

Open fvaleri opened this issue 3 years ago • 2 comments

When scaling MM2 down to zero, its PDB keeps a fixed maxUnavailable: 1, which causes warnings when draining the K8s node.

# before
$ kubectl get po
NAME                                              READY   STATUS    RESTARTS   AGE
my-cluster-tgt-entity-operator-5949f65c7d-mg4nw   3/3     Running   0          26m
my-cluster-tgt-kafka-0                            1/1     Running   0          28m
my-cluster-tgt-kafka-1                            1/1     Running   0          28m
my-cluster-tgt-kafka-2                            1/1     Running   0          28m
my-cluster-tgt-zookeeper-0                        1/1     Running   0          29m
my-cluster-tgt-zookeeper-1                        1/1     Running   0          29m
my-cluster-tgt-zookeeper-2                        1/1     Running   0          29m
my-mm2-mirrormaker2-7485ccd555-6862x              1/1     Running   0          2m13s
my-mm2-mirrormaker2-7485ccd555-cdgsz              1/1     Running   0          2m13s
$ kubectl get pdb
NAME                       MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
my-cluster-tgt-kafka       N/A             1                 1                     28m
my-cluster-tgt-zookeeper   N/A             1                 1                     30m
my-mm2-mirrormaker2        N/A             1                 1                     12m

# patching
$ kubectl patch kmm2 my-mm2 --type json -p '
  [{
    "op":"replace",
    "path":"/spec/replicas",
    "value":0
  }]'
kafkamirrormaker2.kafka.strimzi.io/my-mm2 patched

# after
$ kubectl get po
NAME                                              READY   STATUS    RESTARTS   AGE
my-cluster-tgt-entity-operator-5949f65c7d-mg4nw   3/3     Running   0          27m
my-cluster-tgt-kafka-0                            1/1     Running   0          29m
my-cluster-tgt-kafka-1                            1/1     Running   0          29m
my-cluster-tgt-kafka-2                            1/1     Running   0          29m
my-cluster-tgt-zookeeper-0                        1/1     Running   0          30m
my-cluster-tgt-zookeeper-1                        1/1     Running   0          30m
my-cluster-tgt-zookeeper-2                        1/1     Running   0          30m
$ kubectl get pdb
NAME                       MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
my-cluster-tgt-kafka       N/A             1                 1                     29m
my-cluster-tgt-zookeeper   N/A             1                 1                     30m
my-mm2-mirrormaker2        N/A             1                 0                     12m

During reconciliation, the operator should set maxUnavailable: <MAX_INT> in that case (so the PDB does not report 0 allowed disruptions while there are no pods), and set it back to maxUnavailable: 1 when scaling up again.
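For illustration only, a sketch of the PDB the operator could generate while spec.replicas is 0. The name and selector labels below just mirror the example above (the real selector may differ), and 2147483647 (the int32 maximum) is just one possible choice of <MAX_INT>:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-mm2-mirrormaker2
spec:
  # effectively no disruption limit while the Deployment has zero replicas
  maxUnavailable: 2147483647
  selector:
    matchLabels:
      strimzi.io/cluster: my-mm2
      strimzi.io/kind: KafkaMirrorMaker2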

The workaround is to set it through the template, but one has to remember to change it back to 1 when scaling up again:

spec:
  template:
    podDisruptionBudget:
      maxUnavailable: <MAX_INT>
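If the template override is used, a patch along these lines should flip it back when scaling up (assuming the same my-mm2 resource as above and that podDisruptionBudget is already present in the template, since a JSON Patch replace needs the path to exist):

$ kubectl patch kmm2 my-mm2 --type json -p '
  [{
    "op":"replace",
    "path":"/spec/template/podDisruptionBudget/maxUnavailable",
    "value":1
  }]'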

fvaleri avatar Jan 11 '22 12:01 fvaleri

I'm not sure I follow the problem. Does it not scale to 0 because of the PDB? Or what issue does it actually cause? I think managing the PDB might have been one of our mistakes. I'm not sure we want to add more complexity to it unless absolutely necessary.

scholzj avatar Jan 11 '22 12:01 scholzj

Nope, it scales, but because the PDB's maxUnavailable stays fixed at 1, you get warnings when draining the K8s node. On OpenShift the stale PDB state raises cluster alerts at warning level, so they are not critical (nobody should be paged on them).

The use case behind scaling to zero is DR scenarios, where you may want a ready-to-go MM2 in the failover cluster that you can scale up to mirror data back once the primary cluster has been restored. You can achieve the same thing by keeping an MM2 YAML file ready to apply when needed, but that does not work with a GitOps approach to deployment (the workaround mentioned above helps in that case).

In the end, the current behavior seems wrong and should eventually be fixed.

WDYT?

fvaleri avatar Jan 11 '22 12:01 fvaleri

Triaged on 18.8.2022: The PDB can be configured through the template section and it can be disabled from there. This should be closed.

scholzj avatar Aug 18 '22 14:08 scholzj