
Adding and removing brokers with the same event leads to misbehaving Operator

Open baluchicken opened this issue 3 years ago • 0 comments

Describe the bug Adding and removing brokers in the same update makes Koperator run the broker operations in the wrong order: broker removal is handled with higher priority than adding new brokers to the Kafka cluster. This behavior causes several problems:

  • a broker removal that runs into an error might block adding new brokers to the cluster
  • Koperator may terminate the broker pod marked for removal while it still hosts partition replicas: if the pod has not been terminated by the time Cruise Control performs self-healing, Cruise Control might still consider the broker eligible for hosting replicas.

Steps to reproduce the issue:

  • create a KafkaCluster using SimpleKafkaCluster
  • update the KafkaCluster CR by replacing the ID of broker 2 with 3

Expected behavior Adding new brokers to the cluster should have higher priority than broker removal. Koperator must ensure that new brokers are added to the cluster before it removes the brokers marked for removal.
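
The expected ordering could be sketched roughly as below. This is only an illustration of the idea, not Koperator's actual reconcile code: `BrokerOp` and `PlanBrokerOps` are hypothetical names, and the broker IDs 0, 1, 2 are an assumption about what the sample cluster contains. The function diffs the current and desired broker IDs and emits every addition before any removal.

```go
package main

import (
	"fmt"
	"sort"
)

// BrokerOp is a hypothetical type for this sketch; it is not part of Koperator.
type BrokerOp struct {
	Kind string // "add" or "remove"
	ID   int32
}

// PlanBrokerOps diffs the current and desired broker IDs and returns the
// resulting operations with all additions ordered before all removals.
func PlanBrokerOps(current, desired []int32) []BrokerOp {
	cur := make(map[int32]bool, len(current))
	for _, id := range current {
		cur[id] = true
	}
	des := make(map[int32]bool, len(desired))
	for _, id := range desired {
		des[id] = true
	}

	var adds, removes []int32
	for id := range des {
		if !cur[id] {
			adds = append(adds, id) // desired but not yet running
		}
	}
	for id := range cur {
		if !des[id] {
			removes = append(removes, id) // running but no longer desired
		}
	}
	// Map iteration order is random, so sort for a deterministic plan.
	sort.Slice(adds, func(i, j int) bool { return adds[i] < adds[j] })
	sort.Slice(removes, func(i, j int) bool { return removes[i] < removes[j] })

	ops := make([]BrokerOp, 0, len(adds)+len(removes))
	for _, id := range adds {
		ops = append(ops, BrokerOp{Kind: "add", ID: id})
	}
	for _, id := range removes {
		ops = append(ops, BrokerOp{Kind: "remove", ID: id})
	}
	return ops
}

func main() {
	// Reproduction scenario: broker ID 2 is replaced with 3 in one update.
	// The starting IDs 0, 1, 2 are assumed for illustration.
	for _, op := range PlanBrokerOps([]int32{0, 1, 2}, []int32{0, 1, 3}) {
		fmt.Printf("%s broker %d\n", op.Kind, op.ID)
	}
}
```

For the reproduction scenario this plans the addition of broker 3 before the removal of broker 2. A real fix would presumably also need to wait until the new broker has actually joined the cluster before starting the removal, which this sketch does not model.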

baluchicken Jan 24 '22 16:01