Handle redeployments
Affected version
kafka-operator 0.5.0 and zk-operator 0.9.0
Current and expected behavior
If I apply, delete, and re-apply the following Stackable CRDs, the Kafka cluster works after the first apply but no longer after the second one (see the reproduction sketch after the manifests).
```yaml
---
apiVersion: zookeeper.stackable.tech/v1alpha1
kind: ZookeeperCluster
metadata:
  name: simple-zk
spec:
  version: 3.8.0
  servers:
    roleGroups:
      default:
        replicas: 3
        config: {}
---
apiVersion: zookeeper.stackable.tech/v1alpha1
kind: ZookeeperZnode
metadata:
  name: simple-kafka-znode
spec:
  clusterRef:
    name: simple-zk
    namespace: default
---
apiVersion: kafka.stackable.tech/v1alpha1
kind: KafkaCluster
metadata:
  name: simple-kafka
spec:
  version: 3.1.0
  zookeeperConfigMapName: simple-kafka-znode
  brokers:
    roleGroups:
      default:
        replicas: 3
```
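This is roughly the sequence I use to reproduce it. The file name `kafka.yaml` and the pod/container names are just what they look like in my setup and may differ elsewhere:

```shell
# Minimal reproduction sketch, assuming the manifests above are saved as kafka.yaml
# and the operator names the first broker pod simple-kafka-broker-default-0.
kubectl apply -f kafka.yaml      # first apply: cluster comes up fine
kubectl delete -f kafka.yaml     # delete the CRDs; the PersistentVolumeClaims stay around
kubectl apply -f kafka.yaml      # second apply: brokers fail to start
kubectl logs simple-kafka-broker-default-0 -c kafka   # shows the error below
```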
In the broker logs I can find the following error message:
```
[2022-04-26 14:21:18,645] ERROR Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
kafka.common.InconsistentClusterIdException: The Cluster ID dOMtDqQ_QU6rqOpOeyosIA doesn't match stored clusterId Some(7tfltX7ATz-aIURko5dtnQ) in meta.properties. The broker is trying to join the wrong cluster. Configured zookeeper.connect may be wrong.
	at kafka.server.KafkaServer.startup(KafkaServer.scala:228)
	at kafka.Kafka$.main(Kafka.scala:109)
	at kafka.Kafka.main(Kafka.scala)
```
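A rough way to see the mismatch on a running setup; the ZooKeeper pod name, the zkCli binary location, and the data paths below are assumptions from my environment, not something guaranteed by the operators:

```shell
# Sketch for comparing the two cluster IDs; names and paths are assumptions.

# The znode/chroot Kafka uses comes from the discovery ConfigMap of the ZookeeperZnode:
kubectl get configmap simple-kafka-znode -o yaml

# Kafka stores its cluster ID in ZooKeeper under <chroot>/cluster/id:
kubectl exec simple-zk-server-default-0 -- bin/zkCli.sh get <chroot-from-configmap>/cluster/id

# The ID the broker remembers is in meta.properties inside its log directory on the PVC
# (the exact path depends on the configured log.dirs):
kubectl exec simple-kafka-broker-default-0 -- sh -c 'find / -name meta.properties 2>/dev/null | head -1 | xargs cat'
```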
The problem is that the cluster ID from the first apply gets saved on the brokers' volumes. Since those volumes persist across `kubectl delete -f kafka.yaml`, but the ZookeeperCluster generates a new cluster ID on redeployment, the Kafka cluster is stuck.
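For completeness, one way to recover manually would presumably be to also remove the broker PVCs between the delete and the re-apply, which of course throws away the stored data. A rough sketch, assuming the PVCs carry the usual `app.kubernetes.io/instance=simple-kafka` label:

```shell
# Manual workaround sketch; the label selector is an assumption about how the
# operator labels the PVCs, and deleting them wipes the broker data.
kubectl delete -f kafka.yaml
kubectl delete pvc -l app.kubernetes.io/instance=simple-kafka
kubectl apply -f kafka.yaml
```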
Possible solution
I am wondering why ZK gets a new cluster ID on every restart. Shouldn't the ID stay fixed, since the data inside the cluster doesn't change (persistent volumes)? If the ID change is unavoidable, the Kafka cluster should tolerate the change of the cluster ID stored in ZooKeeper.
Additional context
No response
Environment
Client Version: v1.23.6
Server Version: v1.22.6
Would you like to work on fixing this bug?
yes