strimzi-kafka-operator
Protect KafkaTopic resources from switching between different Kafka clusters using cluster or topic IDs
Bug Description
In a scenario where you have two Kafka clusters within the same namespace, both with the Entity operator running and synchronizing `KafkaTopic` objects, the first cluster to spin up will "win" ownership of the `KafkaTopic` k8s resource for the same topic. This results in three significant issues:
- Synchronization errors - While both clusters are running, the second cluster will throw `Error reconciling KafkaTopic` and `io.fabric8.kubernetes.client.KubernetesClientException` errors due to attempting to create a new resource when one already exists.
- Data loss conditions - If a topic with the same name exists in two different clusters, and the `KafkaTopic` labelled as belonging to one cluster is deleted, the topic will be deleted from both clusters.
- Data loss conditions - (basically the same situation as above) If the first cluster is deleted, the `KafkaTopic` resources remain in k8s, labelled as belonging to the now-gone cluster. If you delete these (which would appear to be a safe operation), the next synchronization of the Entity operator will delete those same topics in its cluster (even though it was never able to properly synchronize, as noted above).
The end result is that administrators performing what could be perceived as a safe "tidy-up" operation of resources associated with a deleted cluster could inadvertently delete real data on a live cluster.
This is also highly dangerous if you have a staging and a production cluster (or, in an A/B rollout, two production clusters): a user deleting a `KafkaTopic` associated with one of the two clusters can impact the other.
We have noticed that Cruise Control and Connectors frequently use the same topic names, and our own applications reuse topic names as well, so we could be impacted by this.
One potential remedy would be to ensure that k8s resources are guaranteed to be uniquely named.
Another would be to put checks in place such that synchronizing a delete from a k8s resource respects the `strimzi.io/cluster` label, and does not process a delete from a different Kafka cluster's `KafkaTopic` resources.
I'm sure there are other solutions as well.
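To illustrate the second remedy, here is a minimal sketch of a guard that refuses to propagate a delete when the `strimzi.io/cluster` label on the `KafkaTopic` does not name the cluster the operator manages. This is not Strimzi's actual implementation; the class and method names are hypothetical.

```java
import java.util.Map;
import java.util.Objects;

// Hypothetical guard; Strimzi's real Topic Operator is structured differently.
public class ClusterLabelGuard {

    private static final String CLUSTER_LABEL = "strimzi.io/cluster";

    private final String managedClusterName;

    public ClusterLabelGuard(String managedClusterName) {
        this.managedClusterName = managedClusterName;
    }

    /**
     * Returns true only if the KafkaTopic's labels declare it as belonging to the
     * cluster this operator instance manages. Deletes (and other writes) against
     * topics labelled for another cluster are skipped instead of being propagated.
     */
    public boolean mayPropagateDelete(String topicName, Map<String, String> kafkaTopicLabels) {
        String owner = kafkaTopicLabels.get(CLUSTER_LABEL);
        if (Objects.equals(owner, managedClusterName)) {
            return true;
        }
        System.err.printf("Ignoring delete of topic %s: labelled for cluster %s, not %s%n",
                topicName, owner, managedClusterName);
        return false;
    }
}
```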
Steps to reproduce
- Spin up a fresh `Kafka` cluster named `first-cluster` and let it come fully up. (Ensure you have the `topicOperator` enabled. It may be helpful to deploy a Kafka UI or similar tooling so that you can interact with the cluster.)
- Create two new topics from within Kafka itself (i.e. via the UI or a CLI, not via the `KafkaTopic` resource) named `shared-topic` and `remnant-topic` (see the sketch after this list for one way to do this via the Kafka Admin API).
- Observe that the `shared-topic` and `remnant-topic` `KafkaTopic` resources are generated automatically on the Entity operator's next synchronization, and that they have the label `strimzi.io/cluster: first-cluster`.
- Spin up a fresh `Kafka` cluster in the same namespace, named `second-cluster`, and let it come fully up. (Identical configuration to the `first-cluster`.)
- Create the same two new topics from within Kafka itself, named `shared-topic` and `remnant-topic`, on the `second-cluster`.
- FIRST ISSUE: Observe the logs of the Entity operator for the `second-cluster`. You will see errors relating to synchronizing topics.
- Delete the `shared-topic` `KafkaTopic` resource via `kubectl`.
- SECOND ISSUE: Observe that the `shared-topic` topic is deleted from both clusters, rather than only from the `first-cluster` as the labels would indicate. (It may take some time for this to propagate, as it occurs on the next synchronization.)
- Delete the `first-cluster` `Kafka` resource and wait for it to be fully cleaned up.
- Observe that the `remnant-topic` `KafkaTopic` remains (as well as any other topics that may have been auto-synced) with the label `strimzi.io/cluster: first-cluster`.
- Observe that the errors within the Entity operator for `second-cluster` continue to occur.
- Delete the `remnant-topic` `KafkaTopic`.
- THIRD ISSUE: Observe the logs in the Entity operator during the next sync; the topic will be deleted.
- THIRD ISSUE: Observe that the `remnant-topic` topic has been deleted from `second-cluster`.
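As a side note, one way to create the topics "from within Kafka itself" for the steps above is the Kafka Admin client rather than a UI or CLI. This is a minimal sketch; the bootstrap address, partition count, and replication factor are placeholders, not values taken from the report.

```java
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

// Creates the two topics used in the reproduction steps directly in Kafka,
// bypassing the KafkaTopic custom resource entirely.
public class CreateReproTopics {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Placeholder bootstrap address for the first cluster.
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG,
                "first-cluster-kafka-bootstrap:9092");
        try (Admin admin = Admin.create(props)) {
            admin.createTopics(List.of(
                    new NewTopic("shared-topic", 3, (short) 1),
                    new NewTopic("remnant-topic", 3, (short) 1)
            )).all().get();
        }
    }
}
```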
Expected behavior
- Deploying multiple Strimzi `Kafka` resources in the same namespace should be possible without interference.
- No errors should occur in the Entity operator logs in situations where two different Kafka clusters have the same topics.
- Deleting a `KafkaTopic` whose labels indicate the topic belongs to one cluster should have no impact on another cluster (whether or not that cluster remains online).
Strimzi version
0.33.0
Kubernetes version
1.23.17
Installation method
helm
Infrastructure
Any
Configuration files and logs
No response
Additional context
I recognize that multiple `Kafka` resources in the same namespace may not be the "ideal" setup with Strimzi, but I would still classify this as a bug, as I'd gamble that you could potentially reproduce this issue by creating a cluster, creating the topic, deleting the cluster (leaving the `KafkaTopic` around), spinning up a new cluster, creating the topic manually in that one, and then subsequently deleting the k8s resource.
This is more or less expected. The general recommendation is to:
- Have a different namespace for each Kafka cluster
- Enable the Topic Operator only in one of them
- Configure the Topic Operators to each watch a different namespace
In any case, this is solved in the Unidirectional Topic Operator that will replace the current Topic Operator and remove the bidirectional synchronization. So that is eventually going to fix this.
Thanks! I'll see if we can try out the UTO for now!
Triaged in the community call on 21.9.2023: @tombentley will have some additional look at this issue.
Certainly the operators are not intended to be deployed such that two distinct operators are trying to reconcile the same resource. (This is true for all Strimzi operators).
But given the possibility for data loss as a result of this error, it's reasonable to ask whether we could make things a bit safer.
One idea, which was in the UTO proposal for a while but was eventually removed, was to include a `status.clusterId` field. This would be checked by the operator before it started any operation on the Kafka cluster.
- If it was null, the CR status would be updated with the non-null cluster id and the reconciliation would proceed as normal.
- If it was non-null and matched the cluster id of the cluster the operator was connected to, the operator would proceed as normal with the reconciliation.
- If it was non-null and differed, the operator would log an error and simply ignore that resource for the rest of the reconciliation.
This isn't completely bullet-proof; e.g. the Kafka cluster could be silently replaced with a different one without the Admin client saying a word to the application. But it would be much safer than the current behaviour. The same idea could also be applied to the User Operator, for example.
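A minimal sketch of what such a guard could look like, implementing the three cases above. The `KafkaTopicStatus` type is a stand-in, not Strimzi's real status class; only `Admin.describeCluster().clusterId()` is an actual Kafka API.

```java
import java.util.concurrent.ExecutionException;
import org.apache.kafka.clients.admin.Admin;

// Hypothetical guard checking a recorded cluster id before reconciling a resource.
public class ClusterIdGuard {

    /** Stand-in for the CR status; the real CRD status field would be persisted. */
    public static class KafkaTopicStatus {
        public String clusterId; // corresponds to the proposed .status.clusterId
    }

    /** Returns true if reconciliation may proceed for this resource. */
    public static boolean mayReconcile(Admin admin, KafkaTopicStatus status)
            throws InterruptedException, ExecutionException {
        // Id of the cluster this operator instance is actually connected to.
        String connectedClusterId = admin.describeCluster().clusterId().get();

        if (status.clusterId == null) {
            // First reconciliation: record the cluster id and proceed as normal.
            status.clusterId = connectedClusterId;
            return true;
        }
        if (status.clusterId.equals(connectedClusterId)) {
            // Same cluster as before: proceed as normal.
            return true;
        }
        // Different cluster: log an error and ignore this resource.
        System.err.printf("Resource belongs to cluster %s, but operator is connected to %s; skipping%n",
                status.clusterId, connectedClusterId);
        return false;
    }
}
```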
A similar kind of protection could also be achieved using a `status.topicId` field.
Triaged on community call on 5.10.2023: Should be postponed till next call as it is not really clear what the previous comment actually suggests. CC @tombentley
Discussed on the Community call on 19.10.2023: There are no plans to fix this in the Bidirectional Topic Operator as it will be removed in the future. But this can be tracked as an enhancement for the Unidirectional Topic Operator. A proposal should be written to clarify in more detail how the cluster and topic IDs might be used for a protection like this.
The original reproducer for this issue is now obsolete due to the new Topic Operator's unidirectional reconciliation. None of the 3 issues mentioned there happens today. That said, the point about using the `.status.topicId` as an additional safeguard is valid and should be implemented.
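For completeness, a minimal sketch of how a `.status.topicId` safeguard could be checked against the id of the topic actually present in the cluster. The status accessor is hypothetical; `describeTopics`, `allTopicNames()`, and `TopicDescription.topicId()` are Kafka Admin client APIs (assuming kafka-clients 3.1 or newer).

```java
import java.util.List;
import java.util.concurrent.ExecutionException;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.TopicDescription;
import org.apache.kafka.common.Uuid;

// Hypothetical safeguard: refuse to act on a KafkaTopic whose recorded
// .status.topicId no longer matches the id of the topic in the cluster.
public class TopicIdGuard {

    public static boolean mayReconcile(Admin admin, String topicName, Uuid statusTopicId)
            throws InterruptedException, ExecutionException {
        TopicDescription description = admin.describeTopics(List.of(topicName))
                .allTopicNames().get().get(topicName);
        Uuid actualTopicId = description.topicId();

        if (statusTopicId == null || statusTopicId.equals(actualTopicId)) {
            return true; // first reconciliation, or the same topic as before
        }
        System.err.printf("Topic %s has id %s, but .status.topicId is %s; skipping%n",
                topicName, actualTopicId, statusTopicId);
        return false;
    }
}
```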