strimzi-kafka-operator
Topic Operator automatically regenerates topic after deletion
Describe the bug I create a Kafka cluster with the Topic Operator. Then I create "my-topic" using the KafkaTopic CRD. But when I delete the topic with the CLI, it gets regenerated.
To Reproduce Steps to reproduce the behavior:
- Create Custom Resource 'Kafka'
- Create Custom Resource 'KafkaTopic'
- Go to Zookeeper pod
- Run command '$ bin/kafka-topics.sh --zookeeper localhost:2181 --delete --topic my-topic'
- Kafka topic 'my-topic' is automatically regenerated
Expected behavior The topic is deleted and is not regenerated.
Environment (please complete the following information):
- Strimzi version: 0.25.0
- Installation method: [e.g. YAML files, Helm chart, OperatorHub.io]
- Kubernetes cluster: OpenShift 4.9
- Infrastructure: Baremetal
YAML files and logs
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
spec:
  entityOperator:
    topicOperator: {}
    userOperator: {}
  kafka:
    authorization:
      type: simple
    config:
      inter.broker.protocol.version: '2.8'
      log.message.format.version: '2.8'
      transaction.state.log.min.isr: 2
      replica.fetch.max.bytes: 41943040
      max.message.bytes: 10485760
      offsets.topic.replication.factor: 3
    listeners:
      - authentication:
          type: scram-sha-512
        name: plain
        port: 9092
        tls: false
        type: internal
      - authentication:
          type: scram-sha-512
        name: tls
        port: 9093
        tls: true
        type: internal
      - authentication:
          type: scram-sha-512
        name: external
        port: 9094
        tls: true
        type: route
    metricsConfig:
      type: jmxPrometheusExporter
      valueFrom:
        configMapKeyRef:
          key: kafka-metrics-config.yml
          name: kafka-metrics
    replicas: 3
    storage:
      class: nfs
      deleteClaim: false
      size: 5Gi
      type: persistent-claim
    version: 2.8.0
  kafkaExporter:
    groupRegex: .*
    topicRegex: .*
  zookeeper:
    metricsConfig:
      type: jmxPrometheusExporter
      valueFrom:
        configMapKeyRef:
          key: zookeeper-metrics-config.yml
          name: kafka-metrics
    replicas: 3
    storage:
      class: nfs
      deleteClaim: false
      size: 5Gi
      type: persistent-claim
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: my-topic
  labels:
    strimzi.io/cluster: my-cluster
spec:
  config: {}
  partitions: 10
  replicas: 3
  topicName: my-topic
Additional context After regeneration, both the partition count and the replication factor of the topic are 1.
It looks like your cluster does not have topic auto-creation disabled. So you have to make sure there are no clients using the topic when you delete it, otherwise they would just recreate it with the default settings (1 partition and 1 replica) when they consume / produce. There are several older issues and discussions about this -> please have a look at them.
PS: NFS storage does not work with Kafka; you should use block storage.
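For reference, topic auto-creation is controlled by the auto.create.topics.enable broker option, which can be added to the existing spec.kafka.config block of the Kafka resource above. A minimal sketch (only the relevant keys shown; the brokers are rolled to pick it up, as mentioned below):

spec:
  kafka:
    config:
      # disable automatic topic creation triggered by client metadata requests
      auto.create.topics.enable: false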
It looks like your cluster does not have topic auto-creation disabled. So you have to make sure there are no clients using the topic when you delete it, otherwise they would just recreate it with the default settings (1 partition and 1 replica) when they consume / produce.
I'm pretty sure that there are no clients using the topic. I even tried creating some totally new topics using KafkaTopic and repeated all the steps, and I get the same issue. The topic is automatically regenerated whether I delete it using the CLI or Java code.
Well, if the operator recreates it, it would do it with the original settings. There is an easy test for it: disable the topic auto-creation, wait until the brokers roll, and try it again. But if that does not help and you think the operator does it, then it would be great if you could provide a DEBUG log from the Topic Operator to show what it is doing and why.
I tried removing the Topic Operator when creating the Kafka cluster, and now I can delete topics without regeneration.
Hi @nautiam, I tried your use case and it works as expected:
When I use the following command:
$ kubectl exec my-cluster-kafka-0 -- bin/kafka-topics.sh --bootstrap-server :9092 --topic my-topic --delete
The ZK watcher is notified as soon as the delete operation returns and triggers a new reconciliation:
2021-12-06 08:32:17,69936 INFO [ZkClient-EventThread-20-localhost:2181] ZkTopicsWatcher:126 - Topics deleted from ZK for watch 1: [my-topic]
2021-12-06 08:32:17,70834 INFO [ZkClient-EventThread-20-localhost:2181] ZkTopicsWatcher:142 - Topics created in ZK for watch 1: []
2021-12-06 08:32:18,72096 INFO [vert.x-eventloop-thread-1] TopicOperator:576 - Reconciliation #506(/brokers/topics 1:-my-topic) KafkaTopic(test/my-topic): Reconciling topic my-topic, k8sTopic:nonnull, kafkaTopic:null, privateTopic:nonnull
2021-12-06 08:32:18,73110 INFO [OkHttp https://10.96.0.1/...] K8sTopicWatcher:56 - Reconciliation #510(kube =my-topic) KafkaTopic(test/my-topic): event MODIFIED on resource my-topic generation=4, labels={strimzi.io/cluster=my-cluster}
2021-12-06 08:32:19,36817 INFO [OkHttp https://10.96.0.1/...] K8sTopicWatcher:56 - Reconciliation #513(kube -my-topic) KafkaTopic(test/my-topic): event DELETED on resource my-topic generation=4, labels={strimzi.io/cluster=my-cluster}
2021-12-06 08:32:19,76860 INFO [vert.x-eventloop-thread-1] K8sTopicWatcher:60 - Reconciliation #530(kube =my-topic) KafkaTopic(test/my-topic): Success processing event MODIFIED on resource my-topic with labels {strimzi.io/cluster=my-cluster}
2021-12-06 08:32:19,77826 INFO [vert.x-eventloop-thread-1] TopicOperator:576 - Reconciliation #536(kube -my-topic) KafkaTopic(test/my-topic): Reconciling topic null, k8sTopic:null, kafkaTopic:null, privateTopic:null
2021-12-06 08:32:19,77907 INFO [vert.x-eventloop-thread-1] K8sTopicWatcher:60 - Reconciliation #540(kube -my-topic) KafkaTopic(test/my-topic): Success processing event DELETED on resource my-topic with labels {strimzi.io/cluster=my-cluster}
And this is the end result:
$ kubectl get kt
NAME CLUSTER PARTITIONS REPLICATION FACTOR READY
consumer-offsets---84e7a678d08f4bd226872e5cdd4eb527fadc1c6a my-cluster 50 3 True
strimzi-store-topic---effb8e3e057afce1ecf67c3f5d8e4e3ff177fc55 my-cluster 1 1 True
strimzi-topic-operator-kstreams-topic-store-changelog---b75e702040b99be8a9263134de3507fc0cc4017b my-cluster 1 1 True
You can check what happens in your topic-operator logs when you delete that topic and compare with mine.
I tried removing the Topic Operator when creating the Kafka cluster, and now I can delete topics without regeneration.
I'm facing the same problem of topics being recreated after deleting them (both the KafkaTopic and the topic) in one of my Kafka clusters. Auto topic creation is disabled, but the topics still get created (with the specific settings configured in spec.kafka.config of that Kafka cluster). Can you explain what exactly you did in order to make it work fine?
I'm facing the same problem of topics being recreated after deleting them (both the KafkaTopic and the topic) in one of my Kafka clusters. Auto topic creation is disabled, but the topics still get created (with the specific settings configured in spec.kafka.config of that Kafka cluster). Can you explain what exactly you did in order to make it work fine?
First, do not use the topicOperator:
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
spec:
  entityOperator:
    userOperator: {}
Then, write code that connects to the Kafka cluster and deletes the topic.
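For illustration, a minimal sketch of that approach using the Kafka Admin API (the bootstrap address is an assumption, and since the listeners above use SCRAM-SHA-512 a real client would also need SASL credentials):

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;

import java.util.Collections;
import java.util.Properties;

public class DeleteTopicExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Assumed bootstrap address; add sasl.* / ssl.* properties for authenticated listeners.
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "my-cluster-kafka-bootstrap:9092");
        try (Admin admin = Admin.create(props)) {
            // Request deletion and wait for the controller to accept it.
            admin.deleteTopics(Collections.singleton("my-topic")).all().get();
        }
    }
}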
@nautiam you don't necessarily need to write code in order to delete a topic. You can simply use kafka-topics.sh, which is included in the official Apache Kafka distribution.
I'm facing the same problem of topics being recreated after deleting them (both the KafkaTopic and the topic) in one of my Kafka clusters. Auto topic creation is disabled, but the topics still get created (with the specific settings configured in spec.kafka.config of that Kafka cluster). Can you explain what exactly you did in order to make it work fine?
First, do not use the topicOperator:
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
spec:
  entityOperator:
    userOperator: {}
Then, write code that connects to the Kafka cluster and deletes the topic.
Interesting, but it's not the ideal solution for most users who want the Topic Operator capabilities. Does someone have another idea?
@nautiam you don't necessarily need to write code in order to delete a topic. You can simply use kafka-topics.sh, which is included in the official Apache Kafka distribution.
Yes, it might be an option.
Interesting, but it's not the ideal solution for most users who want the Topic Operator capabilities. Does someone have another idea?
I think we have to update the TopicOperator code to fix this issue. When I read the code of Kafka's delete-topic function, I found that it returns a future. That means that even though the delete call returns successfully, the topic is actually still being deleted. And if we call any topic-related function while it is being deleted, such as listing topics or getting the topic config, the Kafka cluster will auto-create the topic again with the default replication. I don't know what exactly the Strimzi code does, but I guess this is the reason. This is why, if we create a topic and delete it with the TopicOperator, we don't hit any issue, but if we create the topic, push many records to it for a while, and then delete it using the TopicOperator, we might hit this issue.
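To illustrate that point (this is not the Strimzi code, just a sketch with the public Admin API and an assumed bootstrap address): deleteTopics() returns futures, so a careful client waits on them and then polls until the topic is really gone before producing or consuming again.

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;

import java.util.Collections;
import java.util.Properties;

public class WaitForDeletion {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "my-cluster-kafka-bootstrap:9092");
        try (Admin admin = Admin.create(props)) {
            // deleteTopics() is asynchronous: the future completes when the deletion is accepted,
            // not necessarily when all topic state has been removed.
            admin.deleteTopics(Collections.singleton("my-topic")).all().get();
            // Poll until the topic no longer shows up before letting producers/consumers reconnect,
            // since their metadata requests can auto-create it again when auto.create.topics.enable=true.
            while (admin.listTopics().names().get().contains("my-topic")) {
                Thread.sleep(500);
            }
        }
    }
}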
👍 I also have this problem. delete.topic.enable is true, there is no traffic on my topic, and the deletion gets reverted by the TO. Strangely, it does work sometimes. Here are the TO logs after deleting a topic named test-rr-andrew.
2022-04-18 12:30:01,45314 INFO [OkHttp https://172.20.0.1/...] K8sTopicWatcher:56 - Reconciliation #13538395(kube -test-rr-andrew) KafkaTopic(kafka/test-rr-andrew): event DELETED on resource test-rr-andrew generation=1, labels={strimzi.io/cluster=event-tracking}
2022-04-18 12:30:01,48327 INFO [vert.x-eventloop-thread-1] TopicOperator:576 - Reconciliation #13538401(kube -test-rr-andrew) KafkaTopic(kafka/test-rr-andrew): Reconciling topic test-rr-andrew, k8sTopic:null, kafkaTopic:nonnull, privateTopic:nonnull
2022-04-18 12:30:01,48346 INFO [vert.x-eventloop-thread-1] TopicOperator:372 - Reconciliation #13538404(kube -test-rr-andrew) KafkaTopic(kafka/test-rr-andrew): Deleting topic 'test-rr-andrew'
2022-04-18 12:30:01,58894 INFO [__strimzi-topic-operator-kstreams-f9f1b9a3-65ac-4f53-91a9-5e2f0f379902-StreamThread-1] K8sTopicWatcher:60 - Reconciliation #13538410(kube -test-rr-andrew) KafkaTopic(kafka/test-rr-andrew): Success processing event DELETED on resource test-rr-andrew with labels {strimzi.io/cluster=event-tracking}
2022-04-18 12:30:01,60149 INFO [ZkClient-EventThread-20-localhost:2181] ZkTopicsWatcher:126 - Topics deleted from ZK for watch 66: [test-rr-andrew]
2022-04-18 12:30:01,60189 INFO [ZkClient-EventThread-20-localhost:2181] ZkTopicsWatcher:142 - Topics created in ZK for watch 66: []
2022-04-18 12:30:02,61090 INFO [vert.x-eventloop-thread-0] TopicOperator:576 - Reconciliation #13538422(/brokers/topics 66:-test-rr-andrew) KafkaTopic(kafka/test-rr-andrew): Reconciling topic null, k8sTopic:null, kafkaTopic:null, privateTopic:null
2022-04-18 12:30:18,09430 INFO [ZkClient-EventThread-20-localhost:2181] ZkTopicsWatcher:126 - Topics deleted from ZK for watch 67: []
2022-04-18 12:30:18,09446 INFO [ZkClient-EventThread-20-localhost:2181] ZkTopicsWatcher:142 - Topics created in ZK for watch 67: [test-rr-andrew]
2022-04-18 12:30:19,10782 INFO [vert.x-eventloop-thread-1] TopicOperator:576 - Reconciliation #13538440(/brokers/topics 67:+test-rr-andrew) KafkaTopic(kafka/test-rr-andrew): Reconciling topic test-rr-andrew, k8sTopic:null, kafkaTopic:nonnull, privateTopic:null
2022-04-18 12:30:19,11918 INFO [OkHttp https://172.20.0.1/...] K8sTopicWatcher:56 - Reconciliation #13538443(kube +test-rr-andrew) KafkaTopic(kafka/test-rr-andrew): event ADDED on resource test-rr-andrew generation=1, labels={strimzi.io/cluster=event-tracking}
2022-04-18 12:30:19,13898 INFO [kubernetes-ops-pool-14] CrdOperator:113 - Reconciliation #13538454(/brokers/topics 67:+test-rr-andrew) KafkaTopic(kafka/test-rr-andrew): Status of KafkaTopic test-rr-andrew in namespace kafka has been updated
2022-04-18 12:30:19,14535 INFO [vert.x-eventloop-thread-1] TopicOperator:576 - Reconciliation #13538463(kube +test-rr-andrew) KafkaTopic(kafka/test-rr-andrew): Reconciling topic test-rr-andrew, k8sTopic:nonnull, kafkaTopic:nonnull, privateTopic:nonnull
2022-04-18 12:30:19,14540 INFO [vert.x-eventloop-thread-1] TopicOperator:743 - Reconciliation #13538468(kube +test-rr-andrew) KafkaTopic(kafka/test-rr-andrew): All three topics are identical
2022-04-18 12:30:19,14569 INFO [vert.x-eventloop-thread-1] K8sTopicWatcher:60 - Reconciliation #13538473(kube +test-rr-andrew) KafkaTopic(kafka/test-rr-andrew): Success processing event ADDED on resource test-rr-andrew with labels {strimzi.io/cluster=event-tracking}
I experience the same problem. Topics are regenerated with the original configuration. To me it looks like the two-way sync of the Topic Operator cannot keep up with the deletion of the KafkaTopic resources and regenerates them because it thinks they're missing. I only experience this when deleting many KafkaTopics at once.
For instance, I just deleted 446 resources and 157 got regenerated.
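For what it's worth, a bulk deletion like that can be issued in a single request, e.g. with a label selector (the namespace and label here are hypothetical):

# Delete every KafkaTopic resource carrying the hypothetical marker label in one call.
kubectl delete kafkatopic -n kafka -l purpose=load-test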
I experience the same problem. Topics are regenerated with the original configuration. To me it looks like the two-way sync of the Topic Operator cannot keep up with the deletion of the KafkaTopic resources and regenerates them because it thinks they're missing. I only experience this when deleting many KafkaTopics at once. For instance, I just deleted 446 resources and 157 got regenerated.
As with others, please make sure you have the topic auto-creation disabled, as that can cause all kinds of issues. But yes, there seems to be some bug which causes this to happen even without the auto-creation.
@chaehni I can confirm that.
I can reproduce the issue you describe consistently when running a load test which creates a bunch of test topics (e.g. 20). After a successful test run, I do a bulk topic deletion to get rid of these test topics and I hit the issue. The TO recreates them almost immediately with the same configuration but empty (the topicId is different).
I guess we are triggering some reconciliation logic edge case here, which needs to be investigated further. At least we seem to have a reproducer.
Possible workaround: if you look at the TO logs, you may find InvalidStateStoreException warnings. In my case, I found that simply restarting the TO pod before the bulk topic deletion fixes the issue.
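If it helps, that restart can be done by rolling the entity operator Deployment that hosts the TO container. A hedged example, assuming the default Strimzi naming for a cluster called my-cluster in the kafka namespace:

# Restart the Topic Operator by rolling the entity operator pod before the bulk deletion.
kubectl -n kafka rollout restart deployment my-cluster-entity-operator
kubectl -n kafka rollout status deployment my-cluster-entity-operator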
@scholzj, but is there a need for the TO to be present at all? I have auto-creation enabled for topics, and I simply cannot disable it because my setup is like a central Kafka cluster where multiple environments connect to it; topics are just prefixed by namespace, and I cannot simply create each topic for every environment. I am thinking of removing the TO completely, because as part of removing an environment I delete its topics as well, but the TO is saving them and in the end recreating them every time I delete topics which are no longer needed.
@hari819 Honestly, that sounds like a very bad practice. How do you know what is inside the topics? How do you track which topics are actually needed and which were just created by some mistake? How do you track that the topics have the right settings? Sounds like a mess to me. Normally, disabling the topic auto-creation and having some central management sounds like a day-one thing. (Regardless of whether you use the Topic Operator for the management - it's more about auto-creation being a mess rather than the TO being something super-amazing.)
That said, the Topic Operator is optional. So you can easily disable it -> just remove the topicOperator section from the Kafka custom resource.
@scholzj, yes, I get that I won't be able to run "k get KT" and all those commands because once I remove the TO I do not see anything in the KafkaTopic CRs, but I will make this change only in my development/testing Kafka cluster. The problem is we have nearly 250-odd environments up at all times (with 30K topics making the cluster bulky). We do have a housekeeping job to take care of topic deletion when a member finishes their dev/test, along with deleting the environment itself, but my PVCs are getting full once a week because of the TO reconciling the topics. Most importantly, once testing/development is done there is no use for the topics and their data; the next time a developer creates a new namespace, the topics will get created on the central Kafka cluster with a different prefix, so I am just deleting data which is not required at all. I would like to keep the TO in place for all other environments like PREPROD/PROD. Thanks for the quick response and the suggestions.
@scholzj, can you please suggest whether there is an alternative to auto-creation of topics in such a scenario with this many environments?
@scholzj and now when I remove the TO from the cluster definition, I am not able to delete topics at all. Odd. I am just running this command in a loop to remove the topics for each environment:
${KAFKA_HOME}/bin/kafka-topics.sh --bootstrap-server kafka-service:9092 --delete --topic $topic-name
The same command was able to delete topics when the TO was enabled; the only problem was that the TO was reconciling the deleted topics.
I am stuck with this now. I am on version 0.27.1 with Kafka version 2.8.1.
yes, I get that I won't be able to run "k get KT" and all those commands because once I remove the TO I do not see anything in the KafkaTopic CRs
No, that is not what I'm saying. You will have no idea about the topics, because you will not know which was auto-created by mistake or by some random one-off app and which was created intentionally and is actually used. This has nothing to do with whether you can do kubectl get kafkatopics or not.
@scholzj, yes, if we remove the TO we are missing info about topics, who the owner is, and all such valuable info. I am thinking I should put it back and have automatic topic creation disabled; it seems there is no other way for me.
Kafka has its own APIs to work with topics. You can list them using those APIs even without the Topic Operator. There are also many other tools for managing topics.
Kafka has its own APIs to work with topics. You can list them using those APIs even without the Topic Operator. There are also many other tools for managing topics.
Thank you @scholzj, I will try to come up with a generic utility container which will take care of topic creation using the APIs.
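For illustration, such a utility could call the Kafka Admin API directly; a minimal sketch, where the bootstrap address, topic name, and sizing are placeholders:

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

import java.util.Collections;
import java.util.Properties;

public class TopicUtility {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "kafka-service:9092");
        try (Admin admin = Admin.create(props)) {
            // Create a topic with an explicit partition count and replication factor.
            admin.createTopics(Collections.singleton(new NewTopic("env1.my-topic", 10, (short) 3)))
                 .all().get();
            // List the topics that currently exist on the cluster.
            System.out.println(admin.listTopics().names().get());
        }
    }
}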
Triaged on 2.8.2022: There are similar reports, so we should keep this as a bug. It is currently not clear what is causing it.
@scholzj is there any information we can provide that would help show what is causing the issue?
I guess @tombentley would be the expert on Topic Operator who might know.
Any news on this? Currently facing the same issue: I delete a couple of KafkaTopic k8s resources and they get recreated with the default config (1 partition).
I have auto topic creation enabled; however, I can see that those topics have no produce or consume activity.
You should disable the auto-creation then.
The Bidirectional Topic Operator (BTO) has been replaced by the new Unidirectional Topic Operator from Strimzi 0.39. There are no plans to fix any outstanding issues in the old BTO and this issue can be closed.