strimzi-kafka-operator
At times a newly created KafkaTopic resource becomes ready only after the periodic reconciliation
Describe the bug I'm using only the Topic Operator module, without the Cluster Operator. Intermittently, when a KafkaTopic resource is created, it does not become ready until the next periodic reconciliation runs.
To Reproduce topic-operator-logs.txt
Steps to reproduce the behavior:
- Create a KafkaTopic resource.
- Try to connect to the topic from a Kafka client; you will get an "UNKNOWN_TOPIC_OR_PARTITION" error.
- Try again after 10 to 15 minutes and the client is able to connect.
- Describe the KafkaTopic; it shows an interval of a few minutes between "Creation Timestamp" and "Last Transition Time".
Name:         oceana.interval.topic
Namespace:    avaya-kafka
Labels:       app.kubernetes.io/managed-by=Helm
              operator.io/kind=topic
Annotations:  meta.helm.sh/release-name: orca
              meta.helm.sh/release-namespace: default
API Version:  kafka.strimzi.io/v1beta1
Kind:         KafkaTopic
Metadata:
  Creation Timestamp:  2021-03-08T15:15:07Z
  Generation:          1
  Resource Version:    1455674
  Self Link:           /apis/kafka.strimzi.io/v1beta1/namespaces/avaya-kafka/kafkatopics/oceana.interval.topic
  UID:                 9a6aee77-be54-448e-a617-9851af9fdbbd
Spec:
  Config:
    retention.ms:  300000
  Partitions:      1
  Replicas:        1
  Topic Name:      oceana.interval.topic
Status:
  Conditions:
    Last Transition Time:  2021-03-08T15:21:01.687956Z
    Status:                True
    Type:                  Ready
  Observed Generation:     1
Events:                    <none>
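For reference, the delay between the two timestamps in the output above can be worked out directly. This is only a small sketch: the helper name `parse_k8s_timestamp` is made up for illustration, and the two values are copied from "Creation Timestamp" and "Last Transition Time".

```python
from datetime import datetime, timedelta


def parse_k8s_timestamp(value: str) -> datetime:
    """Parse a timestamp with a trailing 'Z' as a UTC datetime (hypothetical helper)."""
    # datetime.fromisoformat() only accepts 'Z' from Python 3.11 on,
    # so rewrite it as an explicit UTC offset for older interpreters.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


# Values copied from the describe output above.
created = parse_k8s_timestamp("2021-03-08T15:15:07Z")
ready = parse_k8s_timestamp("2021-03-08T15:21:01.687956Z")

gap = ready - created
print(gap)  # 0:05:54.687956
```

So the resource took just under six minutes to become Ready, rather than the few seconds expected.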
Expected behavior A Kafka client should be able to connect to the topic within a few seconds of it being created.
Environment (please complete the following information):
- Strimzi version: 0.21.0
- Installation method: Helm chart
- Kubernetes cluster: Kubernetes 1.17.9
- Infrastructure: Kubernetes on VMware
YAML files and logs
The attached log file shows the added event for the topic "oceana.interval.topic" arriving at "2021-03-08 15:15:07", but the reconciliation and the creation of the topic happen only at "2021-03-08 15:21:01", after the periodic reconciliation kicks in.
CC @tombentley @sknot-rh
Is it possible to provide logs at DEBUG level? If I am reading the logs correctly, you created the oceana.interval.topic topic (it seems to have been created in Kafka too), then deleted it (15:14:51) and created it again (15:15:07). After the second creation it was not recreated in Kafka. @tombentley is 16 seconds enough to delete a Kafka topic from the broker?
Hi,
It looks like this happens only when the create follows immediately after a delete.
I'm attaching the logs at DEBUG level. The topic 'testtopic' was deleted at '2021-04-20 17:08:38' and then created again at '2021-04-20 17:08:52'. topic-operator.txt
Hi,
Any plans to work on this in the near future?
I think this kind of race will become a lot easier to detect once Kafka's support for topic ids matures and topic ids can be accessed via the Admin client. KAFKA-10774 in particular would be beneficial. Hopefully that will be merged for Kafka 3.1, and perhaps we could start using it conditionally then (that is, using it if the broker supports it and falling back to the current behaviour if not). It's also possible that this isn't worth the complication and we'd decide it was simpler to wait until Strimzi dropped support for 3.0.
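To illustrate why topic ids would help, here is a toy model of the delete-then-create race. It is not the Topic Operator's code and not the Kafka Admin API: `FakeKafka`, `needs_create_by_name` and `needs_create_by_id` are all hypothetical names. It assumes that a deleted topic stays visible by name for a short while, and that the operator remembers the ids of topics it has asked to delete.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class FakeKafka:
    """Toy stand-in for a broker; NOT the real Admin client API."""
    topics: Dict[str, uuid.UUID] = field(default_factory=dict)
    pending_delete: Set[str] = field(default_factory=set)

    def create(self, name: str) -> uuid.UUID:
        if name in self.topics:
            raise ValueError(f"topic {name} already exists")
        self.topics[name] = uuid.uuid4()  # each incarnation gets a fresh id
        return self.topics[name]

    def delete(self, name: str) -> None:
        # Assumption: deletion is asynchronous, so the name stays visible.
        self.pending_delete.add(name)

    def finish_deletes(self) -> None:
        # Simulates the broker completing the deletions some time later.
        for name in self.pending_delete:
            self.topics.pop(name, None)
        self.pending_delete.clear()

    def topic_id(self, name: str) -> Optional[uuid.UUID]:
        return self.topics.get(name)


def needs_create_by_name(kafka: FakeKafka, name: str) -> bool:
    # Name-only check: a topic that is still being deleted looks present.
    return kafka.topic_id(name) is None


def needs_create_by_id(kafka: FakeKafka, name: str,
                       deleted_ids: Set[uuid.UUID]) -> bool:
    # Id-aware check: a visible topic whose id is known to be deleted
    # is a stale incarnation, so a new topic is still needed.
    current = kafka.topic_id(name)
    return current is None or current in deleted_ids


kafka = FakeKafka()
old_id = kafka.create("testtopic")
kafka.delete("testtopic")      # the KafkaTopic is deleted...
deleted_ids = {old_id}         # ...and the operator remembers the old id

# The KafkaTopic is created again before the broker finishes deleting.
by_name = needs_create_by_name(kafka, "testtopic")
by_id = needs_create_by_id(kafka, "testtopic", deleted_ids)

kafka.finish_deletes()         # the old topic finally disappears
print(by_name, by_id, kafka.topic_id("testtopic"))
```

In this model the name-only check decides nothing needs creating, and the topic then vanishes until something reconciles again, while the id-aware check recognises the stale incarnation. Whether the real race follows exactly this sequence is an assumption.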
Hi, I see that 3.1 has been released. Do you have plans to add the conditional approach you mentioned above, so that this is addressed when Kafka 3.1 is in use?
Thank You!!
Triaged on 21.7.2022: Still seems to be a bug. Should be kept open.
The Bidirectional Topic Operator (BTO) has been replaced by the new Unidirectional Topic Operator from Strimzi 0.39. There are no plans to fix any outstanding issues in the old BTO and this issue can be closed.