strimzi-kafka-operator
strimzi-kafka-operator copied to clipboard
[Spike] Checking if replacing Topic Operator Kafka Streams storage with self-managed Kafka persistence passes system tests
Type of change
Select the type of your PR
- Refactoring
Description
The TO's usage of Kafka Streams has led to repeated bug reports, hopefully replacing Kafka Streams with a simpler implementation (that still persists to Kafka) will be an improvement. But, I want to run the system tests against these code changes, somewhere other than my laptop.
Checklist
Please go through this checklist and make sure all applicable tasks have been done
- [ ] Write tests
- [ ] Make sure all tests pass
- [ ] Update documentation
- [ ] Check RBAC rights for Kubernetes / OpenShift roles
- [ ] Try your changes from Pod inside your Kubernetes and OpenShift cluster, not just locally
- [ ] Reference relevant issue(s) and close them after merging
- [ ] Update CHANGELOG.md
- [ ] Supply screenshots for visual changes, such as Grafana dashboards
/azp run regression
Azure Pipelines successfully started running 1 pipeline(s).
@LiamClarkeNZ these changes work fine on my local environment (minikube) and on a remote cluster (OCP 4.10). I ran some manual tests and the reproducer described in ENTMQST-4087 without any issue.
I also had a peak at the implementation and I have a couple of suggestions:
- The internal topics naming is confusing. I would propose
__strimzi_topic_commandsand__strimzi_topic_states, which better reflect their content. - Session.stop is never called in case of graceful shutdown, so we don't wait for inflight reconciliations. You can fix by invoking Vertx.close from a shutdown hook.
Some system tests are failing.
[ERROR] Errors:
[ERROR] TopicST.testCreateDeleteCreate:240 » Kafka Failed to create new KafkaAdminClie...
[ERROR] TopicST.testCreateTopicViaAdminClient:170 » Kafka Failed to create new KafkaAd...
[ERROR] TopicST.testDeleteTopicEnableFalse:378 » NullPointer
[INFO]
[ERROR] Tests run: 8, Failures: 0, Errors: 3, Skipped: 2
Thanks very much @fvaleri :) I'll have a look into those STs and see what's failing, and will incorporate your feedback into further changes. That Session.stop issue must be affecting the current implementation too, as I integrated with the existing approach of a collection of Closeable elements that are closed in Session.stop
@LiamClarkeNZ The Session.stop issue is addressed in https://github.com/strimzi/strimzi-kafka-operator/pull/7165, so you don't need to worry about that.
Kia Ora @LiamClarkeNZ. Thanks for this, I know it has been open for a long time. I've started working on the TO lately, with a view to making it KRaft compatible. It's essentially a rewrite. As part of that I'm planning to use a compacted topic and in-memory Map for holding a different representation of topic state. It should be possible to back-port that to ZK-based TO. Therefore I think it's better to wait for that than end up forcing people into two store migrations.