strimzi-kafka-operator
NPE when incorrect topologySpreadConstraints are applied on a fresh cluster
Describe the bug
I was trying out topologySpreadConstraints in our testing environment. I added the following configuration for the brokers:
```yaml
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: ScheduleAnyway
    labelSelector:
      matchExpressions:
        - key: app
          operator: In
          values:
            - kafka.broker
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: ScheduleAnyway
    labelSelector:
      matchExpressions:
        - key: app
          operator: In
          values:
            - kafka.zookeeper
```
The second item in the constraints array was a mistake: it was supposed to be applied to ZooKeeper instead. Nonetheless, this is what I observed:
- Update on an existing Kafka cluster:
  - No effect.
  - But I haven't tested scaling out brokers; I suspect that would fail.
- Creating a new Kafka cluster with this config:
  - The operator throws an NPE.
  - I suspect it has to do with this line where it gets the pod info using the fabric8 client, but the metadata is missing since it's looking up the wrong label? (A sketch of the lookup behaviour follows below.)
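For context, here is a minimal standalone sketch of that lookup behaviour (assuming fabric8 6.x; the namespace and pod name are placeholders, not taken from the actual operator code): fabric8's `get()` returns `null` for a missing resource rather than throwing, so any unguarded dereference of the result ends in an NPE.

```java
import io.fabric8.kubernetes.api.model.Pod;
import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClientBuilder;

public class PodLookupDemo {
    public static void main(String[] args) {
        try (KubernetesClient client = new KubernetesClientBuilder().build()) {
            // fabric8 returns null (it does not throw) when the resource is absent,
            // e.g. when a pod was never created because of a bad constraint.
            Pod pod = client.pods()
                    .inNamespace("kafka")
                    .withName("kafka-cluster-kafka-0")
                    .get();
            if (pod == null) {
                System.out.println("Pod not found; nothing to restart or reconfigure");
                return;
            }
            System.out.println("Pod UID: " + pod.getMetadata().getUid());
        }
    }
}
```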
NPE stacktrace:

```
2022-06-27 06:44:46 ERROR AbstractOperator:247 - Reconciliation #18(timer) Kafka(kafka/kafka-cluster): createOrUpdate failed
java.lang.NullPointerException: null
	at io.strimzi.operator.cluster.operator.resource.KafkaRoller.restartAndAwaitReadiness(KafkaRoller.java:637) ~[io.strimzi.cluster-operator-0.27.1.jar:0.27.1]
	at io.strimzi.operator.cluster.operator.resource.KafkaRoller.restartIfNecessary(KafkaRoller.java:364) ~[io.strimzi.cluster-operator-0.27.1.jar:0.27.1]
	at io.strimzi.operator.cluster.operator.resource.KafkaRoller.lambda$schedule$6(KafkaRoller.java:277) ~[io.strimzi.cluster-operator-0.27.1.jar:0.27.1]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) ~[?:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
	at java.lang.Thread.run(Thread.java:829) [?:?]
```
This was no longer an issue once the ZooKeeper topology constraint was removed from the broker config, but I think we should handle this error better instead of throwing an NPE.
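For reference, the layout that was presumably intended keeps each constraint under its own component's pod template. This is a sketch assuming the standard `spec.kafka.template.pod` and `spec.zookeeper.template.pod` fields of the Kafka CR, with the label selectors kept as in the original report:

```yaml
spec:
  kafka:
    template:
      pod:
        topologySpreadConstraints:
          - maxSkew: 1
            topologyKey: topology.kubernetes.io/zone
            whenUnsatisfiable: ScheduleAnyway
            labelSelector:
              matchExpressions:
                - key: app
                  operator: In
                  values:
                    - kafka.broker
  zookeeper:
    template:
      pod:
        topologySpreadConstraints:
          - maxSkew: 1
            topologyKey: topology.kubernetes.io/zone
            whenUnsatisfiable: ScheduleAnyway
            labelSelector:
              matchExpressions:
                - key: app
                  operator: In
                  values:
                    - kafka.zookeeper
```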
Can you please share the full Kafka CR to reproduce this? Without the full operator log, it is also quite hard to understand what exactly happened. Are you saying that by using these topologySpreadConstraints I get the error right in the first reconciliation?
> Are you saying that by using these topologySpreadConstraints I get the error right in the first reconciliation?

Yes
Debug level logs leading up to the NPE: extract-2022-06-30T17_58_15.818Z.csv
Kafka CR: cr.log
Note: Some sensitive info has been redacted from CR/logs
Great, thanks. I will have a look.
This still seems to be a problem in the current `main` when StatefulSets are used; just the stacktrace has changed a bit:

```
2022-06-30 22:13:06 INFO KafkaRoller:291 - Reconciliation #1(watch) Kafka(myproject/my-cluster): Could not roll pod {my-cluster-kafka-2}, giving up after 10 attempts. Total delay between attempts 127750ms
java.lang.NullPointerException: null
	at io.strimzi.operator.cluster.operator.resource.KafkaRoller.restartAndAwaitReadiness(KafkaRoller.java:634) ~[io.strimzi.cluster-operator-0.30.0-SNAPSHOT.jar:0.30.0-SNAPSHOT]
	at io.strimzi.operator.cluster.operator.resource.KafkaRoller.restartIfNecessary(KafkaRoller.java:338) ~[io.strimzi.cluster-operator-0.30.0-SNAPSHOT.jar:0.30.0-SNAPSHOT]
	at io.strimzi.operator.cluster.operator.resource.KafkaRoller.lambda$schedule$6(KafkaRoller.java:276) ~[io.strimzi.cluster-operator-0.30.0-SNAPSHOT.jar:0.30.0-SNAPSHOT]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) ~[?:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
	at java.lang.Thread.run(Thread.java:829) ~[?:?]
```
When StrimziPodSets are used, this particular Kafka CR also ends up with other errors. But even then, other situations might trigger this.
When the pod does not exist, we certainly don't need to reconfigure or restart it. So should we just wait for readiness? If the pod does not exist, the reconciliation will fail on the readiness check sooner or later anyway. The other alternative would be to kill the reconciliation right away and throw some error (both options are sketched below). Also, before it gets to the rolling part, the roller spends 5 minutes trying to connect to the broker to collect its configuration, which also seems unnecessary and is likewise caused by the pod being non-existent.
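To make the two options concrete, a minimal sketch (hypothetical; the method names and the exception are stand-ins, not Strimzi's actual KafkaRoller API):

```java
import io.fabric8.kubernetes.api.model.Pod;

// Hypothetical illustration of the two options discussed above;
// names are illustrative and not taken from the actual KafkaRoller.
public class MissingPodHandling {

    /** Option 1: skip the restart/reconfigure step entirely and let the
     *  reconciliation surface the failure via the readiness check. */
    static boolean shouldSkipRoll(Pod pod) {
        // A pod that was never created cannot be restarted or reconfigured.
        return pod == null;
    }

    /** Option 2: fail the reconciliation immediately with a clear error,
     *  instead of spending ~5 minutes trying to connect to a broker that
     *  does not exist and then hitting an NPE. */
    static void failFastIfMissing(Pod pod, String podName) {
        if (pod == null) {
            throw new IllegalStateException(
                    "Pod " + podName + " does not exist; aborting reconciliation");
        }
    }
}
```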
It might be best to fix this after #6663 is merged, to avoid conflicts. The NPEs are not nice, but technically this would end in an error anyway, just a different one. So I don't think it matters much whether it is fixed now or in a few days.