strimzi-kafka-operator
ClusterRoleBindings for `kafka-init` to `strimzi-kafka-broker` role may not be removed
Describe the bug
When a Kafka has been deleted during reconciliation, the delete reconciliation event may time out before an in-process add/update completes. As a result, the delete logic ([1] and [2]) may not run, leaving behind the CRB for kafka-init.
To Reproduce
Steps to reproduce the behavior:
- Create a Kafka resource
- After the reconciliation has started and the kafka-init CRB has been created, delete the Kafka resource
- Updating the (deleted) resource status will fail and the kafka-init CRB remains
Expected behavior
Removal of a Kafka resource removes all resources associated with the Kafka, including any cluster-level resources.
Environment (please complete the following information):
- Strimzi version: 0.26.0
- Installation method: OLM bundle
- Kubernetes cluster: OpenShift 4.9
- Infrastructure: OSD
YAML files and logs
Will be provided
Additional context
[1] https://github.com/strimzi/strimzi-kafka-operator/blob/e0ed9de80a23f0df577f5db18eb319360b1e361f/operator-common/src/main/java/io/strimzi/operator/common/AbstractOperator.java#L261-L272
[2] https://github.com/strimzi/strimzi-kafka-operator/blob/e0ed9de80a23f0df577f5db18eb319360b1e361f/cluster-operator/src/main/java/io/strimzi/operator/cluster/operator/assembly/KafkaAssemblyOperator.java#L660-L664
This is pretty much intentional - or maybe not intentional, but known. For cluster-scoped resources, garbage collection cannot be used (owner references are allowed only within the same namespace), so Kubernetes will not delete it. And deleting it in 100% of cases would be fairly complicated and troublesome. So the current state, where it can leak in some situations, is based on an informed decision.
Understood about the ownership of global resources. I was looking to explore the additional complication of removing it in cases like this. Is the thinking that whatever deletes the Kafka resource should be aware of this and remove the CRB itself? I'm not opposed to something like that, but if a nice solution could live in the Kafka operator somehow, it would certainly be preferred.
Well, you have in general two options:
- Use finalizers to block the deletion and delete everything properly when it comes to it (see the sketch after this list). Finalizers are a bit dangerous and can easily screw things up.
- Monitor the resources in the reconciliation and delete them later even if the initial deletion failed.
Both seemed like a lot of effort relative to the actual value they would bring.
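To illustrate what the first option would involve - a rough sketch only, assuming the fabric8 client and the Strimzi `Kafka` model; the finalizer name here is invented:

```java
import io.fabric8.kubernetes.client.KubernetesClient;
import io.strimzi.api.kafka.model.Kafka;
import io.strimzi.api.kafka.model.KafkaBuilder;

public class FinalizerSketch {
    // Adds a (hypothetical) finalizer so the Kafka CR cannot be fully removed
    // until the operator has deleted the kafka-init CRB and then stripped the
    // finalizer again.
    void addCleanupFinalizer(KubernetesClient client, String namespace, String name) {
        client.resources(Kafka.class)
            .inNamespace(namespace)
            .withName(name)
            .edit(k -> new KafkaBuilder(k)
                .editMetadata()
                    .addToFinalizers("strimzi.io/crb-cleanup") // hypothetical name
                .endMetadata()
                .build());
    }
}
```

The operator would then have to remove the finalizer itself after deleting the CRB; a bug there leaves the Kafka resource stuck in `Terminating`, which is the danger mentioned above.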
What are your thoughts about somehow prioritizing `DELETED` events in general? For example, a reconciliation for the `DELETED` action might wait for the reconciliation lock without a timeout (or with a longer timeout), since it's a terminal event and a subsequent event for the resource would never be received. It certainly wouldn't cover 100% of any potential leakage, but it may reduce it significantly enough to be worthwhile.
My thoughts would be that it's not easy to achieve given the current design - and since it anyway does not cover 100% of cases, it is questionable what the value is.
I was thinking that it would work to add an `OptionalLong lockTimeout` to the `Reconciliation` object, which would be set to some higher value for `DELETED` events. `DELETED` would move to its own case in `OperatorWatcher`. Then `AbstractOperator#withLock` would use `reconciliation.lockTimeout().orElse(lockTimeoutMs)` to acquire the lock.
The benefit is that it keeps these cleanup events in the scope of the operator for most scenarios. The one scenario I can think of that would be an issue is an operator restart while the `DELETED` event is waiting. While not 100%, it increases the likelihood of success, I think. If it seems reasonable, I'd be glad to put together a PR.
Another "best effort" would be to set the Namespace
of the Kafka instance in the ownerReferences
of the cluster role binding. That would allow it to be garbage collected when the namespace is removed.
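Something like this, as a sketch using the fabric8 model (the operator would have to read the `Namespace` first to obtain its UID):

```java
import io.fabric8.kubernetes.api.model.Namespace;
import io.fabric8.kubernetes.api.model.OwnerReference;
import io.fabric8.kubernetes.api.model.OwnerReferenceBuilder;
import io.fabric8.kubernetes.api.model.rbac.ClusterRoleBinding;
import io.fabric8.kubernetes.api.model.rbac.ClusterRoleBindingBuilder;
import io.fabric8.kubernetes.client.KubernetesClient;

public class CrbOwnerSketch {
    // Points the CRB's ownerReferences at the Kafka's Namespace so Kubernetes
    // garbage collection deletes the CRB when the namespace goes away.
    ClusterRoleBinding withNamespaceOwner(KubernetesClient client, String namespaceName,
                                          ClusterRoleBinding crb) {
        Namespace ns = client.namespaces().withName(namespaceName).get();
        OwnerReference owner = new OwnerReferenceBuilder()
                .withApiVersion("v1")
                .withKind("Namespace")
                .withName(ns.getMetadata().getName())
                .withUid(ns.getMetadata().getUid())
                .build();
        return new ClusterRoleBindingBuilder(crb)
                .editOrNewMetadata()
                    .withOwnerReferences(owner)
                .endMetadata()
                .build();
    }
}
```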
I guess that would be an option. But we would need to read the namespaces to get the namespace UID for the owner reference. So we would need additional cluster-scoped RBAC, which means a lot of complications in installation and a lot of questions about why it is needed and whether it can be disabled.
> we would need to read the namespaces to get the namespace UID for the owner reference. So we would need additional cluster-scoped RBAC, which means a lot of complications in installation and a lot of questions about why it is needed and whether it can be disabled.
Sure, I can appreciate that. I was looking at using the CRB template, but it seems the additional metadata properties are not mapped to the created CRB. Was it left out intentionally or could that be added?
Sorry, not sure WDYM with the CRB template or which metadata properties are missing. Can you provide more details?
In the `KafkaClusterTemplate` there is a `ResourceTemplate` for CRB creation:
https://github.com/strimzi/strimzi-kafka-operator/blob/db412a742d96e2e0709e16abec097bfba7fa06b0/api/src/main/java/io/strimzi/api/kafka/model/template/KafkaClusterTemplate.java#L51
`ResourceTemplate` and the contained `MetadataTemplate` both contain a `Map` of additional properties, but currently only the annotations and labels are retrieved from the custom resource:
https://github.com/strimzi/strimzi-kafka-operator/blob/db412a742d96e2e0709e16abec097bfba7fa06b0/cluster-operator/src/main/java/io/strimzi/operator/cluster/model/KafkaCluster.java#L435-L438
Then, they're mapped to the CRB:
https://github.com/strimzi/strimzi-kafka-operator/blob/db412a742d96e2e0709e16abec097bfba7fa06b0/cluster-operator/src/main/java/io/strimzi/operator/cluster/model/AbstractModel.java#L1708-L1718
I'm suggesting that the additional properties be mapped in the same way. The CRD may also need to be updated to preserve them. I suppose it would be best to follow the same pattern for the other templates that map labels/annotations.
The `additionalProperties` field is there for decoding JSON / YAML which contains some unknown fields. It is not something you can use to set additional properties. It is basically just an internal trashcan where any unsupported field will end up and be ignored. They work exactly as they were designed to work.
They are serialized as well and actually take precedence over same-named fields in the resource. It seems on the surface like a convenient vehicle for a client to pass arbitrary metadata, with the understanding/warning that they take full responsibility for what they're doing...
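For reference, this is the usual Jackson pattern behind such a field - a generic illustration, not the Strimzi source:

```java
import com.fasterxml.jackson.annotation.JsonAnyGetter;
import com.fasterxml.jackson.annotation.JsonAnySetter;
import java.util.HashMap;
import java.util.Map;

// Any field not matching a declared property lands in the map on
// deserialization (@JsonAnySetter) and is written back out on serialization
// (@JsonAnyGetter), which is why unknown fields round-trip.
public class TemplateLike {
    private final Map<String, Object> additionalProperties = new HashMap<>();

    @JsonAnyGetter
    public Map<String, Object> getAdditionalProperties() {
        return additionalProperties;
    }

    @JsonAnySetter
    public void setAdditionalProperty(String name, Object value) {
        additionalProperties.put(name, value);
    }
}
```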
Triaged on 18.8.2022: This is a valid issue. We just need to find a solution which does not cause more problems than the initial issue.
Thank you. Perhaps another option along the lines of my last comments would be to add a `List<OwnerReference>` to `MetadataTemplate`. That would allow clients to more cleanly set the namespace (or another cluster-scoped resource) as the owner of the CRB as necessary.
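As a sketch of the shape of the proposal only - this field does not exist in Strimzi today:

```java
import io.fabric8.kubernetes.api.model.OwnerReference;
import java.util.List;
import java.util.Map;

// Proposal sketch: MetadataTemplate as it might look with owner references
// exposed alongside the existing labels/annotations.
public class MetadataTemplate {
    private Map<String, String> labels;
    private Map<String, String> annotations;
    private List<OwnerReference> ownerReferences; // proposed new field

    public List<OwnerReference> getOwnerReferences() {
        return ownerReferences;
    }

    public void setOwnerReferences(List<OwnerReference> ownerReferences) {
        this.ownerReferences = ownerReferences;
    }
    // labels/annotations accessors omitted for brevity
}
```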
Also, I do think there is still value in discussing if/how the "delete" event can always be processed. Particularly for things like this that can't/won't be handled by GC.
Well, TBH, we triaged the original issue about the deletion of the bindings, not any enhancements to the template fields. I do not think we plan any owner reference configuration there.