[Bug] The tenant's StatefulSet replicates the Tenant's metadata
When the operator syncs the tenant, it copies the labels and annotations from the Tenant to the StatefulSet, causing undesirable side effects.
Expected Behavior
Similar to how the Service metadata are synced, the labels and annotations that should be cascaded to the StatefulSet should be explicitly specified in the CRD.
Current Behavior
https://github.com/minio/operator/blob/03613d29e55730ff75f2dbfa70a15e2c4c7ea9db/pkg/resources/statefulsets/minio-statefulset.go#L630-L632
Possible Solution
- Adding a section to the CRD called
poolsMetadatafor the labels and annotations that are shared across all pools. - Adding a boolean to the CRD called
cascadeTenantMetadatathat controls the lines in the snippet above.
Steps to Reproduce (for bugs)
- Create Tenant object with labels
- Watch the labels get cascaded to the StatefulSet
Context
I opened #2287 before investigating the issue but now I think I have more context to discuss the issue.
Because the operator cascades the argocd label to the stateful set, cilium thinks that this label is a security-relevant label, and uses it to create endpoints for the pods and the identity object that governs these endpoints. The issue appears because the identity object has the argocd label, which makes argo thinks that it is a top level object, but when it compares it with the actual application manifest, it does not find it there. Therefore, argo will think that this object was removed from the manifest and will try to remove it from the cluster as well.
We're using many other operators such as Percona, ECK, PGO, but we only see this behavior with the Minio Operator. Once this issue is fixed, the operator can be declared GitOps friendly.
Regression
The issue first appeared in #295
Your Environment
- Version used (
minio-operator): 6.0.2 - Environment name and version (e.g. kubernetes v1.17.2): GKE 1.29
- Server type and version:
- Operating System and version (
uname -a): - Link to your deployment file:
AFAIK, there is no official statement whether labels and annotations should be propagated to child objects. Some people prefer this behavior (link), but a Kubernetes dev mentions that it's not desirable that labels propagate (link).
Not propagating labels and annotations would be an easy change and we should add a statefulsetMetadata field that will hold the annotations/labels for the generated statefulset. However this may break existing installations, so we can't make this change that easy. It would require a migration path:
- Add the
preventMetadataPropagationfield (defaults tofalse) to theTenantCRD that will disable propagation of annotation/labels. - Add the
statefulsetMetadatafield in theTenantCRD that will add annotations/labels to the generated statefulset.
In the next major release of the operator the preventMetadataPropagation field can be removed, because we can only have these kind of breaking changes in major releases.
I agree with you on the solution. Let me know if there's anything I can do to help.
we do bunch of fetching and queries using labels as selectors for queries removing them will indeed break stuff.
@ramondeklein @dvaldivia please discuss this together to figure out a proper solutions since this might break existing integrations.
Alternative fix, see https://github.com/minio/operator/pull/2337
@ramondeklein This is half a solution. In the new major version, this is how it will look like (with adding {} as the default value). But a corrective action should be taken to prepare for it (i.e. the preventMetadataPropagation)
In other words, the propagation should be opt-out in v6 and opt-in in v7.
Still, I think your PR should be merged.
@ramondeklein Once your PR is merged, we could bring up the discussion on the path to make {} as the default value.
@aqeelat If we could start from scratch, then {} would be the default value. But we don't want to break existing installation from customers that upgrade the operator. This opt-out should work fine and I prefer doing this to a new property that opts-out.