scylla-operator
Scylla cleanup jobs are not configurable
What should the feature do?
Allow configuration of cleanup jobs.
What is the use case behind this feature?
After Scylla maintenance, cleanup jobs started, but their image pulls were rejected by the registry with a 401 error. We are using the Linkerd service mesh (any service mesh will probably have the same problem), so pods created by cleanup jobs get a sidecar injected. Because the cleanup jobs do not specify the imagePullSecrets property, these pulls will always fail.
I know you are using the same image for cleanup jobs as scylla-operator does, but it still caused us a lot of problems just because those cleanup jobs are not configurable.
This chunk of code is probably what I'm talking about - link
Maybe cleanup jobs could inherit the imagePullSecrets property from racks, as they do with affinity? Changing the configuration via values would be more useful, though.
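To make the inheritance idea concrete, here is a minimal Go sketch of copying pull secrets from a parent object onto a cleanup job's pod spec. It is not the operator's actual code: `PodSpec` and `LocalObjectReference` are trimmed-down local stand-ins for the Kubernetes types of the same names, and `withInheritedPullSecrets` and the secret name `registry-credentials` are hypothetical names chosen for illustration.

```go
package main

import "fmt"

// LocalObjectReference is a local stand-in for the Kubernetes type of the
// same name; it refers to a Secret in the pod's namespace.
type LocalObjectReference struct {
	Name string
}

// PodSpec is a hypothetical, trimmed-down stand-in for the Kubernetes pod
// spec, keeping only the field relevant here.
type PodSpec struct {
	ImagePullSecrets []LocalObjectReference
}

// withInheritedPullSecrets (hypothetical helper) returns a copy of the job's
// pod spec with the parent's pull secrets appended, skipping any secret name
// the job already lists. The input spec is not modified.
func withInheritedPullSecrets(job PodSpec, parent []LocalObjectReference) PodSpec {
	out := PodSpec{
		ImagePullSecrets: append([]LocalObjectReference(nil), job.ImagePullSecrets...),
	}
	seen := make(map[string]bool, len(out.ImagePullSecrets))
	for _, s := range out.ImagePullSecrets {
		seen[s.Name] = true
	}
	for _, s := range parent {
		if !seen[s.Name] {
			out.ImagePullSecrets = append(out.ImagePullSecrets, s)
			seen[s.Name] = true
		}
	}
	return out
}

func main() {
	// Assumed example: the rack (or cluster) carries one pull secret and the
	// cleanup job carries none.
	parent := []LocalObjectReference{{Name: "registry-credentials"}}
	job := withInheritedPullSecrets(PodSpec{}, parent)
	fmt.Println(job.ImagePullSecrets)
}
```

With something along these lines, every image in the cleanup pod, including one added later by a sidecar injector, would be pulled with the same credentials as the rest of the cluster.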
We are using linkerd service mesh (but any SM will probably be a problem) and because of that, pods created by cleanup jobs have a sidecar injected.
Just to be clear, which image fails to be pulled? The operator image or the injected sidecar image?
The injected sidecar image
Shouldn't the injection inject the pullSecret as well?
ImagePullSecrets is a Pod-level configuration, but I've found some articles about such injections. However, everything worked until now, and I think Scylla upgrades shouldn't cause backward incompatibilities. Also, this lack of configuration options caused us other problems with podAntiAffinities earlier.
I think it's valid to be able to configure affinities and pullSecrets for cleanup jobs, like any other part of the operator. You could mirror the operator images to a private registry, and this should be configurable so that it works. That may coincidentally help you work around the deficiency of the admission controller. At the same time, I feel that any admission controller that injects containers should be able to inject an additional pullSecret into the same API object (Pod), and should also have a corresponding controller to ensure the secret is present in that namespace.
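As an illustration of that last point, the following Go sketch builds the kind of JSON Patch (RFC 6902) a mutating admission webhook could return to add both a sidecar container and a matching pull secret to a Pod. It is not how Linkerd's injector works; `sidecarPatch`, the container name `sidecar`, and the image and secret names are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// patchOp is a single JSON Patch (RFC 6902) operation.
type patchOp struct {
	Op    string      `json:"op"`
	Path  string      `json:"path"`
	Value interface{} `json:"value"`
}

// sidecarPatch (hypothetical helper) returns a JSON Patch that appends a
// sidecar container and a pull secret to a Pod. hasPullSecrets says whether
// the Pod already has an imagePullSecrets list: "/-" appends to an existing
// array, while a missing array has to be created as a whole.
func sidecarPatch(sidecarImage, pullSecret string, hasPullSecrets bool) ([]byte, error) {
	ops := []patchOp{{
		Op:    "add",
		Path:  "/spec/containers/-",
		Value: map[string]string{"name": "sidecar", "image": sidecarImage},
	}}
	secret := map[string]string{"name": pullSecret}
	if hasPullSecrets {
		ops = append(ops, patchOp{Op: "add", Path: "/spec/imagePullSecrets/-", Value: secret})
	} else {
		ops = append(ops, patchOp{
			Op:    "add",
			Path:  "/spec/imagePullSecrets",
			Value: []map[string]string{secret},
		})
	}
	return json.Marshal(ops)
}

func main() {
	// Assumed example values; a cleanup job Pod has no imagePullSecrets yet.
	patch, err := sidecarPatch("registry.example.com/proxy:1.0", "registry-credentials", false)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(patch))
}
```

The patch only references the secret by name. As noted above, a separate controller would still have to make sure a Secret with that name exists in the Pod's namespace.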
I haven't found such an option in the Linkerd sidecar configuration; that may be an issue for the maintainers of that software. However, we haven't had this problem with other pods from many providers, so this is the first case we've hit where a pod doesn't configure imagePullSecrets. On the other hand, the affinity used for this pod is preferredDuringSchedulingIgnoredDuringExecution, which doesn't guarantee that the pod will be scheduled on a node where the image has already been pulled. So the lack of imagePullSecrets in this job might still be a problem, and adding this option would fix our issue and potentially other problems.