
PullSecret management for operators and their workload

Open thetechnick opened this issue 2 years ago • 9 comments

Type of question

Best practice, general context/help :)

Question

What did you do? We want to deploy a large operator with multiple dependencies from private repositories; the structure looks like this:

+--------------+         +--------------+
|Private Repo 1|         |Private Repo 2|
+----^---------+         +----^---------+
     |                        |
+----+----+              +----+----+<------------------------+
|Catalog 1|              |Catalog 2|                         |
+----^----+              +----^----+<--------+               |
     |                        |              |               |
+----+----+              +----+-----+   +----+-----+   +-----+----+
|Top level|  dependency  |Operator 1|   |Operator 2|   |Operator 3|
|Operator +------------->+----+-----+   +--^---+---+   +---^------+
+---------+                   | dependency |   |dependency |
                              +------------+   +-----------+

Now it's easy for us to add pullSecrets to the CatalogSources of Catalog 1 & 2, as described here: https://olm.operatorframework.io/docs/tasks/make-catalog-available-on-cluster/#using-registry-images-that-require-authentication-as-catalogbundleoperatoroperand-images
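For reference, that mechanism is the `spec.secrets` field on the CatalogSource, which lists pre-created docker-registry Secrets in the same namespace. A minimal sketch (the names, namespace, and image below are hypothetical):

```yaml
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: catalog-1            # hypothetical name
  namespace: my-stack        # hypothetical namespace
spec:
  sourceType: grpc
  image: quay.io/example/catalog-1:latest   # hypothetical catalog image
  secrets:
    # docker-registry Secrets in the same namespace, used for pulling
    # the catalog, bundle, and operator images
    - private-repo-1-pull-secret
```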

But we are having a hard time figuring out how best to set up the pull secrets for the operand workloads themselves.

  1. We cannot use the cluster-wide global pull secret, because it already contains differently scoped credentials for the same registry (quay.io): https://docs.openshift.com/container-platform/4.7/openshift_images/managing_images/using-image-pull-secrets.html#images-update-global-pull-secret_using-image-pull-secrets Even if we could use the global pull secret, we would much prefer pull secrets scoped to a single namespace/stack, because we want to deliver operator stacks like this multiple times into the same cluster, which makes managing these secrets in a central place tricky.

  2. Every Operator gets its own ServiceAccount to isolate RBAC, so patching the default ServiceAccount will not grant pull permissions to every operand.

  3. Patching every Operator in the dependency chain to specify imagePullSecrets in the deployment specs of its CSV, and ensuring that each of these Operators sets them explicitly on all the pods it deploys, is error-prone and a lot of work.
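For context, option 3 means carrying something like the following in every CSV in the chain (names and image are hypothetical; the deployment spec is abbreviated to the relevant fields):

```yaml
apiVersion: operators.coreos.com/v1alpha1
kind: ClusterServiceVersion
metadata:
  name: operator-1.v1.0.0
spec:
  install:
    strategy: deployment
    spec:
      deployments:
        - name: operator-1-controller
          spec:
            selector:
              matchLabels:
                name: operator-1
            template:
              metadata:
                labels:
                  name: operator-1
              spec:
                # hardcoded secret name the operator author must document
                imagePullSecrets:
                  - name: private-repo-1-pull-secret
                containers:
                  - name: manager
                    image: quay.io/example/operator-1:v1.0.0
```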

What did you expect to see? A simple UX for handling private registries via OLM, e.g. being able to specify pull secrets as part of a Subscription, similar to a CatalogSource, so the pull secrets are added to the ServiceAccounts created by OLM.
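To illustrate the requested UX only: the sketch below is NOT a real OLM API; the `pullSecrets` field does not exist on the Subscription type today and is invented here purely to show what we mean.

```yaml
# Hypothetical sketch of a Subscription-level pull secret UX.
# "pullSecrets" is not part of the OLM Subscription API.
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: top-level-operator
  namespace: my-stack
spec:
  channel: stable
  name: top-level-operator
  source: catalog-1
  sourceNamespace: my-stack
  pullSecrets:                     # hypothetical field
    - private-repo-1-pull-secret   # would be attached to the ServiceAccounts OLM creates
```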

What did you see instead? Under which circumstances? Manual patching after installation, or a massive amount of work to handle pull secrets across a whole product chain.

Environment

  • operator-lifecycle-manager version:

  • Kubernetes version information: OpenShift 4.9.22

  • Kubernetes cluster kind: Managed OpenShift / OSD / ROSA / ARO

Additional context Ref, similar question: https://github.com/operator-framework/operator-lifecycle-manager/issues/2307

thetechnick avatar Mar 04 '22 14:03 thetechnick

@thetechnick In your example you suggest specifying the pull secret via the Subscription, but that is also problematic: the Subscriptions for the dependent operators (at all levels of the dependency chain) are created by OLM itself as part of dependency resolution, which means we have no way to specify a pull secret in them.

One way to solve this would be to provide a way to instruct OLM to copy the pull secret information from Subscription to Subscription throughout the dependency chain.

An alternative that would handle the entire problem could be a namespace-scoped pull secret (same as the global one, but scoped to a single namespace), which would take effect for all pull operations inside that namespace. I believe this would solve 90% of cases (including all of the common use cases).

nb-ohad avatar Mar 05 '22 21:03 nb-ohad

@thetechnick Did you attempt to put entries in the global pull secret that are scoped to a particular namespace of the respective registry, e.g. quay.io/somenamespace/?
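Roughly, that means the `.dockerconfigjson` of the global pull secret (the `pull-secret` Secret in the `openshift-config` namespace on OpenShift) carries a registry key that includes a repository path. Shown as YAML for readability; the credential values are placeholders:

```yaml
# Content of .dockerconfigjson in the global pull secret:
auths:
  quay.io:
    auth: <base64 credentials for general quay.io pulls>
  quay.io/somenamespace:
    auth: <base64 credentials scoped to this repository path>
```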

dmesser avatar Mar 07 '22 11:03 dmesser

@dmesser Can you point me to any documentation for this? I could not find any hint about sub-scoping pull credentials in the global pull secret, but I haven't tried it yet. :) https://docs.openshift.com/container-platform/4.7/openshift_images/managing_images/using-image-pull-secrets.html

thetechnick avatar Mar 08 '22 10:03 thetechnick

@thetechnick check out the current 4.9 docs for coverage: https://docs.openshift.com/container-platform/4.9/openshift_images/managing_images/using-image-pull-secrets.html#images-allow-pods-to-reference-images-from-secure-registries_using-image-pull-secrets

dmesser avatar Mar 08 '22 11:03 dmesser

@dmesser Thank you very much! I didn't know this was possible. That's super useful, but it doesn't quite solve our case here.

When we deploy multiple of these private stacks via automation, we have to add and remove multiple secrets from the global pull secret; it would be much easier to manage if we could use pull secrets scoped to a namespace/installation. What we don't like about patching the global pull secret in this case:

  • We are afraid that multiple controllers changing the global pull secret might overwrite each other's changes
  • Changing the global pull secret will modify all nodes of the cluster, which is something we would like to avoid
  • It is possible to have collisions between credentials in the global pull secret
  • There is no status on the global pull secret, and patching it does not take immediate effect, leading to timing errors

thetechnick avatar Mar 08 '22 12:03 thetechnick

@thetechnick Yeah, makes a lot of sense. In this case just attach pull secrets to local service accounts in the namespace.
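Attaching a pull secret to an operator's dedicated ServiceAccount is a one-field change (names and namespace below are hypothetical; the referenced Secret must be a docker-registry Secret in the same namespace):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: operator-1-sa        # the operator's dedicated ServiceAccount
  namespace: my-stack
imagePullSecrets:
  # applied automatically to all pods running under this ServiceAccount
  - name: private-repo-1-pull-secret
```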

dmesser avatar Mar 08 '22 12:03 dmesser

@dmesser Yes, you are right; we had the same idea yesterday. @nb-ohad is looking into that now. It still requires a change to every single bundle throughout the stack, hardcoding them to use a specific secret name.

As OLM manages and abstracts the installation of operators, I would expect OLM to offer something to manage these pull secrets in a standardized fashion, without package authors having to document a hardcoded secret name in their bundles.

Maybe we can just document the hardcoded pull secret reference as a best practice for OLM v1, but it would be really nice if the new OLM v2 APIs provided a better UX.

thetechnick avatar Mar 08 '22 12:03 thetechnick

@thetechnick We discussed in the past what we could do with the current separation of controllers; the options weren't great (distributing the catalog Secret into all watch namespaces of the operator).

dmesser avatar Mar 10 '22 20:03 dmesser

We've looked at this more closely, and there is probably a good middle ground we can land on: propagating the secrets to the operator controller pods, since their definition is something that OLM owns. See https://issues.redhat.com/browse/OLM-2457 for further details.

dmesser avatar Mar 11 '22 15:03 dmesser