
Proposal: Support using centrally-managed ImagePullSecrets from knative-serving namespace to resolve image tag digest

Open akumar-47 opened this issue 2 months ago • 6 comments

/area API /kind feature

Describe the feature

Currently, Knative controllers expect imagePullSecrets to be present in the user namespace where a Knative Service/Revision is deployed. This is required for image tag → digest resolution.

In multi-tenant environments, this creates challenges:

  • Duplication: Org-managed registry credentials must be copied into every namespace.
  • Exposure: All namespace tenants can read the secrets.
  • Operational overhead: Rotating/updating secrets requires syncing across multiple namespaces.

Proposal

Introduce a mechanism to allow Knative controllers to use a centrally-managed imagePullSecret from the knative-serving namespace (or a configurable global namespace), rather than requiring it in each workload namespace.

Example (per-service opt-in via annotation):

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-service
  namespace: user-namespace
  annotations:
    serving.knative.dev/use-global-pull-secret: "true"
spec:
  template:
    spec:
      containers:
        - image: private.registry.io/app:latest

Expected behavior

If the annotation/label is present, the Knative controller reads the global imagePullSecret attached to the controller's service account in the knative-serving namespace and uses it for image tag → digest resolution.

If not set, existing behavior continues (secret must be present in workload namespace).
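
The opt-in rule above can be sketched in isolation as follows. This is a minimal sketch, not Knative code: the `credentialSource` type and the `selectCredentialSource` function are hypothetical names introduced for illustration, and the `controller` service account name is an assumption about the installation.

```go
package main

import "fmt"

// Annotation key taken from the example Service manifest.
const globalPullSecretAnnotation = "serving.knative.dev/use-global-pull-secret"

// credentialSource is a hypothetical type naming where pull credentials
// should be looked up for tag-to-digest resolution.
type credentialSource struct {
	Namespace      string
	ServiceAccount string
}

// selectCredentialSource (hypothetical helper) returns the controller's
// namespace and service account when the workload opts in, and the
// workload's own namespace and service account otherwise.
func selectCredentialSource(annotations map[string]string, workloadNS, workloadSA, systemNS, controllerSA string) credentialSource {
	if annotations[globalPullSecretAnnotation] == "true" {
		return credentialSource{Namespace: systemNS, ServiceAccount: controllerSA}
	}
	return credentialSource{Namespace: workloadNS, ServiceAccount: workloadSA}
}

func main() {
	optedIn := map[string]string{globalPullSecretAnnotation: "true"}

	// "controller" is an assumed service account name.
	fmt.Println(selectCredentialSource(optedIn, "user-namespace", "default", "knative-serving", "controller"))
	fmt.Println(selectCredentialSource(nil, "user-namespace", "default", "knative-serving", "controller"))
}
```

Only the exact value "true" opts in, so a missing or misspelled annotation keeps the existing per-namespace behavior.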

NOTE: At the host level, containerd is already configured with a secret for private registry authentication, which allows containerd itself to successfully pull images.

Benefits

  • Simplifies secret management in multi-tenant clusters.
  • Reduces duplication of secrets across namespaces.
  • Improves security by avoiding exposing secrets directly to namespace tenants.

akumar-47 · Oct 01 '25 09:10

something like this

func (c *Reconciler) reconcileDigest(ctx context.Context, rev *v1.Revision) (bool, error) {
	totalNumOfContainers := len(rev.Spec.Containers) + len(rev.Spec.InitContainers)

	// The image digest has already been resolved.
	// No need to check for init containers feature flag here because rev.Spec has been validated already
	if len(rev.Status.ContainerStatuses)+len(rev.Status.InitContainerStatuses) == totalNumOfContainers {
		c.resolver.Clear(types.NamespacedName{Namespace: rev.Namespace, Name: rev.Name})
		return true, nil
	}

	cfgs := config.FromContext(ctx)
	logger := logging.FromContext(ctx)

	// Determine the resolution policy from the opt-in annotation.
	useGlobalSecret := rev.Annotations["serving.knative.dev/use-global-pull-secret"] == "true"
	var namespace, serviceAccount string
	var imagePullSecrets []string

	if useGlobalSecret {
		// Opt-in: use the controller SA and its secrets from knative-serving.
		controllerNS := system.Namespace()
		// The SA name is hardcoded for now to match my environment.
		controllerSA := "controller"
		controllerSAObj, err := c.kubeclient.CoreV1().ServiceAccounts(controllerNS).Get(ctx, controllerSA, metav1.GetOptions{})
		if err != nil {
			return true, err
		}

		for _, s := range controllerSAObj.ImagePullSecrets {
			imagePullSecrets = append(imagePullSecrets, s.Name)
		}
		namespace = controllerNS
		serviceAccount = controllerSA
	} else {
		// Default: use the SA and secrets from the revision's own namespace.
		namespace = rev.Namespace
		serviceAccount = rev.Spec.ServiceAccountName
		for _, s := range rev.Spec.ImagePullSecrets {
			imagePullSecrets = append(imagePullSecrets, s.Name)
		}
	}
	opt := k8schain.Options{
		Namespace:          namespace,
		ServiceAccountName: serviceAccount,
		ImagePullSecrets:   imagePullSecrets,
	}

	initContainerStatuses, statuses, err := c.resolver.Resolve(
		logger,
		rev,
		opt,
		cfgs.Deployment.RegistriesSkippingTagResolving,
		cfgs.Deployment.DigestResolutionTimeout,
	)
	if err != nil {
		c.resolver.Clear(types.NamespacedName{Namespace: rev.Namespace, Name: rev.Name})
		rev.Status.MarkContainerHealthyFalse(v1.ReasonContainerMissing, err.Error())
		return true, err
	}

	if len(statuses) > 0 || len(initContainerStatuses) > 0 {
		rev.Status.ContainerStatuses = statuses
		rev.Status.InitContainerStatuses = initContainerStatuses
		return true, nil
	}

	// No digest yet, wait for re-enqueue when resolution is done.
	return false, nil
}

akumar-47 · Oct 01 '25 09:10

@ak-hpe this isn't exclusive to Knative but is a Kubernetes problem. I know there exist tools to help import secrets into other namespaces and have them attach to the service account etc. You might want to take a look at https://github.com/carvel-dev/secretgen-controller for example.

I'm wary of your code snippet above, since it gives workloads access to images that they might not otherwise have access to. And you'll still need to attach a secret to the workload SA for it to run.

dprotaso · Oct 01 '25 20:10

Ah I see you mentioned

NOTE: At the host level, containerd is already configured with a secret for private registry authentication, which allows containerd itself to successfully pull images.

So your environment has 'high trust'

dprotaso · Oct 01 '25 20:10

So your environment has 'high trust'

@dprotaso yes, my environment is closer to a “high trust” multi-tenancy model:

  • Cluster is managed by a central platform admin.
  • Image registry credentials are org-wide and already configured at the containerd host level for pulling.
  • Namespace tenants should not need direct access to the secret; they just need workloads to run.

The proposal is about an optional mechanism (annotation/label/flag) so that operators can choose between:

  • Low trust mode (default): secrets per namespace (current behavior).
  • High trust mode (opt-in):
    • A global secret is created and managed by cluster operators under the Knative controller namespace (e.g. knative-serving).
    • Knative controllers can reference this secret for image tag → digest resolution.
    • End-users never get direct access to the secret itself; they only benefit indirectly because their workloads can resolve image tags without requiring the secret to be defined in the user namespace.

akumar-47 · Oct 02 '25 12:10

Hey @ak-hpe, thanks for the additional information. Circling back to this: I think it would be useful, but I would probably change the implementation so that it is not controlled by the user who's declaring the workload.

I think it would be more secure if the operator of the knative installation could designate which namespaces/workloads have the ability to access this. I pinged the k8s folks to see if there was any prior art to this https://kubernetes.slack.com/archives/C0EN96KUY/p1759696089456739

dprotaso · Oct 05 '25 20:10

@dprotaso Any updates from the K8s community? I don’t have access to the discussion thread.

I was thinking the proposed flow could be:

  • Add a ConfigMap or setting in knative-serving (e.g. config-deployment) that lists namespaces allowed to use the global secret.
  • When the controller detects that a Revision belongs to one of the allowed namespaces, it uses the controller ServiceAccount and secrets from the knative-serving namespace for image tag → digest resolution.
  • If the namespace isn’t listed, the controller falls back to the existing behavior, resolving images using secrets defined in the workload’s own namespace.
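
As a rough illustration of the flow above, here is a minimal sketch of the allowlist check. It assumes the allowed namespaces are stored as a comma-separated string under a hypothetical ConfigMap key, `global-pull-secret-namespaces`; that key does not exist in config-deployment today, and `parseAllowedNamespaces` and `resolutionNamespace` are illustrative names, not Knative APIs.

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical ConfigMap key; not an existing config-deployment setting.
const allowedNamespacesKey = "global-pull-secret-namespaces"

// parseAllowedNamespaces reads a comma-separated namespace list from
// ConfigMap data and returns it as a set. A missing key yields an empty set.
func parseAllowedNamespaces(data map[string]string) map[string]struct{} {
	allowed := make(map[string]struct{})
	for _, ns := range strings.Split(data[allowedNamespacesKey], ",") {
		if ns = strings.TrimSpace(ns); ns != "" {
			allowed[ns] = struct{}{}
		}
	}
	return allowed
}

// resolutionNamespace returns the namespace whose service account and
// secrets should be used for tag-to-digest resolution: the system
// namespace for allowlisted namespaces, the revision's own otherwise.
func resolutionNamespace(revisionNS, systemNS string, allowed map[string]struct{}) string {
	if _, ok := allowed[revisionNS]; ok {
		return systemNS
	}
	return revisionNS
}

func main() {
	data := map[string]string{allowedNamespacesKey: "team-a, team-b"}
	allowed := parseAllowedNamespaces(data)

	for _, ns := range []string{"team-a", "team-c"} {
		fmt.Printf("%s -> %s\n", ns, resolutionNamespace(ns, "knative-serving", allowed))
	}
	// Prints:
	// team-a -> knative-serving
	// team-c -> team-c
}
```

Keeping the list in operator-owned configuration means a tenant cannot opt in by editing their own Service, which is the main difference from the annotation-based variant.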

Let me know if this approach sounds good — I can start working on a PR.

akumar-47 · Oct 13 '25 11:10