
Client `resourceMeta` cache prevents CRD scope changes in dynamic environments


Problem

We encountered an issue where controller-runtime clients fail to handle CRD scope changes when the same GroupVersionKind is reused. This affects dynamic environments where CRDs are managed programmatically.

Example Scenario:

While using Crossplane XRDs, we experienced this flow (a minimal reproduction sketch follows the list):

  1. Create XRD with scope: Cluster
  2. Use controller-runtime client to manage resources (works fine)
  3. Delete the XRD
  4. Recreate the same XRD with scope: Namespaced
  5. Unstructured client Update() operations fail because the client still makes cluster-scoped API calls
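Here is a minimal sketch of that flow, assuming a hypothetical `example.org/v1alpha1` `Widget` kind backed by the XRD; names are illustrative and error handling is elided:

```go
package main

import (
	"context"

	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/client/config"
)

func main() {
	ctx := context.Background()
	cfg, _ := config.GetConfig() // error handling elided
	c, _ := client.New(cfg, client.Options{})

	obj := &unstructured.Unstructured{}
	obj.SetGroupVersionKind(schema.GroupVersionKind{
		Group: "example.org", Version: "v1alpha1", Kind: "Widget",
	})
	obj.SetName("demo")

	// Step 2: the first use records the GVK as cluster-scoped in the
	// client's resourceMeta cache.
	_ = c.Get(ctx, client.ObjectKeyFromObject(obj), obj)

	// Steps 3-4 happen out of band: the XRD is deleted and recreated
	// with scope: Namespaced.

	// Step 5: the cached entry still says cluster-scoped, so this Update
	// is sent to a cluster-scoped URL and the API server rejects it.
	obj.SetNamespace("default")
	_ = c.Update(ctx, obj)
}
```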

Root Cause

The issue lies in pkg/client/client_rest_resources.go: resourceMeta is cached indefinitely per GVK, and the DynamicRESTMapper only reloads discovery for unknown GVKs, so a scope change for an already-seen GVK is never picked up.
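A simplified sketch of the caching pattern described above; this is not the actual controller-runtime source, whose details differ, but the shape is the same:

```go
package main

import (
	"sync"

	"k8s.io/apimachinery/pkg/api/meta"
	"k8s.io/apimachinery/pkg/runtime/schema"
)

// resourceMeta records, among other things, the REST mapping (and thus the
// scope) observed at first use.
type resourceMeta struct {
	mapping *meta.RESTMapping
}

type restResources struct {
	mu     sync.RWMutex
	byGVK  map[schema.GroupVersionKind]*resourceMeta
	mapper meta.RESTMapper // a DynamicRESTMapper in practice
}

func (r *restResources) getResource(gvk schema.GroupVersionKind) (*resourceMeta, error) {
	r.mu.RLock()
	if m, ok := r.byGVK[gvk]; ok {
		r.mu.RUnlock()
		return m, nil // hit: the scope captured earlier is reused forever
	}
	r.mu.RUnlock()

	// Miss: the mapper is consulted (and may reload discovery for an
	// unknown GVK), then the result is cached with no invalidation path.
	mapping, err := r.mapper.RESTMapping(gvk.GroupKind(), gvk.Version)
	if err != nil {
		return nil, err
	}
	m := &resourceMeta{mapping: mapping}
	r.mu.Lock()
	r.byGVK[gvk] = m
	r.mu.Unlock()
	return m, nil
}
```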

If my understanding is incorrect, or if there’s a better way to handle this, I’d really appreciate your guidance. 🙏

Current Workaround

Restarting the controller or creating a new client instance clears the stale cache, but both are impractical for long-running controllers.
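For illustration, a sketch of the client-recreation workaround; by default client.New builds its own DynamicRESTMapper, so a fresh client rediscovers the GVK's scope on first use:

```go
package main

import (
	"k8s.io/client-go/rest"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// newFreshClient builds a client with an empty resourceMeta cache and, by
// default, its own REST mapper. Calling this after the XRD has been
// recreated picks up the new scope, at the cost of rebuilding the client.
func newFreshClient(cfg *rest.Config) (client.Client, error) {
	return client.New(cfg, client.Options{})
}
```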


I understand this is an edge case that goes against Kubernetes' immutable-infrastructure principles: CRD scope changes should ideally go through proper versioning (v1alpha1 → v1alpha2).

However, dynamic environments and development workflows could benefit from either:

  • A technical solution for better cache consistency
  • Clear documentation of this limitation with best practices

I'm happy to contribute a PR if maintainers agree on the approach, or help improve documentation if that's the preferred solution.

u-kai avatar Aug 30 '25 07:08 u-kai

I do think this is quite an edge case. If we can find a way to make it work that is not too invasive/complicated, that would be fine though.

What happens to watches if the scoping changes?

alvaroaleman avatar Aug 30 '25 18:08 alvaroaleman

Thanks for raising the question about watches!

Cluster → Namespaced: all-namespaces watches may continue working since the collection URL shape is the same, but namespace-specific CRUD or watch calls would fail because they now require the /namespaces/ segment.

Namespaced → Cluster: any namespace-scoped watch would fail, since cluster-scoped resources don’t support namespaced paths.
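To make the URL shapes concrete (group, version, resource, and object names here are hypothetical):

```
cluster-scoped object:     GET /apis/example.org/v1alpha1/widgets/demo
all-namespaces list/watch: GET /apis/example.org/v1alpha1/widgets        (same shape for both scopes)
namespaced object:         GET /apis/example.org/v1alpha1/namespaces/default/widgets/demo
```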

Informers also inherit the wrong scope once created, but unlike clients they can be torn down and recreated (e.g. via RemoveInformer()), which makes them somewhat easier to control.
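As a sketch of that recreation path, assuming controller-runtime >= v0.18 (where cache.Cache exposes RemoveInformer) and a hypothetical GVK:

```go
package main

import (
	"context"

	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"sigs.k8s.io/controller-runtime/pkg/cache"
)

// recreateInformer drops the informer that was created under the old scope
// and lets the next GetInformer call rebuild it with a fresh REST mapping.
func recreateInformer(ctx context.Context, c cache.Cache, gvk schema.GroupVersionKind) (cache.Informer, error) {
	obj := &unstructured.Unstructured{}
	obj.SetGroupVersionKind(gvk)

	if err := c.RemoveInformer(ctx, obj); err != nil {
		return nil, err
	}
	return c.GetInformer(ctx, obj)
}
```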

u-kai avatar Aug 31 '25 02:08 u-kai

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Nov 29 '25 02:11 k8s-triage-robot

/remove-lifecycle stale

negz avatar Dec 15 '25 22:12 negz