kpt-config-sync icon indicating copy to clipboard operation
kpt-config-sync copied to clipboard

Do not reconcile unmanaged ResourceGroups

Open fsommar opened this issue 1 year ago • 1 comments

Hi 👋

We use Kpt and ResourceGroup inventories in addition to Config Sync. They serve different use cases for our clusters. We don't have a need for status to be populated in ResourceGroups and are therefore disabling it using overrides on each of our *Sync resources. ResourceGroups that are created outside of Config Sync, however, are reconciled by default, so we need to explicitly annotate each of them with configsync.gke.io/status: disabled. This includes inventories created by Kpt elsewhere, which ends up with a situation where Kpt inventories are reconciled by Config Sync, and that cannot be easily remedied since Kpt doesn't propagate locally defined annotations to the in-cluster inventory: https://github.com/kptdev/kpt/pull/4165.

Would it be reasonable to limit the scope of the ResourceGroup controller to only look at resources labeled with app.kubernetes.io/managed-by: configmanagement.gke.io? Alternatively, to provide a way to opt out of the ResourceGroup reconciler altogether?

fsommar avatar Jun 27 '24 10:06 fsommar

Would it be reasonable to limit the scope of the ResourceGroup controller to only look at resources labeled with app.kubernetes.io/managed-by: configmanagement.gke.io?

Yes I think this would be reasonable

Alternatively, to provide a way to opt out of the ResourceGroup reconciler altogether?

The scope of this change would be quite a bit larger, so I think this would be harder to approve. I expect we would want more user feedback before exploring this approach.

sdowell avatar Jun 28 '24 17:06 sdowell

@fsommar @sdowell What's the status here? Is this already implemented, or is it an up-for-grabs feature to take a stab at? (I've started thinking a little about tech stuff as I mentally ramp up for returning to work next week after almost three months of leave...)

tomasaschan avatar Sep 25 '24 15:09 tomasaschan

Was the actual problem here that the statuses were being updated or just that it was removing your custom annotations?

Seems like we could have preserved your annotations with server-side apply, without disabling the status updates. Just because you're not using the statuses doesn't mean nobody is...

karlkfi avatar Sep 27 '24 20:09 karlkfi

A bit of both, I'd say; we don't need the status updates and want to avoid the additional load on the API server from them, and we were also missing annotations we needed to retain.

I would also argue that the expected (or least surprising) behavior for using ResourceGroups alongside but independently of Config Sync is that the latter ignores the former, since ResourceGroup is a resource type not strictly coupled to Config Sync. In other words, why should I get different behavior on them when they're created by something else (like kpt apply) if I also happen to use Config Sync, compared to if I don't?

If there's a use case for having status updates on ResourceGroups not owned by Config Sync we could roll forward to also allow some other label or annotation to opt them in again.

tomasaschan avatar Sep 28 '24 21:09 tomasaschan

FWIW, I don't think this is a big deal, but it is technically a breaking change, which we try to avoid. That said, it's only breaking for people who use both kpt and Config Sync, and only if they actually rely on the ReaourceGroup status being updated continuously.

The resource-group-controller (RGC) was originally designed to be separate from Config Sync, so it could also be used with kpt to aggregate the status of managed objects. This is used by "nomos status", but that only works if there's also a RootSync/RepoSync.

AFAIK, there aren't actually any tools or users that I know of that take advantage of this other than Config Sync. So it's hard to say if the change will really impact anyone at all, other than Spotify.

The primary benefit is that you don't have to do a ton of GETs and LISTs to determine current package status. kpt doesn't currently take advantage of this. Hypothetically it could, tho.

karlkfi avatar Sep 29 '24 05:09 karlkfi