Refactor Gardener components to native controller-runtime components
How to categorize this issue?
/area open-source dev-productivity scalability /kind cleanup technical-debt
What would you like to be added:
-
gardener-scheduler,gardener-controller-managerandgardenletshould be refactored such that they become native controller-runtime components. -
gardener-resource-manager,gardener-admission-controller,gardener-seed-admission-controllerare already native controller-runtime components. -
gardener-apiserveris special anyways because it's neither a controller nor a webhook server.
Why is this needed:
- Less code
- Less duplication/hand-crafted logic
- Easier maintenance
- Metrics out-of-the-box
- Streamlined logging
Steps
- all controllers should implement the
reconcile.Reconcilerinterface- [x] #4503
- [x] https://github.com/gardener/gardener/pull/4891
- remove current dummy controller metrics
- [x] ... in gardener-scheduler: https://github.com/gardener/gardener/pull/4320
- [x] ... in gardener-controller-manager, gardenlet: https://github.com/gardener/gardener/pull/4913
- [x] check whether we can remove active worker counting in our controllers (dummy graceful termination)
- streamline logging – replace
logruswithlogr- [x] https://github.com/gardener/gardener/pull/4925
- [x] https://github.com/gardener/gardener/pull/5175
- [x] add
logchecktool to ensure even number of args forInfo,ErrorandWithValuescalls, etc.: https://github.com/gardener/gardener/pull/5442 - scheduler
- [x] https://github.com/gardener/gardener/pull/4320
- controller-manager
- [x] inject
logrlogger into reconcile context to make it retrievable bylogf.FromContext(ctx): https://github.com/gardener/gardener/pull/5057 - [x] https://github.com/gardener/gardener/pull/5268
- [x] https://github.com/gardener/gardener/pull/5290
- [x] https://github.com/gardener/gardener/pull/5488
- [x] https://github.com/gardener/gardener/pull/6253
- [x] inject
- gardenlet
- [x] https://github.com/gardener/gardener/pull/6259
- [x] https://github.com/gardener/gardener/pull/6260
- [x] https://github.com/gardener/gardener/pull/6264
- [x] https://github.com/gardener/gardener/pull/6265
- [x] https://github.com/gardener/gardener/pull/6266
- [x] https://github.com/gardener/gardener/pull/6267
- [x] https://github.com/gardener/gardener/pull/6278
- [x] https://github.com/gardener/gardener/pull/6282
-
./test/...packages- [x] https://github.com/gardener/gardener/pull/6316
- [ ] https://github.com/gardener/gardener/issues/5191
- [x] https://github.com/gardener/gardener/pull/6332
- [x] https://github.com/gardener/gardener/pull/6380
- introduce c-r Manager, including leader election, client/cache creation, enhance healthz/readyz endpoints to include checks for informer/watch healthiness
- controller-manager
- [x] https://github.com/gardener/gardener/pull/6333
- scheduler
- [x] https://github.com/gardener/gardener/pull/4320
- gardenlet
- [x] https://github.com/gardener/gardener/pull/6688
- controller-manager
- add native c-r controllers to manager including predicates, event handlers, etc. as needed (can be done step by step):
⚠️ use
predicate.GenerationChangedPredicateinstead of comparingmetadata.generationandstatus.observedGenerationand stop returningnilafter adding finalizers in order to prevent duplicated reconciliations (basically implement option b) from https://github.com/gardener/gardener/issues/4730 where it is not done yet)- controller-manager
- [x] @rfranzke: https://github.com/gardener/gardener/pull/6358
- [x] @ary1992: https://github.com/gardener/gardener/pull/6615
- [x] @timebertt: https://github.com/gardener/gardener/pull/6333
- [x] @rfranzke: https://github.com/gardener/gardener/pull/6360
- [x] @ary1992: https://github.com/gardener/gardener/pull/6462
- [x] @rfranzke: https://github.com/gardener/gardener/pull/6629
- [x] @ary1992: https://github.com/gardener/gardener/pull/6550
- [x] @ary1992: https://github.com/gardener/gardener/pull/6423
- [ ] @ary1992: Switch
ManagedSeedSetcontroller tocontroller-runtime - [x] @acumino: https://github.com/gardener/gardener/issues/6348
- [x] @timebertt: https://github.com/gardener/gardener/pull/6645
- [x] @rfranzke: https://github.com/gardener/gardener/pull/6667
- [x] @ary1992: https://github.com/gardener/gardener/pull/6391
- [x] @rfranzke: https://github.com/gardener/gardener/pull/6621
- [x] @rfranzke: https://github.com/gardener/gardener/pull/6636
- [x] @rfranzke: https://github.com/gardener/gardener/pull/6620
- scheduler
- [x] https://github.com/gardener/gardener/pull/4320
- gardenlet (introduce filtered cache with
cache.Options.SelectorsByObjectfor watched objects)- [ ] @ary1992: Switch
BackupBucketcontroller tocontroller-runtime - [ ] @ary1992: Switch
BackupEntrycontroller tocontroller-runtime - [ ] @ary1992: Switch
Bastioncontroller tocontroller-runtime - [ ] @rfranzke: https://github.com/gardener/gardener/pull/6733
- [ ] @rfranzke: Switch
Extensionscontroller tocontroller-runtime- replace custom retry logic on seed bootstrap for waiting until we get informers for our CRs
- [ ] @rfranzke: Switch
ManagedSeedcontroller tocontroller-runtime - [ ] @ary1992: Switch
NetworkPolicycontroller tocontroller-runtime - [ ] @rfranzke: Switch
Seedcontroller tocontroller-runtime - [ ] @rfranzke: Switch
Shootcontroller tocontroller-runtime - [ ] @ary1992: Switch
ShootSecretcontroller tocontroller-runtime
- [ ] @ary1992: Switch
- controller-manager
- expose controller-runtime metrics (related to #2815)
- [x] https://github.com/gardener/gardener-extension-provider-aws/pull/428 (gives a small preview at what metrics we can get from that)
- [x] ... in gardener-scheduler: https://github.com/gardener/gardener/pull/4320
- [x] ... in gardener-controller-manager: https://github.com/gardener/gardener/pull/6333
- [x] ... in gardenlet: https://github.com/gardener/gardener/pull/6688
- [ ] adapt seed-prometheus scrape config to allow scraping with
httpscheme: https://github.com/gardener/gardener/pull/4706#discussion_r713648371 - [ ] add grafana dashboards to seed
- [ ] https://github.com/gardener/gardener/issues/2829
- [x] https://github.com/gardener/gardener/issues/4605
- [ ] add timeout for initial cache syncs in controller/component constructors (similar to https://github.com/gardener/gardener-resource-manager/pull/102) or make sure that c-r does this out of the box
All controllers should implement the
reconcile.Reconcilerinterface (already done for GSCHED, GCM)
I'm currently working on this for the remaining gardenlet controllers as part of https://github.com/gardener/gardener/issues/2822
All controllers should implement the
reconcile.Reconcilerinterface (already done for GSCHED, GCM)I'm currently working on this for the remaining gardenlet controllers as part of #2822
It's mostly done with https://github.com/gardener/gardener/pull/4503.
Though, the shoot controller should be further refactored into individual reconcile.Reconcilers instead of using reconcile.Funcs.
/assign @timebertt /roadmap internal /milestone 2021-Q4
/in-progress
refactor remaining controllers to
reconcile.Reconcilerinterface (extensions controller: controllerinstallation/shootstate)
I tried to accomplish this, but I faced a few difficulties with the current implementation/architecture of the package:
- the controller actually contains 2 controllers: a controllerinstallation controller and a shootstate controller, but both controllers are initialized and built from the shared
artifactsmap - the controllerinstallation reconcilers share a common state (
kindToRequiredTypes) which is not so easy to do if they are split into dedicated structs
Given, that this controller package anyways needs significant restructuring work (actually it almost needs to be rebuilt from scratch) in order to make it work with native c-r constructs, I decided to not refactor this right now to reconcile.Reconciler. This provides only little to no value right now over reconcile.Func, as it cannot be simply added as a native controller to a c-r manager, so it needs to be touched again anyways.
Once we are able to leverage all the c-r goodies to get rid of all our custom coding, we can rebuild it in a clean way.
So this will be done in a dedicated PR as part of step
add native c-r controllers to manager including predicates, event handlers, etc. as needed (can be done step by step)
check whether we can remove active worker counting in our controllers (dummy graceful termination)
I look into this and played around with it a little bit. The result:
- current implementation doesn't prevent the process from exiting, if items are left in the queue or workers have not finished yet
- too much effort to fix current implementation
- could remove it now in favor of graceful termination feature, that we will get with native c-r controllers
- but it also doesn't hurt, so let's keep it for now and simply remove it together in individual refactorings of controllers
Hence, I marked the step as completed for now. cc @rfranzke
/assign @rfranzke @timebertt
/assign @ary1992 for some of the controller refactorings for controller-manager, see above
The Gardener project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
- After 90d of inactivity,
lifecycle/staleis applied - After 30d of inactivity since
lifecycle/stalewas applied,lifecycle/rottenis applied - After 30d of inactivity since
lifecycle/rottenwas applied, the issue is closed
You can:
- Mark this issue or PR as fresh with
/remove-lifecycle stale - Mark this issue or PR as rotten with
/lifecycle rotten - Close this issue or PR with
/close
/lifecycle stale
/remove-lifecycle stale
Now that #7147 is merged, we can close this issue, correct? :)
Yes 🚀 /close
@shafeeqes: Closing this issue.
In response to this:
Yes 🚀 /close
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.