cri-resource-manager
[draft/WIP] Install as Gardener extension
Install CRI-RM as a Gardener extension using ControllerDeployment/ControllerRegistration, as a generic Extension of type "cri-resource-manager".
TODO:
I. PoC/research phase
- [x] 1. P ~No operator, just a ManagedResource with a hardcoded shoot namespace, so it won't install automatically on every new shoot~ (replaced by a proper operator)
- [x] 2. M Replace example configmaps with "installation daemonset" similar to how gvisor is installed here
- [x] 3. P Fully fledged operator based on gardener/extension pkg framework (actuator with migrate/hibernate + operation annotation + status management + finalizers),
- [x] 4. P Prepare "extension" image + example controller registration/deployment + automatic generation of those
- [ ] 4a. may require adding support for GardenLinux (a Dockerfile to build cri-rm for GardenLinux)
- [x] 5. M Replace downloading Debian package with "installation" image
- [x] 6. M Make installation more reliable
- [x] distro agnostic - installation should use a static binary + its own systemd unit files (so far, the existing Debian package and binaries assume e.g. that version GCC_2_23 is available)
- [x] check services statuses before restarting kubelet
- [ ] 7. M Uninstallation flow
- [ ] 7a. When shoot is deleted
- [ ] 7b. (low priority) when extension is disabled/removed from podspec
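The "check services statuses before restarting kubelet" step from item 6 could be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `runner` type, the `restartKubeletIfHealthy` helper, and the unit name `cri-resource-manager` are assumptions made for the example. Only `systemctl is-active --quiet` and `systemctl restart` are real commands. The example runs in dry-run mode so it does not touch the host.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// runner abstracts command execution so the logic can be tried without systemd.
// (Hypothetical helper type, not part of the extension.)
type runner func(name string, args ...string) error

// errInactive is returned by the dry-run runner to simulate an inactive unit.
var errInactive = errors.New("unit inactive")

// systemctl executes a real command; a non-zero exit status surfaces as an error.
func systemctl(name string, args ...string) error {
	return exec.Command(name, args...).Run()
}

// restartKubeletIfHealthy restarts kubelet only if every listed unit is active.
func restartKubeletIfHealthy(run runner, units []string) error {
	for _, u := range units {
		// "systemctl is-active --quiet UNIT" exits non-zero when UNIT is not active.
		if err := run("systemctl", "is-active", "--quiet", u); err != nil {
			return fmt.Errorf("unit %s is not active, kubelet left untouched: %w", u, err)
		}
	}
	return run("systemctl", "restart", "kubelet")
}

func main() {
	// Dry run: print the commands and pretend that unit "broken" is inactive.
	dryRun := func(name string, args ...string) error {
		fmt.Println("would run:", name, args)
		if args[0] == "is-active" && args[len(args)-1] == "broken" {
			return errInactive
		}
		return nil
	}
	// The unit name "cri-resource-manager" is an assumption for this sketch.
	fmt.Println(restartKubeletIfHealthy(dryRun, []string{"cri-resource-manager"}))
	fmt.Println(restartKubeletIfHealthy(dryRun, []string{"broken"}))
}
```

On a real node, passing `systemctl` instead of `dryRun` would execute the commands. Injecting the runner keeps the decision logic testable without systemd.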
Bugs:
- [ ] A race condition when restarting kubelet can cause the cluster to become unresponsive
- [ ] If resync is set to a low value, there is a race between the reconciliation and delete handlers: the object is deleted and created at the same time, and deletion of the shoot gets stuck
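One common way to avoid a reconcile/delete race of the kind described above is to route every event through a single decision point that checks whether deletion has already been requested. The sketch below illustrates that idea only. The `extension` struct, `decide`, and `markDeleted` are hypothetical stand-ins, not the operator's actual types, and the thread does not say whether this is how the bug was addressed.

```go
package main

import (
	"fmt"
	"time"
)

// extension is a hypothetical stand-in for the object the operator watches.
type extension struct {
	Name              string
	DeletionTimestamp *time.Time // non-nil once deletion has been requested
}

type action string

const (
	actionReconcile action = "reconcile"
	actionDelete    action = "delete"
)

// decide picks exactly one action per event, so a periodic resync can never
// re-create resources for an object that is already being deleted.
func decide(e extension) action {
	if e.DeletionTimestamp != nil {
		return actionDelete
	}
	return actionReconcile
}

// markDeleted returns a copy of e with the deletion timestamp set to now.
func markDeleted(e extension) extension {
	now := time.Now()
	e.DeletionTimestamp = &now
	return e
}

func main() {
	e := extension{Name: "cri-resource-manager"}
	fmt.Println(e.Name, "->", decide(e))
	fmt.Println(e.Name, "->", decide(markDeleted(e)))
}
```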
II. Quality
- [x] 7. M Health check for operator
- [ ] 8. M Advanced self-healing installation: e.g. use a static pod as a health check and undo the installation if something fails
- [x] 9. P/M Proper project structure: Makefile, cmd/pkg, hacks/examples, licenses (adhere to convention of other gardener extensions)
- [ ] 10. P/M Unit tests
- [ ] 11. M Integration tests - with envtest NewShootFramework cluster
- [ ] 12. P E2E tests + workload e.g. cloud setup on AWS
III. Advanced/Extra features
- [ ] 13. P Image vector support (overriding extension and installation image)
- [ ] 14. M Support for cri-resource-manager advanced features:
- [ ] Annotation webhook - to pass pod spec resource information to CRI-RM
- [ ] Node agent - to support dynamic policies
Check the README.md for further details (how to actually run this within a local kind cluster).
Codecov Report
Merging #760 (8606110) into master (0cb0ad5) will increase coverage by 0.00%. The diff coverage is n/a.
:exclamation: Current head 8606110 differs from pull request most recent head 237bb70. Consider uploading reports for the commit 237bb70 to get more accurate results
Coverage Diff | master | #760 | +/- |
---|---|---|---|
Coverage | 37.51% | 37.52% | |
Files | 55 | 55 | |
Lines | 8108 | 8105 | -3 |
Hits | 3042 | 3041 | -1 |
Misses | 4771 | 4770 | -1 |
Partials | 295 | 294 | -1 |
Impacted Files | Coverage Δ | |
---|---|---|
...rce-manager/policy/builtin/topology-aware/pools.go | 62.05% <0.00%> (+0.15%) | :arrow_up: |
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
The Gardener extension has been moved to an external repository (with proper project structure, testing, and so on): https://github.com/intel/gardener-extension-cri-resmgr