goldilocks
goldilocks copied to clipboard
VPA TargetRef created can be considered invalid by K8s Autoscaler Recommender
What happened?
Goldilocks controller walks a chain of owner references and uses the TopController as the Target Reference in the VPA it creates.
The logic in k8s vertical-pod-autoscaler recommender requires the target ref to pass a "isWellKnownOrScalable" test.
There are circumstances where the VPAs created by the Goldilocks controller do not pass that test. One example is via PrometheusOperator, an Alertmanager resource creating a Statefulset and then Pods. The target reference specified by Goldilocks is monitoring.coreos.com/v1
Alertmanager
prom-operator-alertmanager
. This passes neither the Wellknown nor the Scalable tests and so receives no recommendations and shows as empty in the Goldilocks UI.
If i manually adjust the target reference configuration to be the Alertmanager child stateful set, the VPA receives recommendations (target reference is well known).
There should be additional logic in determining the TopController that considers the acceptance criteria specified by the K8s VPA Recommender.
Perhaps a check for Scalable can be incorporated and a default fall back to "well known" implemented ?
Otherwise a flag to prefer well known would go a long way to covering usage patterns that are currently not handled in a way that works with the Recommender.
What did you expect to happen?
I expected a conformant VPA to be created
How can we reproduce this?
Use CRDs which are not classified as "scalable" by the VPA Recommender - for example Prometheus Operator Alertmanager configuration.
Version
master
Search
- [X] I did search for other open and closed issues before opening this.
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
Additional context
No response
Thank you for the detailed writeup! This actually solves some issues that have been rattling around in my head.
Currently, we're using https://github.com/fairwindsops/controller-utils to identify the top level controller, so I don't know if we should add this functionality there, or here.
Thank you for the detailed writeup! This actually solves some issues that have been rattling around in my head.
Currently, we're using https://github.com/fairwindsops/controller-utils to identify the top level controller, so I don't know if we should add this functionality there, or here.
I think that GetAllTopControllers returns a list of configs that do not include the chain - and so I think either the behaviour of GetAllTopControllers would need to change or an alternative method like GetAllVpaCompatibleTopControllers could be implemented. I am happy to prototype a fix in controller-utils if there is a chance it will be accepted
Pinging @ivanfetch @rbren for comment on controller-utils.
We discussed a bit internally and I think it makes sense to add it there, but we want to do it in a way that's a bit more flexible, perhaps accepting a function to modify the behavior.
Yeah exactly - I think adding some options to controller-utils is the way to go
Is there something planned about this change? It is affecting Goldilocks created VPAs as per https://github.com/FairwindsOps/goldilocks/issues/621