VPA Initial mode does not set memory request according to recommended values
Which component are you using?: vertical-pod-autoscaler
What version of the component are you using?: 0.10.0
What k8s version are you using (`kubectl version`)?:
```console
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.1", GitCommit:"3ddd0f45aa91e2f30c70734b175631bec5b5825a", GitTreeState:"clean", BuildDate:"2022-05-24T12:26:19Z", GoVersion:"go1.18.2", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.4
Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.5", GitCommit:"8211ae4d6757c3fedc53cd740d163ef65287276a", GitTreeState:"clean", BuildDate:"2022-03-31T20:28:03Z", GoVersion:"go1.17.8", Compiler:"gc", Platform:"linux/amd64"}
```
What environment is this in?: AKS cluster on Azure
What did you expect to happen?:
I have a deployment with the following resources set in the container template (there is only one container):
```yaml
resources:
  limits:
    memory: 4Gi
  requests:
    cpu: 60m
    memory: 600Mi
```
In the same namespace, there is a LimitRange with the following configuration:
```yaml
spec:
  limits:
  - default:
      cpu: "2"
      memory: 2Gi
    defaultRequest:
      cpu: 60m
      memory: 32Mi
    max:
      cpu: "2"
      memory: 4Gi
    type: Container
```
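For context, the complete object would look roughly like this sketch (the metadata is hypothetical, reconstructed around the spec above):
```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: container-limits   # hypothetical name
spec:
  limits:
  - default:
      cpu: "2"
      memory: 2Gi
    defaultRequest:
      cpu: 60m
      memory: 32Mi
    max:
      cpu: "2"
      memory: 4Gi
    type: Container
```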
I defined a VPA targeting that deployment, in Initial update mode, with the following policy:
```yaml
resourcePolicy:
  containerPolicies:
  - containerName: '*'
    controlledResources:
    - cpu
    - memory
    controlledValues: RequestsOnly
```
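Put together, a complete VPA object for this setup would look roughly like the sketch below (the object and Deployment names are hypothetical; `updateMode: Initial` reflects the mode described above):
```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: testpod-vpa            # hypothetical name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: testpod              # hypothetical Deployment name
  updatePolicy:
    updateMode: Initial        # recommendations applied only at pod creation
  resourcePolicy:
    containerPolicies:
    - containerName: '*'
      controlledResources:
      - cpu
      - memory
      controlledValues: RequestsOnly
```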
Both the recommender and admission-controller pods are running and reporting no errors. The VPA recommends the following:
```yaml
recommendation:
  containerRecommendations:
  - containerName: testpod
    lowerBound:
      cpu: 15m
      memory: "2538099186"
    target:
      cpu: 23m
      memory: "4743403291"
    uncappedTarget:
      cpu: 23m
      memory: "4743403291"
    upperBound:
      cpu: 116m
      memory: "6248599546"
```
The first issue here is that the memory target is beyond the 4Gi maximum defined in the LimitRange: 4Gi is 4 × 1024³ = 4294967296 bytes, while the target is 4743403291 bytes (about 4.42Gi), so it should be capped at 4294967296.
Then, when I delete the pod and a new one comes up, its resources are set like this:
```yaml
resources:
  limits:
    cpu: "2"
    memory: 4Gi
  requests:
    cpu: 23m
    memory: 600Mi
```
The CPU request is set correctly according to the VPA target recommendation; however, the memory request seems to be capped to the request defined in the Deployment resource. It should be set to the LimitRange maximum, which is 4Gi, not 600Mi.
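To make the expectation concrete, the new pod should have come up with something like this (memory request capped at the LimitRange max of 4Gi, since the raw target of 4743403291 bytes exceeds it):
```yaml
resources:
  limits:
    cpu: "2"
    memory: 4Gi
  requests:
    cpu: 23m
    memory: 4Gi   # LimitRange max; the uncapped target (4743403291 bytes) is larger
```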
What happened instead?:
The memory request set on the pod does not match the VPA recommendation (which itself does not take the LimitRange into account).
Anything else we need to know?:
Recommender config:
```yaml
- args:
  - --pod-recommendation-min-cpu-millicores=15
  - --pod-recommendation-min-memory-mb=30
  - --v=4
  image: k8s.gcr.io/autoscaling/vpa-recommender:0.10.0
```
Also, the following annotations are applied on the pod:
```yaml
kubernetes.io/limit-ranger: 'LimitRanger plugin set: cpu limit for container testpod'
vpaObservedContainers: testpod
vpaUpdates: 'Pod resources updated by testpod: container 0: memory capped to fit Max in container LimitRange, cpu request, memory request'
```
> First issue here is that the memory target is beyond the 4Gi maximum defined in the LimitRange; it should be 4294967296.
VPA contains two places where capping happens:

- The Recommender caps according to `minAllowed` and `maxAllowed`. If this happens, you will see a different value in `uncappedTarget`. If `maxAllowed` is not defined in the VPA object, the Recommender will not cap at this point in time (see the sketch after this list).
- The admission-controller is responsible for capping to the existing limit in case `controlledValues: RequestsOnly` is set, or to an existing `LimitRange` in the namespace.

Therefore, you won't see the capping happening in the VPA status in your scenario.
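To illustrate the first point: if `maxAllowed` were set in the resource policy, the Recommender itself would cap the target, and the cap would be visible in the VPA status as a `target` that differs from `uncappedTarget`. A minimal sketch, assuming the 4Gi value is mirrored from the LimitRange max:
```yaml
resourcePolicy:
  containerPolicies:
  - containerName: '*'
    controlledResources:
    - cpu
    - memory
    controlledValues: RequestsOnly
    maxAllowed:
      memory: 4Gi   # Recommender caps target here; uncappedTarget keeps the raw value
```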
I'm not sure why memory is capped to 600Mi, though. The annotation `memory capped to fit Max in container LimitRange` seems to point in the right direction, but I'm not sure what's going on here.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Mark this issue or PR as fresh with `/remove-lifecycle stale`
- Mark this issue or PR as rotten with `/lifecycle rotten`
- Close this issue or PR with `/close`
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Mark this issue or PR as fresh with `/remove-lifecycle rotten`
- Close this issue or PR with `/close`
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Reopen this issue with `/reopen`
- Mark this issue as fresh with `/remove-lifecycle rotten`
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
In response to this:

> /close not-planned
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.