VPA Initial mode does not set memory request according to recommended values

Open masterphenix opened this issue 2 years ago • 2 comments

Which component are you using?: vertical-pod-autoscaler

What version of the component are you using?: 0.10.0

What k8s version are you using (kubectl version)?:

kubectl version Output
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.1", GitCommit:"3ddd0f45aa91e2f30c70734b175631bec5b5825a", GitTreeState:"clean", BuildDate:"2022-05-24T12:26:19Z", GoVersion:"go1.18.2", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.4
Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.5", GitCommit:"8211ae4d6757c3fedc53cd740d163ef65287276a", GitTreeState:"clean", BuildDate:"2022-03-31T20:28:03Z", GoVersion:"go1.17.8", Compiler:"gc", Platform:"linux/amd64"}

What environment is this in?: AKS cluster on Azure

What did you expect to happen?:

I have a deployment with the following resources set in the container template (there is only one container):

          limits:
            memory: 4Gi
          requests:
            cpu: 60m
            memory: 600Mi

In the same namespace, there is a limitRange, with the following configuration:

    - default:
        cpu: "2"
        memory: 2Gi
      defaultRequest:
        cpu: 60m
        memory: 32Mi
      max:
        cpu: "2"
        memory: 4Gi
      type: Container

I defined a VPA targeting that deployment, in Initial update mode, with the following policy:

    - containerName: '*'
      controlledResources:
      - cpu
      - memory
      controlledValues: RequestsOnly

Both recommender and admission-controller pods are running, and reporting no errors. The VPA recommends the following:

    - containerName: testpod
      lowerBound:
        cpu: 15m
        memory: "2538099186"
      target:
        cpu: 23m
        memory: "4743403291"
      uncappedTarget:
        cpu: 23m
        memory: "4743403291"
      upperBound:
        cpu: 116m
        memory: "6248599546"

First issue here is that the memory target is beyond the 4Gi maximum defined in the LimitRange; it should be 4294967296.
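As a quick check of the numbers above, the following sketch converts the quantities to bytes and compares them. The helper `to_bytes` and the `_SUFFIXES` table are illustrative names, not part of VPA or Kubernetes, and they cover only a subset of the quantity syntax (integer values with common suffixes).

```python
# Subset of the suffixes Kubernetes accepts for memory quantities (assumption:
# integer values only, no fractional or exponent notation).
_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def to_bytes(quantity: str) -> int:
    """Convert a memory quantity such as '4Gi' or '4743403291' to bytes."""
    # Check longer suffixes first so that 'Mi' is never mistaken for 'M'.
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * _SUFFIXES[suffix]
    return int(quantity)  # plain number of bytes


limit_range_max = to_bytes("4Gi")
vpa_target = to_bytes("4743403291")

print(limit_range_max)               # 4294967296
print(vpa_target > limit_range_max)  # True
print(vpa_target - limit_range_max)  # 448435995
```

The reported target is therefore about 448 MB above the LimitRange maximum.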

Then, when I delete the pod and a new pod comes up, its resources are set like this:

      limits:
        cpu: "2"
        memory: 4Gi
      requests:
        cpu: 23m
        memory: 600Mi

The CPU request is set correctly according to the VPA target recommendation; however, the memory request seems to be capped to the request that is defined in the Deployment resource. It should be set to the LimitRange maximum, which is 4Gi, and not 600Mi.

What happened instead?:

Memory request value set in the pod does not match VPA recommendation (which does not take limitRange into account).

Anything else we need to know?:

Recommender config:

      - args:
        - --pod-recommendation-min-cpu-millicores=15
        - --pod-recommendation-min-memory-mb=30
        - --v=4

Also, the following annotations are applied on the pod:

    kubernetes.io/limit-ranger: 'LimitRanger plugin set: cpu limit for container testpod'
    vpaObservedContainers: testpod
    vpaUpdates: 'Pod resources updated by testpod: container 0: memory capped
      to fit Max in container LimitRange, cpu request, memory request'

masterphenix avatar Jun 28 '22 15:06 masterphenix

First issue here is that memory target is beyond the 4Gi maximum defined in the limitRange ; it should be 4294967296.

VPA contains two places where capping happens:

Therefore, you won't see the capping happening in the VPA status in your scenario.

I'm not sure why memory is capped to 600Mi, though. The annotation "memory capped to fit Max in container LimitRange" seems to point in the right direction, but I'm not sure what's going on here.
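To make the distinction concrete, here is a rough model of two capping stages. It is a sketch under assumptions, not VPA's actual code: it assumes that the value shown in the VPA status is bounded only by the VPA's own resource policy (`minAllowed`/`maxAllowed`, neither of which appears in the policy reported above), and that the LimitRange maximum is applied later, when the pod is admitted. The function names `status_target` and `admitted_request` are hypothetical.

```python
GIB = 2**30
MIB = 2**20


def status_target(uncapped, min_allowed=None, max_allowed=None):
    """Stage 1 (assumed): bound the recommendation by the VPA resource policy."""
    target = uncapped
    if min_allowed is not None:
        target = max(target, min_allowed)
    if max_allowed is not None:
        target = min(target, max_allowed)
    return target


def admitted_request(target, limit_range_max):
    """Stage 2 (assumed): cap the request to the LimitRange max at admission."""
    return min(target, limit_range_max)


uncapped = 4743403291              # memory uncappedTarget from the report
shown = status_target(uncapped)    # no policy bounds, so the status is unchanged
expected = admitted_request(shown, 4 * GIB)
observed = 600 * MIB               # memory request seen on the new pod

print(shown)                 # 4743403291
print(expected)              # 4294967296
print(observed)              # 629145600
print(expected == observed)  # False
```

Under this model the status value above 4Gi is expected, while the 600Mi request on the pod is not explained by LimitRange capping alone.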

voelzmo avatar Jul 15 '22 15:07 voelzmo

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Oct 13 '22 15:10 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Nov 12 '22 16:11 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-triage-robot avatar Dec 12 '22 16:12 k8s-triage-robot

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

k8s-ci-robot avatar Dec 12 '22 16:12 k8s-ci-robot