VPA recommender doesn't recommend CPU requests below 10m

Open dmitrii-sisutech opened this issue 1 year ago • 5 comments

Which component are you using?: vertical-pod-autoscaler

What version of the component are you using?:

Component version: 1.0.0

What k8s version are you using (kubectl version)?:

kubectl version Output
$ kubectl version
Client Version: v1.28.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.27.7-gke.1056000

What environment is this in?: GKE

What did you expect to happen?: I expect the Recommender to give VPAs CPU recommendations below 10m when consumption is below 10m.

What happened instead?: Instead, the smallest recommended CPU is 10m while usage is 2-3m. The screenshot shows 2 pods: image

How to reproduce it (as minimally and precisely as possible): On GKE, if VPA is turned on, turn it off in the console. Then run:

./hack/vpa-down.sh
./hack/vpa-up.sh

Anything else we need to know?:

Recommender configuration:

        - name: recommender
          image: registry.k8s.io/autoscaling/vpa-recommender:1.0.0
          imagePullPolicy: Always
          args:
            - --v=4
            - --cpu-histogram-decay-half-life=0h2m0s
            - --history-length="3h"
            - --history-resolution="5m"
            - --pod-recommendation-min-cpu-millicores=1

I also tried --pod-recommendation-min-cpu-millicores=5 and --pod-recommendation-min-cpu-millicores=1.0; neither changed anything. To make sure the parameter works at all, I tried --pod-recommendation-min-cpu-millicores=50, and it worked.

While we were using the GKE VPA, it set CPU recommendations as low as 1m.

dmitrii-sisutech avatar Jan 02 '24 14:01 dmitrii-sisutech

I think the problem is in that line. The smallest bucket size is 0.01, which is assigned to the firstBucketSize parameter here. That parameter is later used to determine the percentile here.

I tried setting it to 0.001 and ran it in my cluster, and it worked. The lowest recommendation it assigned was 2m, and the highest I've tried is ~700m.
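
To illustrate the effect described above, here is a minimal sketch of an exponential histogram whose percentile is reported as the upper bound of the bucket it falls into. It is not the recommender's actual code: the function names, the growth ratio of 1.05, the 0.9 percentile, and the sample values are assumptions chosen for illustration. Only the firstBucketSize values 0.01 and 0.001 come from this issue.

```python
def bucket_start(index, first_bucket_size, ratio):
    """Lower bound of bucket `index`; each bucket is `ratio` times wider than the previous one."""
    return first_bucket_size * (ratio ** index - 1) / (ratio - 1)


def find_bucket(value, first_bucket_size, ratio):
    """Index of the bucket that contains `value` (value >= 0), found by linear scan."""
    index = 0
    while bucket_start(index + 1, first_bucket_size, ratio) <= value:
        index += 1
    return index


def percentile_estimate(samples, percentile, first_bucket_size, ratio=1.05):
    """Estimate a percentile as the upper bound of the bucket holding it.

    ratio=1.05 is an assumed growth factor, not a value taken from the issue.
    """
    if not samples:
        raise ValueError("need at least one sample")
    counts = {}
    for sample in samples:
        bucket = find_bucket(sample, first_bucket_size, ratio)
        counts[bucket] = counts.get(bucket, 0) + 1
    threshold = percentile * len(samples)
    running = 0
    last = 0
    for bucket in sorted(counts):
        running += counts[bucket]
        last = bucket
        if running >= threshold:
            break
    # The estimate can never be smaller than the end of bucket 0,
    # which equals first_bucket_size.
    return bucket_start(last + 1, first_bucket_size, ratio)


# Hypothetical CPU usage samples in cores (2m and 3m).
usage_cores = [0.002, 0.003, 0.002, 0.003]
coarse = percentile_estimate(usage_cores, 0.9, first_bucket_size=0.01)
fine = percentile_estimate(usage_cores, 0.9, first_bucket_size=0.001)
print(f"first bucket 0.01:  {coarse * 1000:.2f}m")
print(f"first bucket 0.001: {fine * 1000:.2f}m")
```

In this model, every sample below 0.01 cores lands in the first bucket, so the estimate is pinned at 10m regardless of how low the real usage is. With a first bucket of 0.001 the same samples produce an estimate of roughly 3m.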

dmitrii-sisutech avatar Jan 05 '24 22:01 dmitrii-sisutech

Hey @dmitrii-sisutech, your observation is correct: the recommender's histogram doesn't have the resolution for usages <10m. Thanks for the comparison with the GKE VPA, I wasn't aware of this difference!

I'm happy to mark this issue as a feature request. If you want to file a PR for this, I can take a look and review it!

/remove-label bug /kind feature

voelzmo avatar Jan 11 '24 08:01 voelzmo

@voelzmo: The label(s) /remove-label bug cannot be applied. These labels are supported: api-review, tide/merge-method-merge, tide/merge-method-rebase, tide/merge-method-squash, team/katacoda, refactor. Is this label configured under labels -> additional_labels or labels -> restricted_labels in plugin.yaml?

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Jan 11 '24 08:01 k8s-ci-robot

/remove-kind bug

voelzmo avatar Jan 11 '24 08:01 voelzmo

/triage accepted

Shubham82 avatar Jan 12 '24 07:01 Shubham82

/area vertical-pod-autoscaler

adrianmoisey avatar Jul 08 '24 18:07 adrianmoisey