
VPA applies empty initial recommendations

Open dbazhal opened this issue 1 year ago • 1 comments

Which component are you using?:

vertical-pod-autoscaler

What version of the component are you using?:

Component version: 0.13.0

What k8s version are you using (kubectl version)?:

kubectl version Output
$ kubectl version
WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short.  Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.3", GitCommit:"25b4e43193bcda6c7328a6d147b1fb73a33f1598", GitTreeState:"clean", BuildDate:"2023-06-14T09:53:42Z", GoVersion:"go1.20.5", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v5.0.1
Server Version: version.Info{Major:"1", Minor:"25+", GitVersion:"v1.25.15-eks-4f4795d", GitCommit:"9587e521d190ecb7ce201993ceea41955ed4a556", GitTreeState:"clean", BuildDate:"2023-10-20T23:22:38Z", GoVersion:"go1.20.10", Compiler:"gc", Platform:"linux/amd64"}
WARNING: version difference between client (1.27) and server (1.25) exceeds the supported minor version skew of +/-1

What environment is this in?:

eks

What did you expect to happen?:

When I deploy a new version of my app's Deployment, I expect the new Pods to receive resource requests set to the recommendations by the admission controller.

What happened instead?:

Pods from the new rollout (ReplicaSet) of the same Deployment get empty recommendations, and the containers' requests are just set to the limit values.

How to reproduce it (as minimally and precisely as possible):

1. You have a Deployment with 2 replicas; requests are not set and limits are set on all containers. A PDB with maxUnavailable: 1 covers the Pods (a sketch of this setup follows the logs below).
2. A VPA targets the Deployment via targetRef, in mode Initial with controlledValues RequestsOnly.
3. The VPA object contains fresh recommendations for each container in status.recommendation.containerRecommendations.
4. The Deployment is running, and the vpaObservedContainers and vpaUpdates annotations are set and have been applied to the existing Pods.
5. You patch the Deployment with a new spec. A new ReplicaSet is created, and its Pods are created with the following log output from the admission controller:

2023-12-13 13:20:22.466	
I1213 12:20:22.466025       1 matcher.go:68] Let's choose from 1 configs for pod mytestapp-testing/mytestapp-testing-6f5954546d-%

2023-12-13 13:20:22.466	
I1213 12:20:22.466036       1 recommendation_provider.go:90] updating requirements for pod mytestapp-testing-6f5954546d-%.

2023-12-13 13:20:22.466	
I1213 12:20:22.466120       1 server.go:112] Sending patches: [{add /metadata/annotations/vpaUpdates Pod resources updated by mytestapp-testing: container 0: ; container 1: ; container 2: ; container 3: ; container 4: ; container 5: ; container 6: ; container 7: ; container 8: ; container 9: } {add /metadata/annotations/vpaObservedContainers istio-proxy, mytestapp, mytestapp-api, mytestapp-crons, mytestapp-integration, mytestapp-jobs, syslog-ng-sidecar, promtail-sidecar, solon-adapter-sidecar, nginx-sidecar}]	

and vpaObservedContainers contains the correct list of containers, but vpaUpdates shows that the VPA applied empty recommendations to every container.
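
For reference, a rough sketch of the setup described above (all names, images, and values below are made up for illustration, not taken from the real manifests):

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: mytestapp-pdb              # hypothetical name
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: mytestapp
---
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: mytestapp-vpa              # hypothetical name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: mytestapp
  updatePolicy:
    updateMode: "Initial"
  resourcePolicy:
    containerPolicies:
      - containerName: '*'
        controlledValues: "RequestsOnly"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mytestapp                  # hypothetical name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: mytestapp
  template:
    metadata:
      labels:
        app: mytestapp
    spec:
      containers:
        - name: mytestapp
          image: registry.example.com/mytestapp:latest   # placeholder image
          resources:
            limits:                # limits only, requests intentionally omitted
              cpu: "1"
              memory: 512Mi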

Anything else we need to know?:

I'll be happy to give any additional information to track it down :)

dbazhal avatar Dec 13 '23 18:12 dbazhal

Hey @dbazhal, thanks for the detailed description!

I think what's happening here is expected, given how you configured things. If you don't specify requests but only limits, k8s will set your Pod's requests to the same value as your configured limits: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#requests-and-limits This happens whether or not you use the VPA.
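
For illustration (not taken from your manifests), a container that only declares limits ends up with identical requests after admission:

containers:
  - name: app                                  # hypothetical container
    image: registry.example.com/app:latest     # placeholder image
    resources:
      limits:
        cpu: 500m
        memory: 256Mi
      # requests are omitted here, so Kubernetes defaults them to the limits:
      # the scheduler effectively sees requests of cpu: 500m, memory: 256Mi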

In your scenario, you have set the VPA to RequestsOnly, which means it will leave the existing limits in place. As requests cannot be higher than limits, the VPA can only ever scale your Deployment down; it cannot scale it up.

As an experiment, I modified the example hamster app like this

---
apiVersion: "autoscaling.k8s.io/v1"
kind: VerticalPodAutoscaler
metadata:
  name: hamster-vpa
spec:
  # recommenders field can be unset when using the default recommender.
  # When using an alternative recommender, the alternative recommender's name
  # can be specified as the following in a list.
  # recommenders: 
  #   - name: 'alternative'
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: hamster
  updatePolicy:
    updateMode: "Initial"
  resourcePolicy:
    containerPolicies:
      - containerName: '*'
        minAllowed:
          cpu: 100m
          memory: 50Mi
        maxAllowed:
          cpu: 2
          memory: 500Mi
        controlledResources: ["cpu", "memory"]
        controlledValues: "RequestsOnly"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hamster
spec:
  selector:
    matchLabels:
      app: hamster
  replicas: 2
  template:
    metadata:
      labels:
        app: hamster
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534 # nobody
      containers:
        - name: hamster
          image: registry.k8s.io/ubuntu-slim:0.1
          resources:
            limits:
              cpu: 2000m
              memory: 500Mi
          command: ["/bin/sh"]
          args:
            - "-c"
            - "while true; do sleep 120s; done"

and waited until the VPA object had a recommendation present. Then, I updated the Deployment, applied it, and the newly created Pods had much lower requests.
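
To illustrate the effect (the numbers below are made up, not taken from the actual experiment): with controlledValues: RequestsOnly the admission controller rewrites only the requests and leaves the configured limits untouched, so a new Pod ends up with something like

resources:
  limits:              # unchanged, because only requests are controlled
    cpu: 2000m
    memory: 500Mi
  requests:            # injected by the admission controller from the recommendation
    cpu: 100m
    memory: 262144k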

voelzmo avatar Jan 11 '24 11:01 voelzmo

Hi @voelzmo! Thank you for your reply and willingness to help :) I tried your suggestion and set tiny requests for my pods. But on a rollout I'm still getting

I0124 08:53:16.978806       1 matcher.go:68] Let's choose from 1 configs for pod mytestapp-testing/mytestapp-testing-5758bcd678-%
I0124 08:53:16.978818       1 recommendation_provider.go:90] updating requirements for pod mytestapp-testing-5758bcd678-%.
I0124 08:53:16.978834       1 recommendation_provider.go:57] no matching recommendation found for container mytestapp, skipping
I0124 08:53:16.978874       1 server.go:112] Sending patches: [{add /metadata/annotations/vpaUpdates Pod resources updated by mytestapp-testing: container 0: ; container 1: ; container 2: ; container 3: ; container 4: } {add /metadata/annotations/vpaObservedContainers istio-proxy, mytestapp, syslog-ng-sidecar, solomon-adapter-sidecar, tvmtool-sidecar}]

I understand that this may be out of the scope of this ticket, but maybe you have some ideas why that can be? It looks like during a rollout, the VPA can't match the Pods of the new ReplicaSet with the recommendations produced for the previous rollout of the Deployment.

dbazhal avatar Jan 24 '24 09:01 dbazhal

I've set requests to 10m and 10Ki and changed the VPA mode from Initial to Auto, but I still get "no matching recommendation found" on rollouts that include changes to the Pod template. I suspected that PodState.labelSetKey plays some role, because we had a "version" label that changed on every rollout. But I removed it, and the problem still exists. I also noticed that Pods have the pod-template-hash label, which changes on any rollout with a new image or similar. Can that have such an impact on matching Pod recommendations? Otherwise, what can lead to missing recommendations on fresh rollouts?

dbazhal avatar Feb 08 '24 19:02 dbazhal

Hey @dbazhal, given that the whole process works with the example hamster app that ships with the VPA, it is likely that you're experiencing some sort of configuration issue. Can you post your VPA object (ideally in its current state, where it has a status section with recommendations in it) and the Deployment that the VPA is referencing?

voelzmo avatar Feb 12 '24 09:02 voelzmo

Oh, thank you for the advice! I double-checked the objects during a rollout, and it turned out that the VPA says "no matching recommendation found" when the recommendation is less than the minimum allowed (minAllowed).
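
The conflict looks roughly like this (an illustrative fragment with made-up names and values, not the actual objects): the configured floor sits above the recommended target.

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
spec:
  resourcePolicy:
    containerPolicies:
      - containerName: mytestapp   # hypothetical container name
        minAllowed:
          cpu: 100m                # the floor sits above...
          memory: 128Mi
status:
  recommendation:
    containerRecommendations:
      - containerName: mytestapp
        target:
          cpu: 15m                 # ...the recommended target
          memory: 64Mi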

dbazhal avatar Feb 13 '24 14:02 dbazhal

Great to hear you found the reason and thanks for reporting back!

voelzmo avatar Feb 14 '24 14:02 voelzmo

/remove-kind bug
/kind support

voelzmo avatar Feb 14 '24 14:02 voelzmo