VPA: deleting the VPA and recreating it still gives the old recommendation
Which component are you using?: vertical-pod-autoscaler
What version of the component are you using?: Component version: 0.10.0
What k8s version are you using (kubectl version)?:
kubectl version Output
```
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.5", GitCommit:"e338cf2c6d297aa603b50ad3a301f761b4173aa6", GitTreeState:"clean", BuildDate:"2020-12-09T11:18:51Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.5", GitCommit:"e338cf2c6d297aa603b50ad3a301f761b4173aa6", GitTreeState:"clean", BuildDate:"2020-12-09T11:10:32Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"linux/amd64"}
```
What environment is this in?:
What did you expect to happen?:
We are using the VPA to make recommendations for our pods. We started the VPA as documented; the only difference was that we added the --pod-recommendation-min-memory-mb=10 flag to the recommender deployment. We ran a test at 100 TPS for an hour and got a recommendation. Up to this step everything was fine. Let's say we got a 500m CPU recommendation.
Then, after one day without any test, the recommendation went down, let's say to 130m CPU. I think this is still fine, since the recommender uses the data it has collected from the start.
Then we removed the VPA completely with the vpa-down.sh script, installed it again, and also recreated the VPA resource. When I checked the recommendation it still showed 130m CPU, while I expected, since I had stopped the VPA completely, that it would show nothing or only the minimum value. So it seems that something, somewhere, is storing data.
What happened instead?:
The recommendation did not disappear after uninstalling and reinstalling the VPA.
How to reproduce it (as minimally and precisely as possible):
Anything else we need to know?:
As far as I know, the default vpa-up.sh does not set up Prometheus history data collection. I did not set it up either, so I do not know why this problem happens.
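As far as I understand, Prometheus-backed history would only be used if the recommender were started with flags along these lines, which vpa-up.sh does not add (the address and job name here are just placeholders):
```
--storage=prometheus
--prometheus-address=http://prometheus.monitoring.svc:9090
--prometheus-cadvisor-job-name=kubernetes-cadvisor
```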
Can you help me understand what we are doing wrong?
Thanks
Akos
Please add more information, preferably step-by-step reproduction instructions, and explain what happened vs. what you expected.
Hi
What I did earlier is the following:
- Install the VPA as described. The only difference was that the minimum memory was set to 10 MB:
```
$ kubectl -n kube-system describe deployments.apps vpa-recommender
Name:                   vpa-recommender
Namespace:              kube-system
CreationTimestamp:      Fri, 04 Mar 2022 13:25:57 +0000
Labels:                 <none>
Annotations:            deployment.kubernetes.io/revision: 1
Selector:               app=vpa-recommender
Replicas:               1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
Pod Template:
  Labels:           app=vpa-recommender
  Service Account:  vpa-recommender
  Containers:
   recommender:
    Image:      k8s.gcr.io/autoscaling/vpa-recommender:0.9.2
    Port:       8942/TCP
    Host Port:  0/TCP
    Args:
      --pod-recommendation-min-memory-mb=10
    Limits:
      cpu:     200m
      memory:  1000Mi
    Requests:
      cpu:        50m
      memory:     500Mi
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Available      True    MinimumReplicasAvailable
  Progressing    True    NewReplicaSetAvailable
OldReplicaSets:  <none>
NewReplicaSet:   vpa-recommender-5c5bbcb6cf (1/1 replicas created)
Events:          <none>
```
- Start the VPA for my application:
```yaml
apiVersion: "autoscaling.k8s.io/v1"
kind: VerticalPodAutoscaler
metadata:
  name: parameterprovision
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: parameterprovision
  updatePolicy:
    updateMode: "Off"
```
- I ran a performance test at 100 TPS. I could see the recommendation showing some value, which seemed OK.
- Then I stopped the test case and removed the VPA from the cluster (vpa-down.sh).
- Then I ran vpa-up.sh again and created the same VPA. Describing the VPA shows the same values as before, even though there is currently no load on it, so it seems to remember some earlier data. (The whole sequence is sketched below.)
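Roughly, the whole sequence was the following (scripts run from vertical-pod-autoscaler/ in the autoscaler repo; the manifest file name is mine):
```
./hack/vpa-down.sh                             # tear down all VPA components
./hack/vpa-up.sh                               # reinstall them
kubectl apply -f parameterprovision-vpa.yaml   # recreate the VPA object above
kubectl describe vpa parameterprovision        # still shows the old recommendation
```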
Actually, it got weirder today. After the weekend I checked the recommendation again and also checked the top pods; I had not run anything on the cluster in the last two days.
```
$ kubectl top pods | grep prov
parameterprovision-6b8988c8fd-5m6q4   3m   27Mi
```
while the VPA describe output shows this:
```
Recommendation:
  Container Recommendations:
    Container Name:  nef-parameterprovision
    Lower Bound:
      Cpu:     25m
      Memory:  49530131
    Target:
      Cpu:     25m
      Memory:  109814751
    Uncapped Target:
      Cpu:     25m
      Memory:  109814751
    Upper Bound:
      Cpu:     698m
      Memory:  150053070
```
The target memory is ~105 MB, but there has been no load on this pod for two days.
And now I deleted the VPA, scaled my deployment to 0, then scaled it back to 1, recreated the VPA, and the result is the following:
```
$ kubectl top pods | grep prov
nef-parameterprovision-6b8988c8fd-5t46m   2m   20Mi
```
```
Recommendation:
  Container Recommendations:
    Container Name:  nef-parameterprovision
    Lower Bound:
      Cpu:     25m
      Memory:  33289666
    Target:
      Cpu:     25m
      Memory:  36253748
    Uncapped Target:
      Cpu:     25m
      Memory:  36253748
    Upper Bound:
      Cpu:     490m
      Memory:  1615842163
```
I see quite a big difference between kubectl top and the recommendation.
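In commands, that last reset attempt was roughly this (deployment and VPA names per the manifest above):
```
kubectl delete vpa parameterprovision
kubectl scale deployment parameterprovision --replicas=0
kubectl scale deployment parameterprovision --replicas=1
kubectl apply -f parameterprovision-vpa.yaml
```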
So what is the right way to "reset" the VPA and start a new measurement? Actually, we just want to get a recommendation for a load of 100 TPS, but right now it seems that I cannot reproduce the same result twice.
Is it explained somewhere how the recommendation is calculated? Thanks
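(My rough understanding, for what it's worth, for VPA versions around 0.9/0.10: the recommender aggregates usage samples into exponentially decaying histograms and recommends high percentiles of them, which is why the target can stay well above the current kubectl top value. The weight of a sample decays with its age, approximately:)
```
weight(sample) ≈ 2^(-age(sample) / halfLife)   # halfLife defaults to roughly 24h
```
(and the CPU target is, by default, roughly the 90th percentile of the weighted samples plus a ~15% safety margin. Treat these numbers as assumptions to verify against the recommender's flags and source.)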
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with `/remove-lifecycle stale`
- Mark this issue or PR as rotten with `/lifecycle rotten`
- Close this issue or PR with `/close`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
I didn't have time to look into the problem, but I think it would be good to do that.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with `/remove-lifecycle rotten`
- Close this issue or PR with `/close`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
/remove-lifecycle rotten
/lifecycle stale
/remove-lifecycle stale
/remove-lifecycle rotten
/lifecycle frozen
Curious about this too. Changes have been made to my pod, and therefore to its resource needs, but deleting and recreating the VPA still uses historical data.
By default, historical data is stored in a `VPACheckpoint` resource, and the vpa-recommender also keeps it in memory for a while. So if you want to get rid of all historical information for a Pod, you probably should:
- remove the corresponding `VPACheckpoint` resource
- restart `vpa-recommender`, or wait for the garbage collection interval to pass (1 hour)
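For example (a hedged sketch; the checkpoint name is typically derived from the VPA and container names, so check with the first command before deleting):
```
# List all checkpoints; the CRD is verticalpodautoscalercheckpoints.autoscaling.k8s.io
kubectl get verticalpodautoscalercheckpoints --all-namespaces
# Delete the one(s) belonging to your VPA (name/namespace are placeholders)
kubectl delete verticalpodautoscalercheckpoint <checkpoint-name> -n <namespace>
# Restart the recommender so it also drops its in-memory state
kubectl -n kube-system rollout restart deployment vpa-recommender
```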
> - remove the corresponding `VPACheckpoint` resource
> - restart `vpa-recommender` or wait for the garbage collection interval to pass (1 hour)
This did not work for me. I had to delete the VPA itself (and, before that, disable Argo CD to prevent instant recreation), then restart vpa-recommender, and then recreate the VPA.
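In commands, that ended up being roughly the following (hedged; pausing Argo CD auto-sync is setup-specific, and the names follow this thread's example):
```
# 1. Pause Argo CD auto-sync for the app so the VPA is not recreated instantly (setup-specific)
# 2. Then delete the VPA, restart the recommender, and recreate the VPA:
kubectl delete vpa parameterprovision
kubectl -n kube-system rollout restart deployment vpa-recommender
kubectl apply -f parameterprovision-vpa.yaml
```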