VPA: deleting the VPA and recreating it still gives the old recommendation
Which component are you using?: vertical-pod-autoscaler
What version of the component are you using?: Component version: 0.10.0
What k8s version are you using (kubectl version)?:
kubectl version Output
```
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.5", GitCommit:"e338cf2c6d297aa603b50ad3a301f761b4173aa6", GitTreeState:"clean", BuildDate:"2020-12-09T11:18:51Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.5", GitCommit:"e338cf2c6d297aa603b50ad3a301f761b4173aa6", GitTreeState:"clean", BuildDate:"2020-12-09T11:10:32Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"linux/amd64"}
```
What environment is this in?:
What did you expect to happen?:
We are using the VPA to make recommendations for our pods. We started the VPA as documented; the only difference was that we added the --pod-recommendation-min-memory-mb=10 flag to the recommender deployment. We ran a test at 100 TPS for an hour and got a recommendation. Up to this step everything was fine. Let's say we got a 500m CPU recommendation.
Then, after one day without any test, the recommendation went down, let's say to 130m CPU. I think this is still fine, since the recommender uses the data it has collected from the start.
Then we removed the VPA completely with the vpa-down.sh script, installed it again, and also recreated the VPA resource. When I checked the recommendation it still showed 130m CPU, while I expected, since I had stopped the VPA completely, that it would show nothing or only the minimum value. So it seems that something, somewhere, is storing data.
What happened instead?:
The recommendation did not disappear after uninstalling and reinstalling the VPA.
How to reproduce it (as minimally and precisely as possible):
Anything else we need to know?:
As far as I know, the default vpa-up.sh does not set up Prometheus history data collection. I did not set it up either, so I do not know why this problem happens.
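As far as I understand, Prometheus-backed history would only be used if the recommender were started with flags along these lines, which vpa-up.sh does not add (the address and job name here are just placeholders):
```
--storage=prometheus
--prometheus-address=http://prometheus.monitoring.svc:9090
--prometheus-cadvisor-job-name=kubernetes-cadvisor
```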
Can you help me understand what we are doing wrong?
Thanks
Akos
Please add more information, preferably step-by-step reproduction instructions, and explain what happened vs. what you expected.
Hi
What I did earlier is the following:
- Install the VPA as described. The only difference was that the minimum memory was set to 10 MB:
```
$ kubectl -n kube-system describe deployments.apps vpa-recommender
Name:                   vpa-recommender
Namespace:              kube-system
CreationTimestamp:      Fri, 04 Mar 2022 13:25:57 +0000
Labels:                 <none>
Annotations:            deployment.kubernetes.io/revision: 1
Selector:               app=vpa-recommender
Replicas:               1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
Pod Template:
  Labels:           app=vpa-recommender
  Service Account:  vpa-recommender
  Containers:
   recommender:
    Image:      k8s.gcr.io/autoscaling/vpa-recommender:0.9.2
    Port:       8942/TCP
    Host Port:  0/TCP
    Args:
      --pod-recommendation-min-memory-mb=10
    Limits:
      cpu:     200m
      memory:  1000Mi
    Requests:
      cpu:        50m
      memory:     500Mi
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Available      True    MinimumReplicasAvailable
  Progressing    True    NewReplicaSetAvailable
OldReplicaSets:  <none>
NewReplicaSet:   vpa-recommender-5c5bbcb6cf (1/1 replicas created)
Events:          <none>
```
- Start the VPA for my application:
```yaml
apiVersion: "autoscaling.k8s.io/v1"
kind: VerticalPodAutoscaler
metadata:
  name: parameterprovision
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: parameterprovision
  updatePolicy:
    updateMode: "Off"
```
- I ran a performance test at 100 TPS. I could see the recommendation showing some value, which seemed OK.
- Then I stopped the test case and removed the VPA from the cluster (vpa-down.sh).
- Then I ran vpa-up.sh again and created the same VPA. Describing the VPA shows the same values as before, even though there is currently no load on it, so it seems to remember some earlier data. (The whole sequence is sketched below.)
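Roughly, the whole sequence was the following (scripts run from vertical-pod-autoscaler/ in the autoscaler repo; the manifest file name is mine):
```
./hack/vpa-down.sh                             # tear down all VPA components
./hack/vpa-up.sh                               # reinstall them
kubectl apply -f parameterprovision-vpa.yaml   # recreate the VPA object above
kubectl describe vpa parameterprovision        # still shows the old recommendation
```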
Actually, it got weirder today. After the weekend I checked the recommendation again and also checked the top pods; I had not run anything on the cluster in the last two days.
```
$ kubectl top pods | grep prov
parameterprovision-6b8988c8fd-5m6q4   3m   27Mi
```
while the VPA describe output shows this:
```
Recommendation:
  Container Recommendations:
    Container Name:  nef-parameterprovision
    Lower Bound:
      Cpu:     25m
      Memory:  49530131
    Target:
      Cpu:     25m
      Memory:  109814751
    Uncapped Target:
      Cpu:     25m
      Memory:  109814751
    Upper Bound:
      Cpu:     698m
      Memory:  150053070
```
The target memory is ~105 MB, but there has been no load on this pod for two days.
And now I deleted the VPA, scaled my deployment to 0, then scaled it back to 1, recreated the VPA, and the result is the following:
```
$ kubectl top pods | grep prov
nef-parameterprovision-6b8988c8fd-5t46m   2m   20Mi
```
```
Recommendation:
  Container Recommendations:
    Container Name:  nef-parameterprovision
    Lower Bound:
      Cpu:     25m
      Memory:  33289666
    Target:
      Cpu:     25m
      Memory:  36253748
    Uncapped Target:
      Cpu:     25m
      Memory:  36253748
    Upper Bound:
      Cpu:     490m
      Memory:  1615842163
```
I see quite a big difference between kubectl top and the recommendation.
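In commands, that last reset attempt was roughly this (deployment and VPA names per the manifest above):
```
kubectl delete vpa parameterprovision
kubectl scale deployment parameterprovision --replicas=0
kubectl scale deployment parameterprovision --replicas=1
kubectl apply -f parameterprovision-vpa.yaml
```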
So what is the right way to "reset" the VPA and start a new measurement? Actually, we just want to get a recommendation for a load of 100 TPS, but right now it seems that I cannot reproduce the same result twice.
Is it explained somewhere how the recommendation is calculated? Thanks
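(My rough understanding, for what it's worth, for VPA versions around 0.9/0.10: the recommender aggregates usage samples into exponentially decaying histograms and recommends high percentiles of them, which is why the target can stay well above the current kubectl top value. The weight of a sample decays with its age, approximately:)
```
weight(sample) ≈ 2^(-age(sample) / halfLife)   # halfLife defaults to roughly 24h
```
(and the CPU target is, by default, roughly the 90th percentile of the weighted samples plus a ~15% safety margin. Treat these numbers as assumptions to verify against the recommender's flags and source.)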
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with `/remove-lifecycle stale`
- Mark this issue or PR as rotten with `/lifecycle rotten`
- Close this issue or PR with `/close`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
I didn't have time to look into the problem, but I think it would be good to do that.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with `/remove-lifecycle rotten`
- Close this issue or PR with `/close`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
/remove-lifecycle rotten
/lifecycle stale
/remove-lifecycle stale
/remove-lifecycle rotten
/lifecycle frozen
Curious about this too. Changes have been made to my pod, and therefore to its resource needs, but deleting and recreating the VPA still uses historical data.
By default, historical data is stored in a `VPACheckpoint` resource, and the vpa-recommender also keeps it in memory for a while. So if you want to get rid of all historical information for a Pod, you probably should:
- remove the corresponding `VPACheckpoint` resource
- restart `vpa-recommender`, or wait for the garbage collection interval to pass (1 hour)
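For example (a hedged sketch; the checkpoint name is typically derived from the VPA and container names, so check with the first command before deleting):
```
# List all checkpoints; the CRD is verticalpodautoscalercheckpoints.autoscaling.k8s.io
kubectl get verticalpodautoscalercheckpoints --all-namespaces
# Delete the one(s) belonging to your VPA (name/namespace are placeholders)
kubectl delete verticalpodautoscalercheckpoint <checkpoint-name> -n <namespace>
# Restart the recommender so it also drops its in-memory state
kubectl -n kube-system rollout restart deployment vpa-recommender
```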
> - remove the corresponding `VPACheckpoint` resource
> - restart `vpa-recommender` or wait for the garbage collection interval to pass (1 hour)
This did not work for me. I had to delete the VPA itself (and, before that, disable Argo CD to prevent instant recreation), then restart vpa-recommender, and then recreate the VPA.
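In commands, that ended up being roughly the following (hedged; pausing Argo CD auto-sync is setup-specific, and the names follow this thread's example):
```
# 1. Pause Argo CD auto-sync for the app so the VPA is not recreated instantly (setup-specific)
# 2. Then delete the VPA, restart the recommender, and recreate the VPA:
kubectl delete vpa parameterprovision
kubectl -n kube-system rollout restart deployment vpa-recommender
kubectl apply -f parameterprovision-vpa.yaml
```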