Support in-place resource update feature for VPA
Which component are you using?: Vertical Pod Autoscaler
Is your feature request designed to solve a problem? If so, describe the problem this feature should solve.:
In Kubernetes 1.25, the in-place resource update feature is expected to become available in alpha. Pods with this feature enabled will have a field to specify their restart policies, as in the example below. Pods controlled by VPA with a NoRestart policy should not be evicted by the VPA updater; instead, they should be patched with the new recommended requests directly.
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: 1pod
spec:
  containers:
  - name: busybox
    image: busybox
    command:
    - sleep
    - "3600"
    imagePullPolicy: IfNotPresent
    resources:
      limits:
        cpu: "1"
        memory: "1Gi"
      requests:
        cpu: "1"
        memory: "1Gi"
    resizePolicy:
    - resourceName: cpu
      policy: NoRestart
    - resourceName: memory
      policy: NoRestart
```
Describe the solution you'd like.:
In the VPA updater, whenever the existing logic decides to evict a pod, we should add a check on the pod spec to determine if the NoRestart policy is set. If so, the updater should send a patch request directly to the pod instead of evicting it.
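For illustration, here is a minimal, self-contained sketch of such a check. It uses local stand-in types that mirror the alpha `resizePolicy` shape shown above (the real API types and field names may differ once the feature lands upstream), and for simplicity treats a missing entry as not safe to resize in place:

```go
package main

import "fmt"

// Local stand-ins mirroring the resizePolicy shape from the pod spec above;
// the real API types may look different once the feature lands upstream.
type ResizePolicy struct {
	ResourceName string // "cpu" or "memory"
	Policy       string // "NoRestart" or "RestartContainer"
}

type Container struct {
	Name         string
	ResizePolicy []ResizePolicy
}

// canResizeInPlace returns true only if every container declares NoRestart
// for both cpu and memory, so the updater may patch the pod in place
// instead of evicting it. A missing entry counts as "not safe" here;
// the real default behavior may differ.
func canResizeInPlace(containers []Container) bool {
	for _, c := range containers {
		noRestart := map[string]bool{}
		for _, rp := range c.ResizePolicy {
			noRestart[rp.ResourceName] = rp.Policy == "NoRestart"
		}
		if !noRestart["cpu"] || !noRestart["memory"] {
			return false
		}
	}
	return true
}

func main() {
	busybox := Container{
		Name: "busybox",
		ResizePolicy: []ResizePolicy{
			{ResourceName: "cpu", Policy: "NoRestart"},
			{ResourceName: "memory", Policy: "NoRestart"},
		},
	}
	fmt.Println(canResizeInPlace([]Container{busybox})) // true
}
```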
An example patch command looks like this:
```go
// Strategic merge patch that updates only the named container's CPU request
// and limit; both %q arguments expect a plain quantity string such as "500m".
patchCommand := fmt.Sprintf(`{
  "spec": {
    "containers": [
      {
        "name": "containerName",
        "resources": {
          "requests": {"cpu": %q},
          "limits": {"cpu": %q}
        }
      }
    ]
  }
}`, resizeCPUValStr, resizeCPUValStr)

testPod, pErr := f.ClientSet.CoreV1().Pods(testPod.Namespace).Patch(context.TODO(),
	testPod.Name, types.StrategicMergePatchType, []byte(patchCommand), metav1.PatchOptions{})
```
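Because containers are merged by name in a strategic merge patch, a patch like this only updates the resources of the named container instead of replacing the whole `containers` list.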
Describe any alternative solutions you've considered.: None
Additional context.: The place where the VPA updater evicts the pod is here: https://github.com/kubernetes/autoscaler/blob/fc0666c88e325ea32ceabcfa1f834f5e39e265a3/vertical-pod-autoscaler/pkg/updater/logic/updater.go#L196
/assign @jbartosik
Just to document a caveat regarding the capping of requests here, so we keep this in mind:

- The Recommender only caps according to `minAllowed` and `maxAllowed`
- The admission-controller is currently responsible for capping to the existing limit in case `controlledValues: RequestsOnly` is set, or to an existing `LimitRange` in the namespace

~~We probably need to move the limit-capping logic to a different place, so we don't have different behavior for the eviction and in-place update case.~~
EDIT: Sorry, turns out I didn't look properly for uses of `Apply`. The Updater already seems to do the right thing during the priority calculations: https://github.com/kubernetes/autoscaler/blob/fc0666c88e325ea32ceabcfa1f834f5e39e265a3/vertical-pod-autoscaler/pkg/updater/priority/update_priority_calculator.go#L82 so I think the `podsForUpdate` in the above loop already have capped recommendations and we're fine!
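As a side note, here is a simplified sketch of what "capping" to `minAllowed`/`maxAllowed` means for a single resource (the actual logic lives in the recommendation processors and handles more cases, e.g. limits and `LimitRange`):

```go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/resource"
)

// capToPolicy clamps a recommended quantity to the [minAllowed, maxAllowed]
// range. A nil bound means "no bound". This is a simplified illustration,
// not the actual VPA implementation.
func capToPolicy(recommended resource.Quantity, minAllowed, maxAllowed *resource.Quantity) resource.Quantity {
	if minAllowed != nil && recommended.Cmp(*minAllowed) < 0 {
		return *minAllowed
	}
	if maxAllowed != nil && recommended.Cmp(*maxAllowed) > 0 {
		return *maxAllowed
	}
	return recommended
}

func main() {
	rec := resource.MustParse("2500m")
	max := resource.MustParse("2")
	fmt.Println(capToPolicy(rec, nil, &max).String()) // "2"
}
```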
I'm writing this down and I'll send a PR when it's ready.
I was thinking about something a little different:

- Add actuation modes so we have `Recreate` (exists already), `InPlace`, `InPlaceWithFallbackToEvict` (roughly sketched below)
- we might want controls for when to fall back to deleting pods
- we might want to limit how frequently we update in place

I heard that some time ago scaling down in place could cause problems, so we might want to allow scaling up in place but evicting to scale down.
I didn't think about per-resource configuration (yet).
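Purely as an illustration of the proposed actuation modes (nothing here is decided; the in-place names are only the ones floated above), a sketch of how they could look as `UpdateMode` values in the VPA API types:

```go
// UpdateMode controls how the VPA updater applies recommendations.
type UpdateMode string

const (
	// Existing modes.
	UpdateModeOff      UpdateMode = "Off"
	UpdateModeInitial  UpdateMode = "Initial"
	UpdateModeRecreate UpdateMode = "Recreate"
	UpdateModeAuto     UpdateMode = "Auto"

	// Hypothetical new modes from the proposal above.
	// Only resize in place; never evict in order to apply a recommendation.
	UpdateModeInPlace UpdateMode = "InPlace"
	// Try to resize in place; fall back to eviction if the in-place update is not possible.
	UpdateModeInPlaceWithFallbackToEvict UpdateMode = "InPlaceWithFallbackToEvict"
)
```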
> Add actuation modes so we have `Recreate` (exists already), `InPlace`, `InPlaceWithFallbackToEvict`
Can you elaborate a bit on why you think we need additional actuation modes? In theory, these settings are already configured on the Pod itself with

```yaml
resizePolicy:
- resourceName: cpu
  policy: NoRestart
- resourceName: memory
  policy: NoRestart
```
The exception for merging https://github.com/kubernetes/kubernetes/pull/102884 was rejected, so I think we will (unfortunately) have plenty of time to discuss this.
@voelzmo
The main reason I think we should have a new enum value for the new behaviour is that some users may want to configure their VPAs to keep that behavior (even if VPA changes what `Auto` means in the future). This is similar to how VPA currently has `Auto` and `Recreate` modes which do the same thing, but we're discussing changing what `Auto` does (and some workloads might want `Recreate` behavior, not the new behavior).
Longer version:
I think there are at least the following kinds of workloads when looking at in-place updates:

1. Want to be updated only in place, or when something bad (for example an OOM) is happening to them
2. Want to be updated. Can be in place, can be by eviction.
3. Want to be updated by eviction or deletion, not in place.
4. Don't want to be actively updated.

For 4) we have the `Off` and `Initial` modes.
Having VPA deduce from the pod spec whether it should apply an in-place update works for distinguishing 2) and 3) (there's also the `Recreate` mode). But I think we're going to need a new enum value to distinguish between 1) and 2); we can't deduce that from the containers. To start, we can support only 2) and not 1).
So I think we'll need enum values to distinguish those.
And since we're going to add enum values, we might as well add them at the same time we add the new VPA behaviors (so if someone doesn't want to worry that we'll change what `Auto` means a few years later, they can choose an enum value whose behavior won't change).
I'm not sure if https://github.com/kubernetes/kubernetes/pull/102884 will merge in time.
The last update in the SIG-Node doc is from 2022-09-20 and it says that https://github.com/kubernetes/kubernetes/pull/102884 can merge after:
- next containerd release,
- e2e tests are fully enabled and
- cgroupv2 review issues have been addressed
From the comment it looks like the containerd release will happen.
One more reason to have this as a separate enum value is that `Auto` is the default value, so it shouldn't be an experimental / alpha feature.
Thanks @jbartosik for clarifying your thoughts on adding new actuation modes! It took me a while to understand the semantics of `resizePolicy` for a container correctly: the point is that users can mark in-place resizing for their specific workload as safe or unsafe, but clients can still opt for resizing with eviction. Adding `InPlace` as a new mode makes sense to me.
I'm not sure if we also need `InPlaceWithFallbackToEvict` – is it necessary to have `InPlace` with the semantics that it doesn't fall back to eviction? Wouldn't this mean a user creates a VPA object putting a container's resource under control of VPA, marks it as unsafe for in-place resizing, and then uses VPA in actuation mode `InPlace` without the fallback to eviction? In that case, they could have left the resource which doesn't support in-place resizing out of the VPA entirely? Given that currently all users of VPA rely on eviction to update resources, this seems like an unnecessary addition to the list of actuation modes?
Sadly (source):
> This feature missed the extended deadline after code freeze and therefore will not make it into 1.26.
#4016 was opened earlier for the same thing, let's have one issue for tracking this