
Add KEP: Control VPA eviction behavior based on scaling direction and resource

Open voelzmo opened this issue 2 years ago • 15 comments

Which component this PR applies to?

vertical-pod-autoscaler

What type of PR is this?

/kind feature
/kind api-change

What this PR does / why we need it:

Discuss the introduction of a new UpdateMode for VPA, as previously discussed in the SIG meeting.

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


voelzmo avatar Apr 27 '22 15:04 voelzmo

Welcome @voelzmo!

It looks like this is your first PR to kubernetes/autoscaler 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/autoscaler has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. :smiley:

k8s-ci-robot avatar Apr 27 '22 15:04 k8s-ci-robot

This PR may require API review.

If so, when the changes are ready, complete the pre-review checklist and request an API review.

Status of requested reviews is tracked in the API Review project.

k8s-triage-robot avatar Apr 27 '22 16:04 k8s-triage-robot

Thanks for the PR. I've been away for a while. I'll try to take a look at this PR soon.

jbartosik avatar May 09 '22 09:05 jbartosik

@elmiko

what would happen if a user downgrades their kubernetes version with this option in use on their VPAs

I'm not sure how this is related to kubernetes versions. Are you asking what would happen when users downgrade their VPA? I don't think there is a good way to make this work, as any fallback logic would require a new VPA release as well, right?

voelzmo avatar May 16 '22 14:05 voelzmo

Are you asking what would happen when users downgrade their VPA?

@voelzmo yeah, that's more what i was asking about. apologies for the confusion.

I don't think there is a good way to make this work, as any fallback logic would require a new VPA release as well, right?

that makes sense to me. i am trying to understand the scope of things that a user might have to deal with.

elmiko avatar May 26 '22 14:05 elmiko

Thanks for your thoughts on this, @jbartosik!

My understanding is that you're talking about two different alternatives to my proposal here, so let me try to address them one-by-one.

I propose we: add EvictionRequirements field to the PodUpdatePolicy(...)

This seems to solve a more sophisticated problem than initially proposed. This proposal was meant to reduce the negative effects of evictions for certain workloads which are very sensitive to that. If a workload needs more resources (regardless of which kind), it should still get evicted so that it can continue doing its work – but if we're trying to save cost (i.e. recommendation < requests), we don't evict. If memory and CPU have different trends (let's say recommendation < requests for memory, but recommendation > requests for CPU), then we would still want to evict to get the higher requests for CPU, so the workload gets the processing power it needs.
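
To make that rule concrete, here is a minimal sketch (not the actual updater code; shouldEvict is a hypothetical helper name) of the behaviour described above: evict only when at least one resource has a recommendation above its current request.

package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// shouldEvict is a hypothetical helper illustrating the proposed rule:
// evict only if at least one resource would be scaled up.
func shouldEvict(requests, recommendation corev1.ResourceList) bool {
	for name, recommended := range recommendation {
		if requested, ok := requests[name]; ok && recommended.Cmp(requested) > 0 {
			return true // at least one resource needs to grow -> evict
		}
	}
	return false // pure downscale or no change -> keep the Pod running
}

func main() {
	requests := corev1.ResourceList{
		corev1.ResourceCPU:    resource.MustParse("500m"),
		corev1.ResourceMemory: resource.MustParse("1Gi"),
	}
	recommendation := corev1.ResourceList{
		corev1.ResourceCPU:    resource.MustParse("750m"),  // CPU should grow
		corev1.ResourceMemory: resource.MustParse("512Mi"), // memory could shrink
	}
	fmt.Println(shouldEvict(requests, recommendation)) // true: the CPU upscale wins
}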

All of this probably wouldn't be necessary if In-place Update of Pod Resources were a thing – but I guess that will still take a while ;)

As I mentioned above: We currently have a different controller in place that helps us achieve this goal using a different method, and currently we don't need to distinguish between resources.

we could have a separate config in ContainerResourcePolicy to ensure that the Admission Controller isn't lowering pod requests.

Are you proposing to add a new spec.resourcePolicy.containerPolicies[].mode: upscaleOnly (or called whatever) alongside Auto and Off instead of adding a new spec.updatePolicy.updateMode: upscaleOnly? If I'm misunderstanding this, please disregard my comments below.

I'm not sure if that semantically fits into what spec.resourcePolicy.containerPolicies[].mode is doing (Autoscaler being enabled or not) and the granularity it works on (per Container, not per VPA). Additionally, I'm not entirely sure what it would mean for Pods with multiple containers: The admission-webhook wouldn't evict the Pod if a single container has a mode: upscaleOnly and for that specific container we would see recommendation < requests? Or would it still evict, but not touch this Container's requests? In this case, this wouldn't serve the original goal: reduce evictions for Pods that don't need more resources to work properly.

voelzmo avatar Jun 17 '22 15:06 voelzmo

My understanding is that you're talking about two different alternatives to my proposal here, so let me try to address them one-by-one.

Yes. This part (above the horizontal line) is about the changes I want to make to the API in this KEP.

I propose we: add EvictionRequirements field to the PodUpdatePolicy(...)

This seems to solve a more sophisticated problem than initially proposed. This proposal was meant to reduce the negative effects of evictions for certain workloads which are very sensitive to that. If a workload needs more resources (regardless of which kind), it should still get evicted so that it can continue doing its work – but if we're trying to save cost (i.e. recommendation < requests), we don't evict. If memory and CPU have different trends (let's say recommendation < requests for memory, but recommendation > requests for CPU), then we would still want to evict to get the higher requests for CPU, so the workload gets the processing power it needs.

The API is a little bigger (adding a new message instead of a new enum value) but I hope the implementation will take about the same amount of work. And it's more future-proof. Let's say we add another enum value to encode "evict only if at least 1 resource will be increased, in addition to all the usual conditions, apply recommendations to containers when they start". Then it's reasonable to get another enum value for evicting only when memory is being increased. And another one for evicting only when a resource would be decreased (when someone is setting high starting values)...
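
As an illustration of that concern (all names below are invented for the example, not proposed API), every new combination would otherwise need its own UpdateMode value, while a structured requirement composes them in one field:

package sketch

// Illustrative only: each new combination of conditions would need its own
// UpdateMode value...
type UpdateMode string

const (
	UpdateModeUpscaleOnly       UpdateMode = "UpscaleOnly"       // evict only if any resource grows
	UpdateModeUpscaleMemoryOnly UpdateMode = "UpscaleMemoryOnly" // evict only if memory grows
	UpdateModeDownscaleOnly     UpdateMode = "DownscaleOnly"     // evict only if a resource shrinks
)

// ...whereas a list of structured requirements expresses all of these (and
// future combinations) with a single field.
type EvictionRequirement struct {
	Resource          string // which resource the condition applies to
	ChangeRequirement string // e.g. "recommendation higher/lower than requests"
}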

All of this probably wouldn't be necessary if In-place Update of Pod Resources were a thing – but I guess that will still take a while ;)

Yes, I'm looking into that, but I hear SIG Node has a lot of work, so it will take a while.

As I mentioned above: We currently have a different controller in place that helps us achieve this goal using a different method, and currently we don't need to distinguish between resources.

I want to support your use case. But I also need to make sure that our API remains in good shape and can support the needs of all our users. As I wrote above, I'm worried that supporting your use case by adding a new mode will lead to an explosion in the number of modes we're supporting and make the API confusing.


This is about a different improvement which was inspired by this one. I thought I'd mention it in case you're interested. But if you're not, then let's focus on the change you want to contribute (evicting pods only when VPA wants to increase their requests / the part above the horizontal line in this comment).

we could have a separate config in ContainerResourcePolicy to ensure that the Admission Controller isn't lowering pod requests.

Are you proposing to add a new spec.resourcePolicy.containerPolicies[].mode: upscaleOnly (or called whatever) alongside Auto and Off instead of adding a new spec.updatePolicy.updateMode: upscaleOnly? If I'm misunderstanding this, please disregard my comments below.

I'm not sure if that semantically fits into what spec.resourcePolicy.containerPolicies[].mode is doing (Autoscaler being enabled or not) and the granularity it works on (per Container, not per VPA). Additionally, I'm not entirely sure what it would mean for Pods with multiple containers: The admission-webhook wouldn't evict the Pod if a single container has a mode: upscaleOnly and for that specific container we would see recommendation < requests? Or would it still evict, but not touch this Container's requests? In this case, this wouldn't serve the original goal: reduce evictions for Pods that don't need more resources to work properly.

No. I meant that in addition to what you proposed in this KEP we could also make a change to spec.ResourcePolicy.ContainerPolicies, add a field that will tell VPA Admission Controller that it shouldn't set requests lower than they were when a pod came in. This is kind of independent from the change you described in this KEP (just inspired by it so I thought I'd mention it in case you're interested).
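
Purely as an illustration of that idea (the field name below is invented for the example, not a proposal), such an addition could look roughly like this:

package sketch

// ContainerResourcePolicy sketch: existing fields are omitted; the new field
// name is hypothetical and only illustrates the idea described above.
type ContainerResourcePolicy struct {
	ContainerName string

	// PreventRequestReduction (hypothetical name) would tell the Admission
	// Controller never to set a container's requests below the values the
	// incoming Pod already had, while still allowing increases.
	PreventRequestReduction *bool
}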

jbartosik avatar Jun 17 '22 16:06 jbartosik

The API is a little bigger (adding a new message instead of a new enum value) but I hope the implementation will take about the same amount of work.

I was thinking a bit about how far we would want to take the validation part of the implementation: which kinds of checks would we build to avoid conditions that can never be true? We could

  • restrict to a single condition per resource, to avoid having e.g. both recommendation > requests and recommendation < requests
  • ensure that Any cannot be specified when there is a condition in place for both resources

This would probably be enough? It is still additional logic we would need to have in place that we otherwise wouldn't.
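
A minimal sketch of what those two checks could look like (a hypothetical helper, not the actual admission logic; the placeholder types are only for illustration):

package sketch

import "fmt"

// Placeholder types for illustration only.
type EvictionRequirement struct {
	Resource          string // "cpu", "memory", or "any"
	ChangeRequirement string // e.g. "TargetHigherThanRequests"
}

// validateEvictionRequirements is a hypothetical helper implementing the two
// checks above: at most one requirement per resource, and no "any" requirement
// once both cpu and memory already have one.
func validateEvictionRequirements(reqs []EvictionRequirement) error {
	seen := map[string]bool{}
	for _, r := range reqs {
		if seen[r.Resource] {
			return fmt.Errorf("only one requirement allowed per resource, found a second one for %q", r.Resource)
		}
		seen[r.Resource] = true
	}
	if seen["any"] && seen["cpu"] && seen["memory"] {
		return fmt.Errorf("%q cannot be combined with requirements on both cpu and memory", "any")
	}
	return nil
}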

I want to support your use case. But I also need to make sure that our API remains in good shape and can support the needs of all our users. As I wrote above, I'm worried that supporting your use case by adding a new mode will lead to an explosion in the number of modes we're supporting and make the API confusing.

I absolutely understand this concern and I'm not saying we shouldn't implement it this way. I can see how this API design makes sense if you're expecting additional scenarios in the future that you'd want to support here. Do you have thoughts on the use cases for any of the other scenarios, or even concrete input/feedback that this is something people want to configure in a more fine-grained way?

Are you proposing to add a new spec.resourcePolicy.containerPolicies[].mode: upscaleOnly (or called whatever) alongside Auto and Off instead of adding a new spec.updatePolicy.updateMode: upscaleOnly? If I'm misunderstanding this, please disregard my comments below. (...)

No. I meant that in addition to what you proposed in this KEP we could also make a change to spec.ResourcePolicy.ContainerPolicies, add a field that will tell VPA Admission Controller that it shouldn't set requests lower than they were when a pod came in. This is kind of independent from the change you described in this KEP (just inspired by it so I thought I'd mention it in case you're interested).

Thanks for helping me understand! From my point of view, we don't need a mechanism like this in the admission-controller that would prevent recommendation < requests from being applied to the Pods.

voelzmo avatar Jun 20 '22 11:06 voelzmo

The API is a little bigger (adding a new message instead of a new enum value) but I hope the implementation will take about the same amount of work.

I was thinking a bit about how far we would want to take the validation part of the implementation: which kinds of checks would we build to avoid conditions that can never be true? We could

  • restrict to a single condition per resource, to avoid having e.g. both recommendation > requests and recommendation < requests
  • ensure that Any cannot be specified when there is a condition in place for both resources

This would probably be enough? It is still additional logic we would need to have in place that we otherwise wouldn't.

Yes, I think this is good.

Someone could set

  • CPU: target >= request
  • Memory: target >= request
  • Any: target > request

But setting Any doesn't make any difference - we won't evict if there is no change even without it.

I want to support your use case. But I also need to make sure that our API remains in good shape and can support the needs of all our users. As I wrote above, I'm worried that supporting your use case by adding a new mode will lead to an explosion in the number of modes we're supporting and make the API confusing.

I absolutely understand this concern and I'm not saying we shouldn't implement it this way. I can see how this API design makes sense if you're expecting additional scenarios in the future that you'd want to support here. Do you have thoughts on the use cases for any of the other scenarios, or even concrete input/feedback that this is something people want to configure in a more fine-grained way?

Are you proposing to add a new spec.resourcePolicy.containerPolicies[].mode: upscaleOnly (or called whatever) alongside Auto and Off instead of adding a new spec.updatePolicy.updateMode: upscaleOnly? If I'm misunderstanding this, please disregard my comments below. (...)

No. I meant that in addition to what you proposed in this KEP we could also make a change to spec.ResourcePolicy.ContainerPolicies, add a field that will tell VPA Admission Controller that it shouldn't set requests lower than they were when a pod came in. This is kind of independent from the change you described in this KEP (just inspired by it so I thought I'd mention it in case you're interested).

Thanks for helping me understand! From my point of view, we don't need a mechanism like this in the admission-controller that would prevent recommendation < requests from being applied to the Pods.

So let's not do that.

jbartosik avatar Jun 27 '22 14:06 jbartosik

@jbartosik Should I change this KEP to reflect our state of discussion, so we can get this in?

voelzmo avatar Jul 21 '22 12:07 voelzmo

@jbartosik Should I change this KEP to reflect our state of discussion, so we can get this in?

Please do. I think this is mostly ready.

jbartosik avatar Jul 25 '22 13:07 jbartosik

@jbartosik Does this updated version work for you? Once this is merged, I could start implementing it.

voelzmo avatar Aug 29 '22 07:08 voelzmo

Looks good to me.

We still need someone to check the API.

I think we will need to add a code snippet with the API definition changes to make it feasible to check the API, something like the following (ChangeRequirementType needs a better name):

type EvictionRequirement struct {
  Resource          ResourceName
  ChangeRequirement ChangeRequirementType
}

type ResourceName string
type ChangeRequirementType string

const (
  ResourceNameCpu = ...

Does https://github.com/kubernetes/autoscaler/pull/5176/files help to review the API changes?
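
For reference while reviewing, here is a sketch of how the fleshed-out definition could look, extrapolated from the partial snippet above and the per-resource "or" semantics discussed for #5176; the PR itself remains authoritative and the constant names here are illustrative:

package sketch

import corev1 "k8s.io/api/core/v1"

// EvictionChangeRequirement describes how the recommendation must relate to
// the current requests for an eviction to be allowed (names illustrative).
type EvictionChangeRequirement string

const (
	// Evict only if the recommendation is above the current requests.
	TargetHigherThanRequests EvictionChangeRequirement = "TargetHigherThanRequests"
	// Evict only if the recommendation is below the current requests.
	TargetLowerThanRequests EvictionChangeRequirement = "TargetLowerThanRequests"
)

// EvictionRequirement ties one change requirement to one or more resources;
// listing both cpu and memory means the requirement holds if it is true for
// either of them ("or" semantics).
type EvictionRequirement struct {
	Resources         []corev1.ResourceName
	ChangeRequirement EvictionChangeRequirement
}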

voelzmo avatar Sep 08 '22 12:09 voelzmo

Looks good to me. We still need someone to check the API. I think we will need to add a code snippet with the API definition changes to make it feasible to check the API, something like the following (ChangeRequirementType needs a better name):

type EvictionRequirement struct {
  Resource          ResourceName
  ChangeRequirement ChangeRequirementType
}

type ResourceName string
type ChangeRequirementType string

const (
  ResourceNameCpu = ...

Does https://github.com/kubernetes/autoscaler/pull/5176/files help to review the API changes?

Probably.

/lgtm

@mwielgus please take a look at the API changes

jbartosik avatar Sep 09 '22 08:09 jbartosik

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: voelzmo

Once this PR has been reviewed and has the lgtm label, please ask for approval from jbartosik by writing /assign @jbartosik in a comment. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.

k8s-ci-robot avatar Sep 26 '22 12:09 k8s-ci-robot

New changes are detected. LGTM label has been removed.

k8s-ci-robot avatar Oct 31 '22 09:10 k8s-ci-robot

I noticed that the API here is different than in #5176:

  • This PR says in ChangeRequirement you pick one of {CPU, Memory, or Any}
  • In the other PR you can pick 1 or 2 of {CPU, Memory} (and it does "or" if you pick more than 1)

Please reconcile those two PRs (especially if you plan to use #5176 to demonstrate what API will look like)

done.

voelzmo avatar Oct 31 '22 09:10 voelzmo

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: voelzmo

Once this PR has been reviewed and has the lgtm label, please ask for approval from jbartosik by writing /assign @jbartosik in a comment. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.

k8s-ci-robot avatar Oct 31 '22 09:10 k8s-ci-robot

@jbartosik Can we merge this or are there still things you want me to address?

voelzmo avatar Nov 14 '22 15:11 voelzmo

https://github.com/kubernetes/autoscaler/pull/4831#pullrequestreview-1099346882

We still need someone to check the API.

@mwielgus can you please take a look?

jbartosik avatar Nov 22 '22 13:11 jbartosik

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mwielgus, voelzmo

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.

k8s-ci-robot avatar Dec 19 '22 14:12 k8s-ci-robot