
KEP-2837: Pod level resource limits

Open • liorokman opened this pull request 5 years ago • 56 comments

keps/sig-node: add KEP for shared-burstable-limits - pod level resource limits

liorokman avatar Mar 04 '20 11:03 liorokman

Welcome @liorokman!

It looks like this is your first PR to kubernetes/enhancements 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/enhancements has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😄

k8s-ci-robot avatar Mar 04 '20 11:03 k8s-ci-robot

Hi @liorokman. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Mar 04 '20 11:03 k8s-ci-robot

/assign @dchen1107

liorokman avatar Mar 04 '20 11:03 liorokman

Kind reminder - can anyone review this?

liorokman avatar Mar 06 '20 12:03 liorokman

@liorokman the PR seems to have carried along a number of other KEPs. Can you update the PR so it contains only the KEP being presented? Are you able to present a high-level overview of the proposal at a future SIG Node meeting?

derekwaynecarr avatar Mar 06 '20 15:03 derekwaynecarr

/assign

derekwaynecarr avatar Mar 06 '20 15:03 derekwaynecarr

@liorokman the PR seems to have carried along a number of other KEPs. Can you update the PR so it contains only the KEP being presented?

Done.

liorokman avatar Mar 06 '20 16:03 liorokman

Are you able to present a high-level overview of the proposal at a future SIG Node meeting?

I'd be happy to. When is the next SIG Node meeting?

liorokman avatar Mar 06 '20 16:03 liorokman

Have you thought about the implications of this feature with vertical pod autoscaling?

If this feature is turned on, the VPA could be configured to update the resource budget of a single container in the pod while monitoring the pod-level cgroup for the actual maximum resource consumption. This would let the VPA work correctly for pods with this behavior enabled: in effect, the VPA would operate the same as it does today, except that it would scale memory for the entire pod instead of working at the container level.
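For the single-container half of that setup, the existing VPA API already supports restricting autoscaling to one container. A minimal sketch, assuming a Deployment named app with a container also named app (both names are placeholders); the pod-level cgroup monitoring described above would still require new VPA support:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    # Autoscale only the "app" container's memory...
    - containerName: app
      controlledResources: ["memory"]
    # ...and leave every other container in the pod untouched.
    - containerName: "*"
      mode: "Off"
```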

By allowing the deployment to specify a resource budget for the entire pod, and letting the Linux kernel handle the details of giving each container what it needs when it needs it, the end result is a more flexible arrangement (a sketch of such a pod spec follows this list). This is a win-win scenario:

  • the developer doesn't need to micro-manage container limits, but is also not forced to grant unlimited resources to critical containers,
  • the application suffers less disruption because the VPA isn't evicting the pod to fine-tune individual container limits,
  • the node is less susceptible to the noisy-neighbor problem with pods in the Burstable QoS class (which also improves the scheduler's placement decisions for the same reason).
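To make that concrete, here is a minimal sketch of what such a pod spec might look like. The shareBurstableLimits field is the name discussed below; its exact placement and semantics are illustrative, not a settled API, and the sketch assumes the pod budget is the sum of the container limits:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-limits-demo
spec:
  # Hypothetical knob from this KEP: when true, the container limits
  # are summed into a single limit on the pod-level cgroup instead of
  # being enforced separately on each container's cgroup.
  shareBurstableLimits: true
  containers:
  - name: app
    image: example.com/app:latest        # placeholder image
    resources:
      requests:
        memory: "256Mi"
      limits:
        memory: "512Mi"
  - name: sidecar
    image: example.com/sidecar:latest    # placeholder image
    resources:
      requests:
        memory: "64Mi"
      limits:
        memory: "256Mi"
```

With the knob enabled, the kernel would enforce one 768Mi memory budget for the whole pod, so either container could burst past its own limit as long as the pod's total consumption stays under budget.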

Have you thought about how we could map this pod bounding behavior to a RuntimeClass or kubelet configuration, rather than exposing the knob directly in the end-user pod spec? I am not sure the average end-user would understand shareBurstableLimits.

I see this feature as configuration specific to an individual pod, not as general configuration that applies to all of the pods running on a node. I'm not sure how it would map to a RuntimeClass either.

Maybe a different attribute name would be clearer? Maybe podResourceSharing instead of shareBurstableLimits?

liorokman avatar Mar 06 '20 16:03 liorokman

/cc

dashpole avatar Mar 06 '20 19:03 dashpole

/cc

egernst avatar Mar 10 '20 17:03 egernst

As suggested in the sig-node meeting, I will start by validating the usefulness of this feature with a DaemonSet that directly manipulates the cgroups in parallel with Kubelet.

I will report back once I have some results.

liorokman avatar Mar 18 '20 09:03 liorokman

As suggested in the sig-node meeting, I will start by validating the usefulness of this feature with a DaemonSet that directly manipulates the cgroups in parallel with Kubelet.

The DaemonSet can be found here.
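For reference, the general shape of the approach is sketched below. This is an illustration, not the linked code; it assumes cgroup v1 and the cgroupfs driver, and uses a fixed 768Mi placeholder budget where a real experiment would derive each pod's budget from its containers' limits:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: pod-limit-tuner
spec:
  selector:
    matchLabels:
      app: pod-limit-tuner
  template:
    metadata:
      labels:
        app: pod-limit-tuner
    spec:
      containers:
      - name: tuner
        image: busybox
        securityContext:
          privileged: true   # needed to write to the host's cgroup filesystem
        volumeMounts:
        - name: cgroup
          mountPath: /sys/fs/cgroup
        command: ["/bin/sh", "-c"]
        args:
        - |
          # Periodically set each burstable pod's cgroup memory limit to
          # a pod-level budget. 805306368 bytes (768Mi) is a placeholder;
          # pod selection and budget policy are up to the experiment.
          while true; do
            for pod in /sys/fs/cgroup/memory/kubepods/burstable/pod*; do
              [ -d "$pod" ] || continue
              echo 805306368 > "$pod/memory.limit_in_bytes"
            done
            sleep 10
          done
      volumes:
      - name: cgroup
        hostPath:
          path: /sys/fs/cgroup
```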

liorokman avatar Mar 24 '20 11:03 liorokman

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale

fejta-bot avatar Sep 02 '20 09:09 fejta-bot

/remove-lifecycle stale

liorokman avatar Sep 02 '20 10:09 liorokman

/ok-to-test

odinuge avatar Sep 20 '20 12:09 odinuge

New changes are detected. LGTM label has been removed.

k8s-ci-robot avatar Oct 07 '20 05:10 k8s-ci-robot

Hi all,

As an FYI: Enhancements Freeze is now in effect. If you wish to be included in the 1.20 Release, please submit an Exception Request as soon as possible.

As a note, I do not see a corresponding issue for this PR. We require that all KEPs have a corresponding, milestoned issue in the k/enhancements repo, and that the following were in place by the Enhancements Freeze deadline (which was Oct 6th):

  • The KEP must be merged in an implementable state
  • The KEP must have test plans
  • The KEP must have graduation criteria

Finally, the format of KEPs has changed. Please update your KEP to include the missing sections; see the template for reference: https://github.com/kubernetes/enhancements/tree/master/keps/NNNN-kep-template

Best, Kirsten (1.20 Enhancements Lead)

kikisdeliveryservice avatar Oct 07 '20 06:10 kikisdeliveryservice

/cc

vinaykul avatar Nov 09 '20 05:11 vinaykul

@vinaykul: GitHub didn't allow me to request PR reviews from the following users: vinaykul.

Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Nov 09 '20 05:11 k8s-ci-robot

Ping me when you're ready to revisit this :)

thockin avatar Nov 23 '20 16:11 thockin

Hi @liorokman,

Would you please open a tracking issue in this repo for your KEP and update its number to match? Thanks!

/hold

ehashman avatar Jan 13 '21 18:01 ehashman

Enhancements freeze is next week - are we going to try to get this into 1.21?

thockin avatar Feb 01 '21 23:02 thockin

This currently isn't in the 1.21 node planning doc: https://docs.google.com/document/d/1U10J0WwgWXkdYrqWGGvO8iH2HKeerQAlygnqgDgWv4E/edit#

ehashman avatar Feb 01 '21 23:02 ehashman

/cc

chenyw1990 avatar Apr 27 '21 09:04 chenyw1990

AFAICT, this has no enhancement tracking issue.

thockin avatar Apr 30 '21 22:04 thockin

/cc

n4j avatar Jul 24 '21 06:07 n4j

@liorokman, are you still pursuing this :) ?

n4j avatar Jul 25 '21 05:07 n4j

@liorokman, are you still pursuing this :) ?

Sorry, not anymore.

liorokman avatar Jul 25 '21 16:07 liorokman

@liorokman, are you still pursuing this :) ?

Sorry, not anymore.

Mind if I take a stab at it 🙂?

n4j avatar Jul 26 '21 03:07 n4j