enhancements icon indicating copy to clipboard operation
enhancements copied to clipboard

KEP-3008: QoS-class resources

Open marquiz opened this issue 3 years ago • 87 comments

  • One-line PR description: New KEP for adding class-based resources to CRI protocol
  • Issue link: #3008
  • Other comments:

marquiz avatar Oct 07 '21 14:10 marquiz

Skipping CI for Draft Pull Request. If you want CI signal for your change, please convert it to an actual PR. You can still manually trigger a test run with /test all

k8s-ci-robot avatar Oct 07 '21 14:10 k8s-ci-robot

/cc @kad @haircommander

marquiz avatar Oct 07 '21 14:10 marquiz

Hi @marquiz

All KEP PRs must have an open issue in k/enhancements (this repo). Please open an issue and fill it out completely and rename this PR to KEP-issue number but in the title of this PR and in your README.md

Thanks!

kikisdeliveryservice avatar Oct 08 '21 21:10 kikisdeliveryservice

All KEP PRs must have an open issue in k/enhancements (this repo). Please open an issue and fill it out completely and rename this PR to KEP-issue number but in the title of this PR and in your README.md

Thanks for the guidance @kikisdeliveryservice. Done

marquiz avatar Oct 11 '21 06:10 marquiz

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Jan 17 '22 12:01 k8s-triage-robot

/retitle KEP-3008: Class-based resources

marquiz avatar Feb 01 '22 09:02 marquiz

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Mar 31 '22 19:03 k8s-triage-robot

Working hard on this one /remove-lifecycle rotten

marquiz avatar Apr 07 '22 13:04 marquiz

cc @phanhuy1502 a KEP related to Intel RDT

haosdent avatar Apr 12 '22 17:04 haosdent

@marquiz I saw similar implementations in https://koordinator.sh/docs/core-concepts/resource-model and https://github.com/Tencent/caelus . Maybe you could take references. Their user stories are colocating batch jobs with online services, while the generic cpuset/network bandwidth limit is insufficient.

haosdent avatar Apr 12 '22 18:04 haosdent

@marquiz I saw similar implementations in https://koordinator.sh/docs/core-concepts/resource-model and https://github.com/Tencent/caelus . Maybe you could take references. Their user stories are colocating batch jobs with online services, while the generic cpuset/network bandwidth limit is insufficient.

Thanks @haosdent! On quick glance those look more like solutions for dynamic load balancing and more intelligent scheduling. We're aiming a bit lower level here: low-level support for enabling these class-based/qos type of resources. I think they could start to use this mechanism (class resources in K8s) instead of resorting to project-specific "sideband" solutions.

marquiz avatar Apr 13 '22 08:04 marquiz

RDT support in CRI-O released v1.22, containerd merged)

More details in https://github.com/kubernetes/kubernetes/issues/92287#issuecomment-1011165300

pacoxu avatar May 26 '22 07:05 pacoxu

I included this to 1.25 milestone even there are some open questions raised at SIG Node last time we met.

dchen1107 avatar Jun 09 '22 21:06 dchen1107

/cc @endocrimes

marquiz avatar Jun 10 '22 14:06 marquiz

I included this to 1.25 milestone even there are some open questions raised at SIG Node last time we met.

Since then I removed the Pod spec changes and replaced those with annotations. The annotations would be basically fall-through to the corresponding fields in the CRI API with no logic/processing done on kubelet side. Another open was blockio which we discussed and I demonstrated in last SIG Node meeting (2022-06-07).

We'll have a re-review in next SIG-Node

marquiz avatar Jun 10 '22 14:06 marquiz

/cc @mikebrow

marquiz avatar Jun 13 '22 13:06 marquiz

Updates:

  • rebased
  • sync with latest KEP template
  • add graduation criteria
  • update test plan
  • add kep.yaml
  • add prod-readiness/sig-node/3008.yaml (placeholder)

marquiz avatar Jun 13 '22 16:06 marquiz

Working implementation in my personal repo: https://github.com/marquiz/kubernetes/tree/devel/class-resources-annotations

marquiz avatar Jun 13 '22 16:06 marquiz

Thanks @SergeyKanzhelev for your insightful review. I answered all you comments but didn't address the (im)mutability. I want to ponder that a bit more before updating the KEP 🙂

marquiz avatar Jun 15 '22 15:06 marquiz

Some slightly larger updates to the KEP:

  • The naming ("class-based resources" or "class resources") has been a great source of confusion. I now renamed it to "QoS-class resources". Any thoughts about this?
  • I responded to @SergeyKanzhelev´s feedback on resource updates: added support to UpdateContainerResourcesRequest message and also discuss immutable resources

marquiz avatar Jun 17 '22 19:06 marquiz

Relatedly -- what do the key and value here represent here? I assume the key would be the class name, but what is the value? Or is the key something else and the value is the class name?

Key is the resource type (name of the resoure type, e.g. rdt or blockio. Value is the class (within that resource type) that the container gets assignmet to. I reworded this part now so I hope it is a bit clearer.

marquiz avatar Jun 21 '22 14:06 marquiz

Also, you don't really need to talk about future stuff for the PRR - it can be scoped to this KEP which you say is only Phase 1.

OK, thanks. Do you think I should remove the mentions about future implementation phases in the PRR questionnaire?

marquiz avatar Jun 22 '22 08:06 marquiz

No need to remove the later phases stuff from PRR. PRR looks fine, but I will approve after the SIG approval is in.

johnbelamaric avatar Jun 23 '22 03:06 johnbelamaric

Renaming to sync with the latest content

/retitle QoS-class resources

marquiz avatar Jul 08 '22 14:07 marquiz

Review comments (especially from @SergeyKanzhelev and @klueska) regarding e.g. updates/scaling made me re-think the relation between container-level vs. pod-level QoS-class resources. It was fuzzy, not clearly spelled-out (or even thought-out) and certainly prone to create confusion. I now made a "medium-size" update to the KEP clarifying this, making clear separation between pod-level and container-level QoS-class resources (I think this is the way it should be 😇).

TODO: Some of the future steps sections need to be synced with this.

marquiz avatar Jul 08 '22 15:07 marquiz

Thanks for the review comments @sftim!

I need to call it a day now and will be off for four weeks 🙄 I will get back to this right after my break.

marquiz avatar Jul 08 '22 17:07 marquiz

@sftim et al, I today submitted a PR to K8s blog about this subject: https://github.com/kubernetes/website/pull/34880

marquiz avatar Jul 08 '22 17:07 marquiz

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: marquiz Once this PR has been reviewed and has the lgtm label, please assign johnbelamaric for approval by writing /assign @johnbelamaric in a comment. For more information see:The Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot avatar Sep 13 '22 12:09 k8s-ci-robot

Thanks for the review @haircommander! I tried to answer and address your comments the best I can. One notable change is that I moved future steps from Non-goals to the Goals section. They are not goals of the first phase 1 but are goals of the KEP going to beta and further.

marquiz avatar Sep 19 '22 17:09 marquiz

Thanks for this KEP!

I have a general question. Is the idea that any QoS classes will always be plumbed into CRI? I'm asking because future versions of PD on GCP are going to allow dynamic allocation of IOPS per volume (ie like this but able to be modified after creation).

This means that a controller could look at pod IO requirements (set through a resource request or some QoS class) and adjust provisioning on volumes used by the pod, or if a set of pods is using the same volume as read-only-many, the IO requirements could be scaled on the volume as well.

Do you think this sort of usage is in-scope for this KEP, or is this really focused on k8s->CRI features?

mattcary avatar Sep 21 '22 23:09 mattcary

+1

duanmengkk avatar Sep 22 '22 01:09 duanmengkk