[FG:InPlacePodVerticalScaling] Tracking TODO items to address pre-beta, at beta, GA, and GA+1
What would you like to be added?
This enhancement tracks various TODO items from alpha to GA for the In-Place Pod Vertical Scaling feature. To find pending TODO items in the k/k repo, run:
`git grep TODO | grep InPlacePodVerticalScaling`
- In pkg/kubelet/kubelet_pods.go, update PodStatus.Resources to include extended resources. Target: < Beta.
- In pkg/kubelet/kubelet.go, investigate calling kl.handlePodResourcesResize in HandlePodUpdates + periodic SyncLoop.
- In pkg/kubelet/kubelet.go, can we recover from a SetPodAllocation/SetPodResizeStatus checkpointing failure if it were to occur? Target: < Beta.
- In pkg/kubelet/kuberuntime/helpers_linux.go, address the issue that sets the min req/limit to 2m/10m (see the CPU-conversion sketch after this list). Target: < Beta.
- In pkg/kubelet/kuberuntime/kuberuntime_manager.go, figure out enforceMemoryQoS usage in a platform-agnostic way. Target: < Beta.
- In pkg/kubelet/cri/remote/remote_runtime.go, remove v1alpha2 support for Windows if confirmed as unnecessary. Target: < Beta.
- In test/e2e/node/pod_resize.go, remove featureGatePostAlpha var. Target: Beta.
- In pkg/apis/core/validation/validation.go, remove updatablePodSpecFieldsNoResources variable. Target: GA.
- In pkg/apis/core/validation/validation.go, investigate if PodStatus.QOSClass can replace qos.GetPodQOS(). Target: GA.
- In pkg/kubelet/container/helpers.go, remove HashContainerWithoutResources() and associated code. Target: GA+1.
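For context on the 2m/10m floor in the helpers_linux.go item above: with 1024 cpu.shares per CPU, the kernel's minimum of 2 shares corresponds to roughly 2m of CPU, and the minimum enforceable CFS quota of 1000µs over a 100000µs period corresponds to 10m. The sketch below shows that math; the constant values are the usual Linux floors, but the function names and structure are illustrative, not the exact kubelet code:

```go
package sketch

// Illustrative sketch of the cgroup CPU math behind the "min req/limit
// 2m/10m" TODO; names are hypothetical, values are the usual Linux floors.
const (
	milliCPUToCPU  = 1000   // millicores per CPU
	sharesPerCPU   = 1024   // cpu.shares granted per full CPU
	minShares      = 2      // kernel floor on cpu.shares (~2m CPU)
	quotaPeriod    = 100000 // default CFS period, microseconds
	minQuotaPeriod = 1000   // kernel floor on cfs_quota_us (~10m per period)
)

// milliCPUToShares converts a CPU request to cpu.shares; requests below
// ~2m get clamped up to the kernel minimum.
func milliCPUToShares(milliCPU int64) int64 {
	if milliCPU == 0 {
		return minShares
	}
	shares := milliCPU * sharesPerCPU / milliCPUToCPU
	if shares < minShares {
		shares = minShares
	}
	return shares
}

// milliCPUToQuota converts a CPU limit to a CFS quota; limits below ~10m
// get clamped up to the minimum enforceable quota.
func milliCPUToQuota(milliCPU int64) int64 {
	if milliCPU == 0 {
		return 0 // no limit
	}
	quota := milliCPU * quotaPeriod / milliCPUToCPU
	if quota < minQuotaPeriod {
		quota = minQuotaPeriod
	}
	return quota
}
```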
TODOs not tracked in code:
- Add and expose a helper function to get a pod's resource requirements and allocations, for use by metrics, kubectl describe, etc. (a sketch follows below).
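A minimal sketch of what such a shared helper could look like; `aggregateRequests` is a hypothetical name, not an existing Kubernetes helper. It assumes the v1 API types and the alpha ContainerStatus.AllocatedResources field, and omits init containers and pod overhead for brevity:

```go
package sketch

import (
	v1 "k8s.io/api/core/v1"
)

// aggregateRequests sums per-container requests across a pod, preferring the
// allocated values recorded in status (if any) over the spec requests, so
// callers such as metrics or kubectl describe see the last admitted resize.
func aggregateRequests(pod *v1.Pod) v1.ResourceList {
	total := v1.ResourceList{}
	for i := range pod.Spec.Containers {
		c := &pod.Spec.Containers[i]
		requests := c.Resources.Requests
		for j := range pod.Status.ContainerStatuses {
			cs := &pod.Status.ContainerStatuses[j]
			if cs.Name == c.Name && cs.AllocatedResources != nil {
				requests = cs.AllocatedResources // last admitted allocation
				break
			}
		}
		for name, qty := range requests {
			if cur, ok := total[name]; ok {
				cur.Add(qty)
				total[name] = cur
			} else {
				total[name] = qty.DeepCopy()
			}
		}
	}
	return total
}
```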
Why is this needed?
These TODO items were found during review of PR https://github.com/kubernetes/kubernetes/pull/102884/, and it was agreed that they should not block alpha. Most need to be handled before Beta, and a few need to be addressed at GA or GA+1.
/sig node
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with `/remove-lifecycle stale`
- Mark this issue or PR as rotten with `/lifecycle rotten`
- Close this issue or PR with `/close`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
/lifecycle stale
/remove-lifecycle stale
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
You can:
- Mark this issue as fresh with `/remove-lifecycle rotten`
- Close this issue with `/close`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
You can:
- Reopen this issue with `/reopen`
- Mark this issue as fresh with `/remove-lifecycle rotten`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
/reopen
@vinaykul: Reopened this issue.
/remove-lifecycle rotten
/lifecycle stale
/lifecycle rotten
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
/reopen
@esotsal: You can't reopen an issue/PR unless you authored it or you are a collaborator.
/reopen
@Karthik-K-N: Reopened this issue.
/assign
/remove-lifecycle rotten
/triage accepted
Updated list of TODOs:
- [ ] pkg/apis/core/validation/validation.go:5071 - Drop this var once InPlacePodVerticalScaling goes GA and featuregate is gone.
- [ ] pkg/kubelet/cm/cgroup_manager_linux.go:645 - Add memory request support
- [ ] pkg/kubelet/cm/cgroup_manager_linux.go:731 - Add memory request support
- [ ] pkg/kubelet/container/helpers.go:128 - Remove this in GA+1 and make HashContainerWithoutResources to become Hash (see the hash sketch after this list).
- [ ] pkg/kubelet/container/runtime.go:301 - Remove this in GA+1 and make HashWithoutResources to become Hash.
- [ ] pkg/kubelet/kubelet.go:1962 - Investigate doing this in HandlePodUpdates + periodic SyncLoop scan
- [ ] pkg/kubelet/kubelet.go:2588 - Can we recover from this in some way? Investigate
- [ ] pkg/kubelet/kubelet.go:2847 - Can we recover from this in some way? Investigate
- [ ] pkg/kubelet/kubelet.go:2855 - Can we recover from this in some way? Investigate
- [ ] pkg/kubelet/kubelet_pods.go:2107 - Update this to include extended resources in
- [ ] pkg/kubelet/kuberuntime/helpers_linux.go:63 - Address issue that sets min req/limit to 2m/10m before beta
- [ ] pkg/kubelet/kuberuntime/kuberuntime_container_linux_test.go:867 - Add unit tests for cgroup v1 & v2
- [ ] pkg/kubelet/kuberuntime/kuberuntime_manager.go:662 - Figure out best way to get enforceMemoryQoS value (parameter #4 below) in platform-agnostic way
- [ ] pkg/scheduler/internal/queue/scheduling_queue.go:1074 - Fix this to determine when a
- [ ] test/e2e/node/pod_resize.go:85 - Can we optimize this?
- [ ] test/e2e/node/pod_resize.go:334 - Is there a better way to determine this?
- [ ] test/e2e/node/pod_resize.go:500 - Remove this check once base-OS updates to containerd>=1.6.9
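To illustrate the HashContainerWithoutResources entries in the list above: the container hash has to ignore Resources so that an in-place resize is not mistaken for a spec change that requires a restart. The sketch below is illustrative only; the real kubelet hashes the spec with its own deep-hash utility, and `hashContainerIgnoringResources` is a hypothetical name:

```go
package sketch

import (
	"encoding/json"
	"hash/fnv"

	v1 "k8s.io/api/core/v1"
)

// hashContainerIgnoringResources hashes a container spec with Resources
// cleared, so that resizing CPU/memory in place does not change the hash
// and therefore does not trigger a container restart.
func hashContainerIgnoringResources(c *v1.Container) (uint64, error) {
	clone := c.DeepCopy()
	clone.Resources = v1.ResourceRequirements{} // resizes must not affect the hash
	data, err := json.Marshal(clone)            // deterministic: JSON sorts map keys
	if err != nil {
		return 0, err
	}
	h := fnv.New64a()
	h.Write(data)
	return h.Sum64(), nil
}
```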
@esotsal would you mind updating the issue description to merge it with this list?
@tallclair @esotsal I've converted it to a task list, please review. For the checkpoint failure TODOs, I want to toss out the node-local checkpointing code entirely and rely on podStatus as the source of truth. (Please see https://github.com/kubernetes/kubernetes/pull/119665.) @ndixita Does it still make sense to support setting the memory request?
- [ ] pkg/kubelet/container/runtime.go:301 - Remove this in GA+1 and make HashWithoutResources to become Hash.
- [ ] pkg/kubelet/kubelet.go:1962 - Investigate doing this in HandlePodUpdates + periodic SyncLoop scan
With the merge of https://github.com/kubernetes/kubernetes/pull/124220, I think both issues have been addressed.
@esotsal would you mind updating the issue description to merge it with this list?
Unfortunately @tallclair I am not allowed to modify issue description :-( , perhaps better to assign @vinaykul for this or close this issue and use https://github.com/orgs/kubernetes/projects/178 instead to track TODOs ?
I've updated it. I don't suppose there is a way to transfer ownership of the issue, is there? (I can periodically keep it updated as bandwidth permits, but let's use the project board for the most current info.)
- test/e2e/node/pod_resize.go:334 - Is there a better way to determine this? => Please check here proposal
- test/e2e/node/pod_resize.go:85 - Can we optimize this? => Please check here proposal
- test/e2e/node/pod_resize.go:500 - Remove this check once base-OS updates to containerd>=1.6.9 => Please check here proposal
With the merge of #124296 we can update test/e2e accordingly and close those three TODOs, I believe.
Is there a plan for https://kubernetes.io/docs/concepts/workloads/pods/downward-api/#downwardapi-resourceFieldRef ? I guess this would work only for files?
/unassign