kep-4622: promote topologyManager policy: max-allowable-numa-nodes to GA
- One-line PR description: promote the topologyManager policy option max-allowable-numa-nodes to GA (1.33)
- Issue link: https://github.com/kubernetes/enhancements/issues/4622
- Other comments:
- it was created in 1.31 as a beta feature: https://github.com/kubernetes/enhancements/pull/4624
- The GA graduation would mean:
- the max-allowable-numa-nodes option will graduate to GA
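
For reference, a minimal sketch of how this option is expressed in the kubelet configuration, assuming the v1beta1 KubeletConfiguration API; the single-numa-node policy and the value of 16 are only example choices, not something this KEP prescribes:

```go
package main

import (
	"fmt"

	kubeletconfigv1beta1 "k8s.io/kubelet/config/v1beta1"
)

func main() {
	// max-allowable-numa-nodes is passed through topologyManagerPolicyOptions,
	// alongside whichever Topology Manager policy and scope are in use.
	cfg := kubeletconfigv1beta1.KubeletConfiguration{
		TopologyManagerPolicy: "single-numa-node",
		TopologyManagerScope:  "container",
		TopologyManagerPolicyOptions: map[string]string{
			// Raises the kubelet's historical admission limit of 8 NUMA nodes.
			"max-allowable-numa-nodes": "16",
		},
	}
	fmt.Printf("topologyManagerPolicyOptions: %v\n", cfg.TopologyManagerPolicyOptions)
}
```

In a node's YAML configuration, the same settings appear under the topologyManagerPolicy, topologyManagerScope, and topologyManagerPolicyOptions keys.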
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: cyclinder
Once this PR has been reviewed and has the lgtm label, please assign johnbelamaric for approval. For more information see the Code Review Process.
The full list of commands accepted by this bot can be found here.
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
I am supportive of graduating features to GA but not sure if we will get PRR in time as this came in late.
> I am supportive of graduating features to GA but not sure if we will get PRR in time as this came in late.
same. I'll still be helping here with the reviews and cleaning up the feature, but if we miss the deadline (as unfortunately seems likely) it's not too bad: we can prepare and be ready for an early 1.34 merge.
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.
This bot triages PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the PR is closed
You can:
- Mark this PR as fresh with /remove-lifecycle stale
- Close this PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
/retitle kep-4622: promote topologyManager policy: max-allowable-numa-nodes to GA
The only improvement I can imagine is extending that metric.
Agreed. Do you mean we need another metric for this option? We already have an existing metric, topology_manager_admission_duration_ms.
Do we already have tests in the codebase to ensure that this policy option works as expected and is compatible with other Topology Manager policy options, or is that planned as part of the GA graduation?
We already have an e2e test for this in https://github.com/kubernetes/kubernetes/pull/124148. see https://github.com/kubernetes/kubernetes/blob/0154f8a222c38d5808f1aecf218379cedf15e8a7/test/e2e_node/topology_manager_test.go#L257
I think we can have some e2e tests which happen to set this value and then run a subset of the existing topology manager tests. This should be easy, cheap enough and will give us enough coverage. I can't say if this warrants a beta2 or can be done in the context of the GA graduation.
I am open to adding some other e2e tests. Do we need to do this in 1.34 or 1.35?
> I think we can have some e2e tests which happen to set this value and then run a subset of the existing topology manager tests. This should be easy, cheap enough and will give us enough coverage. I can't say if this warrants a beta2 or can be done in the context of the GA graduation.
> I am open to adding some other e2e tests. Do we need to do this in 1.34 or 1.35?
From what I gathered, it's fair and accepted to add e2e tests in the same cycle in which we graduate to beta. But everything should be in place before moving to GA.
Looks like the label lead-opt-in was missing, so this PR didn't get a PRR review?
Let's identify all the gaps from the sig-node perspective, work on them, and aim for an early, trivial GA move next cycle.
According to https://github.com/kubernetes/enhancements/pull/5242, my understanding is that we need another beta to complete the missing e2e tests. Other than that, LGTM. Not adding the label because this PR wants to move to GA, and it turns out we need a beta2 to address the gaps.
According to my understanding, I need to submit another PR, similar to https://github.com/kubernetes/enhancements/pull/5242, and also add an e2e test to Kubernetes. Only after these are completed can this PR receive the required label and be merged, right?
> I think we can have some e2e tests which happen to set this value and then run a subset of the existing topology manager tests
We already have an e2e test for this in https://github.com/kubernetes/kubernetes/pull/124148; see https://github.com/kubernetes/kubernetes/blob/0154f8a222c38d5808f1aecf218379cedf15e8a7/test/e2e_node/topology_manager_test.go#L1449. This is the most basic verification that the changed setting does not break anything for maxNUMANodes(16). Do we still need an additional e2e test?
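
For illustration only, a rough sketch of the kind of additional e2e_node case discussed above. This is not the existing test in test/e2e_node/topology_manager_test.go, and the two helpers are placeholders standing in for whatever the suite already provides for restarting the kubelet and running admission scenarios:

```go
package e2enode

import (
	"github.com/onsi/ginkgo/v2"
)

// Placeholder helpers: the real e2e_node suite already has equivalents for
// restarting the kubelet with an updated configuration and for running
// guaranteed-pod admission scenarios; these stubs only mark where they
// would be called.
func restartKubeletWithPolicyOptions(opts map[string]string) { /* placeholder */ }
func runGuaranteedPodAdmissionScenario()                     { /* placeholder */ }

var _ = ginkgo.Describe("Topology Manager with max-allowable-numa-nodes [Serial]", func() {
	ginkgo.It("still admits guaranteed pods when the NUMA limit is raised", func() {
		// Restart the kubelet with the option set, then re-run a subset of the
		// existing topology manager scenarios to confirm admission behaviour is
		// unchanged on nodes with 8 or fewer NUMA nodes.
		restartKubeletWithPolicyOptions(map[string]string{
			"max-allowable-numa-nodes": "16",
		})
		runGuaranteedPodAdmissionScenario()
	})
})
```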
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.
This bot triages PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the PR is closed
You can:
- Mark this PR as fresh with /remove-lifecycle stale
- Close this PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
@haircommander I'm updating it (preserving @cyclinder's authorship, of course) in https://github.com/kubernetes/enhancements/pull/5627
Got it, thanks. I updated the description in https://github.com/kubernetes/enhancements/issues/4622 to reflect this.