managed-cluster-config
managed-cluster-config copied to clipboard
OCM-20819 | CNV-56734 | feat: prevent vCPU overcommit
What type of PR is this?
feature
What this PR does / why we need it?
AWS is legally responsible for having limited and accurate vCPU allocation for Microsoft License-Included customers on AWS EC2. Having a misallocation of vCPU would result in a breach of agreement between AWS and Microsoft for License-Included Windows software. This change deploys VAP on HCPs that prevents the overcommit of vCPUs for windows VMs.
Which Jira/Github issue(s) this PR fixes?
Special notes for your reviewer:
Pre-checks (if applicable):
-
[ ] Tested latest changes against a cluster
-
[ ] Included documentation changes with PR
-
[ ] If this is a new object that is not intended for the FedRAMP environment (if unsure, please reach out to team FedRAMP), please exclude it with:
matchExpressions: - key: api.openshift.com/fedramp operator: NotIn values: ["true"]
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: BraeTroutman Once this PR has been reviewed and has the lgtm label, please assign smarthall for approval. For more information see the Code Review Process.
The full list of commands accepted by this bot can be found here.
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
/retest
For other reviewers: here's what Claude said when I asked it to compare this PR against #2535 (the previous attempt):
Summary of Differences Between PR 2535 and PR 2588
Based on my analysis, here are the key differences between PR 2588 (open) and PR 2535 (merged then reverted):
Main Functional Change:
1. Additional cluster selector for Windows License-Included clusters
- PR 2588 adds a new cluster selector requirement:
api.openshift.com/win-li: "true"- This appears in multiple files:
deploy/srep-vap/vcpu-overcommit/config.yamldeploy/acm-policies/50-GENERATED-srep-vap-vcpu-overcommit.Policy.yaml- All three hack template files (integration, production, stage)
Minor Changes:
2. Whitespace cleanup
- PR 2588 removes a trailing blank line in
101-windows-11-vcpu-restrict.yaml(69 lines → 68 lines)3. Different approach in
scripts/generate-policy-config.py
- PR 2535: Simply added
"srep-vap/vcpu-overcommit"to the existing directories list- PR 2588: Commented out ALL existing directories and only includes
"srep-vap/vcpu-overcommit"
- This is a significant testing/development change that likely shouldn't be in production
Impact Analysis:
The most important difference is the addition of the
api.openshift.com/win-lilabel selector. This means:
- PR 2535: Would deploy to all HCP clusters (except 4.14-4.18)
- PR 2588: Will only deploy to HCP clusters that also have the Windows License-Included label
The
scripts/generate-policy-config.pychange in PR 2588 appears to be an unintended development/testing change where all other directories were commented out. This should likely be corrected before merging.
@BraeTroutman can you address the scripts/generate-policy-config.py issue? Can you also provide steps on how to test this PR in stage, so we can confirm that this won't cause an issue with existing Windows VMs in production?
@BraeTroutman: all tests passed!
Full PR test history. Your PR dashboard.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.