k8s.io
N2 Quota changes for Scale Projects
The Kubernetes project uses E2 instances on GCP by default, unless we are testing something that requires specific instance types (GPU tests, scale/perf testing, arm64).
k/k change: https://github.com/kubernetes/kubernetes/pull/118626
With E2, the VMs issued by Google can run on modern AMD EPYC or ancient Intel Skylake hosts. Scale-job control-plane nodes, however, need to run consistently on high-performance instances, so they will be using N2 machine types with Ice Lake CPUs.
N2 quotas are not yet set up properly, so this issue will track quota failures from k8s-infra-e2e-scale-project-XX and fix them as reported.
- ~k8s-infra-e2e-scale-03 https://prow.k8s.io/view/gs/kubernetes-jenkins/logs/ci-kubernetes-e2e-gci-gce-scalability/1745631298405273600~
- ~k8s-infra-e2e-scale-04 https://prow.k8s.io/view/gs/kubernetes-jenkins/logs/ci-kubernetes-e2e-gci-gce-scalability/1745631298405273600~
- ~k8s-infra-e2e-scale-01 https://prow.k8s.io/view/gs/kubernetes-jenkins/logs/ci-kubernetes-e2e-gci-gce-scalability/1745623497192771584~
- ~k8s-infra-e2e-scale-05~
- ~k8s-infra-e2e-scale-02~
Quotas for N2 CPUs will be bumped to 1000 in us-east1. Please ensure that jobs are running in this region.
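To confirm the bumps landed, something like the following could be run against each scale project. This is a minimal sketch, not an official tool: it assumes the google-api-python-client library, Application Default Credentials with read access to the projects, the project IDs from the checklist above, and the 1000-CPU target from this comment.

```python
# Sketch: report the N2_CPUS quota in us-east1 for each scale project.
# Compute Engine regions.get returns a "quotas" list of {metric, limit, usage} entries.
from googleapiclient import discovery

REGION = "us-east1"
TARGET = 1000  # limit requested in this issue
PROJECTS = [   # from the checklist above
    "k8s-infra-e2e-scale-01",
    "k8s-infra-e2e-scale-02",
    "k8s-infra-e2e-scale-03",
    "k8s-infra-e2e-scale-04",
    "k8s-infra-e2e-scale-05",
]

compute = discovery.build("compute", "v1")
for project in PROJECTS:
    region = compute.regions().get(project=project, region=REGION).execute()
    n2 = next((q for q in region["quotas"] if q["metric"] == "N2_CPUS"), None)
    if n2 is None:
        print(f"{project}: no N2_CPUS quota entry in {REGION}")
        continue
    status = "OK" if n2["limit"] >= TARGET else "needs bump"
    print(f"{project}: N2_CPUS limit={n2['limit']} usage={n2['usage']} ({status})")
```

The same numbers are visible with `gcloud compute regions describe us-east1 --project=<project>`, which prints the region's quota list.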
/sig testing
/sig scalability
/priority critical-urgent
Related failure in PR: https://github.com/kubernetes/perf-tests/pull/2494
Example run: https://prow.k8s.io/view/gs/kubernetes-jenkins/pr-logs/pull/perf-tests/2494/pull-perf-tests-clusterloader2/1744323040390418432
Project: k8s-presubmit-scale-36
- Quota 'N2_CPUS' exceeded. Limit: 24.0 in region us-east1.
metric name = compute.googleapis.com/n2_cpus
limit name = N2-CPUS-per-project-region
limit = 24.0
dimensions = region: us-east1
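The limit of 24 N2 CPUs in k8s-presubmit-scale-36 means even a single large N2 control-plane VM will trip the quota. A hypothetical preflight check along these lines could surface that before the cluster is brought up (sketch only; the 32-vCPU request below is an illustrative size, not what the job actually provisions):

```python
# Hypothetical preflight check (sketch): fail fast when the free N2_CPUS quota
# in the target region cannot cover the control plane the job wants to create.
from googleapiclient import discovery


def free_n2_cpus(project: str, region: str) -> float:
    compute = discovery.build("compute", "v1")
    quotas = compute.regions().get(project=project, region=region).execute()["quotas"]
    quota = next(q for q in quotas if q["metric"] == "N2_CPUS")
    return quota["limit"] - quota["usage"]


requested = 32  # illustrative, e.g. one n2-standard-32 VM; not the job's real size
available = free_n2_cpus("k8s-presubmit-scale-36", "us-east1")
if requested > available:
    raise SystemExit(
        f"N2_CPUS quota too low in us-east1: need {requested}, only {available} free"
    )
```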
Projects starting with `k8s-*` are part of the google.com org that we don't manage. Please migrate those projects to the community infrastructure.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Mark this issue as fresh with `/remove-lifecycle stale`
- Close this issue with `/close`
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:
- Mark this issue as fresh with `/remove-lifecycle rotten`
- Close this issue with `/close`
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten