
Internal error occurred: failed to allocate a serviceIP: range is full

Open alexellis opened this issue 2 years ago • 6 comments

How to use it?

  • [X] kwok
  • [ ] kwokctl --runtime=docker (default runtime)
  • [ ] kwokctl --runtime=binary
  • [ ] kwokctl --runtime=nerdctl
  • [ ] kwokctl --runtime=kind

What happened?

I created a cluster with 220 nodes, then ran an operator that creates a Deployment and a Service per CR, and created 5000 different CRs across 4 namespaces, totalling 20,000 Deployments and Services.

Internal error occurred: failed to allocate a serviceIP: range is full

This happened with only ~220-250 Services in the whole cluster.

What did you expect to happen?

As in a normal Kubernetes cluster, the Service IPs should be allocated without error; 5k per namespace is well within limits.

How can we reproduce it (as minimally and precisely as possible)?

Write a bash for loop to create > 255 services.
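For example, a minimal sketch of such a loop (the Service name prefix and count are arbitrary):

```bash
# Create 300 ClusterIP Services so the default service IP range runs out
# after roughly 250 allocations.
for i in $(seq 1 300); do
  kubectl create service clusterip "svc-$i" --tcp=80:80
done
```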

Anything else we need to know?

No response

Kwok version

$ kwok --version
kwok version v0.4.0 go1.20.7 (linux/amd64)

$ kwokctl --version
kwokctl version v0.4.0 go1.20.7 (linux/amd64)

OS version

```console
Linux bq 5.15.0-86-generic #96~20.04.1-Ubuntu SMP Thu Sep 21 13:23:37 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

NAME="Ubuntu"
VERSION="20.04.6 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.6 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
```

alexellis avatar Oct 19 '23 13:10 alexellis

As the error message indicates, the service IP range is exhausted. You can try MultiCIDRServiceAllocator to allow more IPs to be used, or modify --service-cluster-ip-range= as below; either way the cluster needs to be recreated.

# ~/.kwok/kwok.yaml

kind: KwokctlConfiguration
apiVersion: config.kwok.x-k8s.io/v1alpha1
componentsPatches:
- name: kube-apiserver
  extraArgs:
  - key: service-cluster-ip-range
    value: "10.96.0.0/12"
- name: kube-controller-manager
  extraArgs:
  - key: service-cluster-ip-range
    value: "10.96.0.0/12"

wzshiming avatar Oct 20 '23 02:10 wzshiming

Thanks for the reply. Is it possible to use a single /24 CIDR instead with a 192.168.x.x range?

alexellis avatar Oct 23 '23 10:10 alexellis

Judging from the apiserver code, that does not seem possible.

wzshiming avatar Oct 23 '23 10:10 wzshiming

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Mar 17 '24 08:03 k8s-triage-robot

/remove-lifecycle stale

wzshiming avatar Mar 18 '24 09:03 wzshiming

Any update on this issue?

Vivekgaddigi avatar Jun 07 '24 22:06 Vivekgaddigi

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Sep 05 '24 23:09 k8s-triage-robot