
Block changing nonMasqueradeCIDR

Open justinsb opened this issue 8 years ago • 10 comments

Changing nonMasqueradeCIDR on an existing cluster does not end well, because the service IPs end up out of range and cannot be changed.

We should either come up with a way to rejig the service IPs, or just prohibit this change entirely in validation.

justinsb · Feb 01 '17

For some additional context here: I discovered that Calico is the problem when changing this CIDR in our clusters. It should be possible to change it and do a rolling-update to have all pods come up cleanly on the new network, but for some reason, even if you re-run Calico's config-calico one-time job again, it ADDS the new CIDR to the configuration rather than replacing the old one, and all the entries in etcd for Calico's pod assignments stay the same.

blakebarnett · Feb 20 '17

even if you re-run Calico's config-calico one-time job again, it ADDS the new CIDR to the configuration rather than replacing the old one

Yeah, that's expected. You can still delete the old one, but it's an extra calicoctl command that needs to be run.

and all the entries in etcd for calico's pod assignments stay the same.

A rolling-update of all Pods in the cluster will fix this so long as it's done after adding the new IP Pool and deleting the old one.
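
For anyone landing here later, a minimal sketch of that sequence with v3-style calicoctl resources (the pool name, the new-pool.yaml file name, and the choice of rolling the nodes are illustrative, not the only way to do it):

```sh
# Look at the pools Calico currently knows about.
calicoctl get ippool -o wide

# Add a pool covering the new pod range (new-pool.yaml is a hypothetical
# IPPool manifest whose spec.cidr is the new range).
calicoctl apply -f new-pool.yaml

# Remove the old pool so new allocations can only come from the new range.
calicoctl delete ippool default-ipv4-ippool

# Recreate every pod so it picks up an address from the new pool,
# e.g. by rolling all the nodes.
kops rolling-update cluster --yes
```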

It seems reasonable to block changing this on a live cluster. It's going to require re-configuring a number of components and restarting lots of pods, so it's a pretty disruptive operation.

caseydavenport · May 29 '17

Yeah, I got it to work doing as you said, but it was definitely not a simple/clean process :)

blakebarnett · May 30 '17

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta. /lifecycle stale

fejta-bot · Dec 25 '17

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta. /lifecycle rotten /remove-lifecycle stale

fejta-bot · Jan 24 '18

/lifecycle frozen

chrislovecnm · Jan 24 '18

@blakebarnett I know this is very old and so a long shot, but do you perhaps have the steps you went through to make the CIDR change? I'm in the same spot and was wondering how to go about it.

gootik · Mar 02 '23

My memory of it is pretty fuzzy, but I'm pretty sure after doing the change and then removing the old CIDR from the calico configuration, we just did a forced update of all the nodes and things came back online.

blakebarnett · Mar 02 '23

@blakebarnett Thank you! Will give it a go and hope for the best :D Thanks again.

gootik · Mar 02 '23

As a data point, I had to do this because the default nonMasqueradeCIDR overlaps with Tailscale's IP range (almost exactly, actually), which caused pods to be unable to communicate while Tailscale was running. The procedure was painful, so I'm going to note it down here for any future travellers who must change their cluster CIDR despite the warnings. This assumes you're running Calico; I haven't tested with other networking plugins (I'm just happy it's running again). You will have downtime.

  • Change nonMasqueradeCIDR to whatever range you need. I set it to 10.244.0.0/16 since I remembered it being a "safe" range from Flannel.
  • kops update cluster --yes
  • Install calicoctl
  • calicoctl get ippool default-ipv4-ippool -oyaml > new-ip-pool.yaml
  • Edit new-ip-pool.yaml to point to the second half of the IP range (a sketch of the edited manifest follows this list). I don't know why Calico does this, but it only takes half of the IP range. In my case I set it to 10.244.128.0/17
  • calicoctl delete ippool default-ipv4-ippool
  • calicoctl apply -f new-ip-pool.yaml
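
For reference, the edited new-ip-pool.yaml ends up looking roughly like this (a sketch only; keep every field from the exported pool and change nothing but spec.cidr):

```yaml
apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
  name: default-ipv4-ippool
spec:
  cidr: 10.244.128.0/17   # second half of the new nonMasqueradeCIDR
  # ...leave the remaining spec fields (blockSize, ipipMode/vxlanMode,
  # natOutgoing, nodeSelector, ...) exactly as they were exported.
```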

At this point you're going to have a cluster that's going to start acting wonky. Press forward.

  • :warning: Nuke your entire cluster: kops rolling-update cluster --yes --cloudonly --force (make sure you have backups! I'd also recommend shutting down any ingresses first so the system stops receiving requests)

  • The cluster will eventually come back up, but you'll notice that the output of things like kubectl -n kube-system get po doesn't reflect reality. This is because kube-controller-manager isn't able to come up, failing with an error like: failed to mark cidr[100.64.4.0/24] at idx [0] as occupied for node: i-abcdef0123456: cidr 100.64.4.0/24 is out the range of cluster cidr 10.244.0.0/16. You need to manually remove all the nodes except the new master node (don't worry, your nodes can't join the cluster yet anyway; see the sketch after this list). This will let kube-controller-manager start and try to sync the world back into sanity.

  • Now calico-node will enter a crash loop because the install-cni container is trying to connect to the old Kubernetes endpoint: 2023-08-06 19:17:52.822 [ERROR][1] cni-installer/<nil> <nil>: Unable to create token for CNI kubeconfig error=Post "https://100.64.0.1:443/api/v1/namespaces/kube-system/serviceaccounts/calico-node/token": x509: certificate is valid for 10.244.0.1, 127.0.0.1, not 100.64.0.1. This is because when you changed the Kubernetes cluster CIDR, the service cluster IPs did not change. Those are very sticky for some reason, and I don't know of any good way to "reset" the cluster IP of a service. You will need to manually edit the following services to proceed:

    • default/kubernetes
    • kube-system/kube-dns

    You're gonna have to do this by copying the manifest, deleting it via kubectl, and re-applying it with clusterIP and clusterIPs updated to point to the new CIDR (a rough sketch follows after this list).
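
A rough sketch of those last two steps (node names, file names, and the kube-dns example are placeholders; adjust for your cluster):

```sh
# Drop the stale node objects so kube-controller-manager can start
# (keep the new master/control-plane node).
kubectl get nodes
kubectl delete node <old-node-name>

# Re-create a service whose clusterIP is stuck in the old range, shown here
# for kube-system/kube-dns; repeat for default/kubernetes.
kubectl -n kube-system get svc kube-dns -o yaml > kube-dns-svc.yaml
# Edit kube-dns-svc.yaml: set clusterIP and clusterIPs to an address inside
# the new CIDR, and drop read-only fields such as resourceVersion and uid.
kubectl -n kube-system delete svc kube-dns
kubectl apply -f kube-dns-svc.yaml
```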

After all of that, your nodes will join your cluster and everything should start working again. One interesting thing to note: despite services pointing at a stale cluster CIDR, they will still work, because kube-proxy seems happy to route any IP. Just to be sure, I'd recommend going through kubectl get -A svc and fixing up all the services so they get a new IP within the new cluster CIDR. Good luck!
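
A quick way to spot the stragglers (a sketch; the 100.64 prefix is just the old kops default used as an example of the old range):

```sh
# List services whose CLUSTER-IP still falls in the old range.
kubectl get svc -A | grep ' 100\.'

# Each match can be deleted and re-applied like kube-dns above; if you omit
# clusterIP/clusterIPs from the manifest entirely, the API server assigns a
# fresh IP from the new service range.
```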

sin-ack · Aug 06 '23