cluster-api-provider-vsphere

Document how to use static-ips for workload clusters

Open omniproc opened this issue 4 years ago • 15 comments

Describe the solution you'd like VSphereMachineTemplate and VSphereMachine both seem to support providing static IP addresses. From older issues reported here it seems like a list of IPs could be provided to the VSphereMachineTemplate so that all MachineDeployments use one of those IPs until none is left. However, there is no documentation on how exactly to do this, and at least with the latest vSphere Provider v0.7.9 running on v1alpha3 I wasn't able to create a working configuration.

clusterctl returns VSphereMachineTemplate.infrastructure.cluster.x-k8s.io "capi" is invalid: spec.template.spec.network.devices.ipAddrs: Forbidden: cannot be set in templates when trying to set ipAddrs, and it's unclear how to edit the YAML generated by clusterctl so that machines are deployed using static IPs.
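For illustration, this is roughly the kind of template fragment that triggers the error (the object name comes from the error message; the network name and address below are made-up examples):

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: VSphereMachineTemplate
metadata:
  name: capi
spec:
  template:
    spec:
      network:
        devices:
          - networkName: lan          # example network name
            dhcp4: false
            dhcp6: false
            ipAddrs:                  # <- rejected: "cannot be set in templates"
              - 172.16.0.26/24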

Environment:

  • Cluster-api-provider-vsphere version: 0.7.9
  • Kubernetes version: 1.21.1
  • OS: Ubuntu 20.04.2 LTS

omniproc avatar Jul 16 '21 16:07 omniproc

Hm, ok. So besides the missing documentation I believe this is a bug.

https://github.com/kubernetes-sigs/cluster-api-provider-vsphere/blob/v0.7.9/api/v1alpha3/vspheremachinetemplate_webhook.go#L51-L54

Browsing through older issues, it seems like this feature was already working some time ago, so I'm not sure whether this is a new bug or a missing feature.

omniproc avatar Jul 16 '21 18:07 omniproc

After some further investigation, it seems there's currently no way to specify static IPs for the machines directly in the manifest used to deploy a workload cluster. It looks more like a current feature limitation (see the validation webhook linked above).

So in order to use static IPs, the only method I was able to find so far is:

  • Use a VSphereMachineTemplate and set dhcp4 and dhcp6 to false
  • This will cause any VSphereMachine CRD created from this template to wait until the VSphereMachine has its ipAddrs set
  • Use kubectl patch to edit the created VSphereMachine. Once it has its ipAddrs, the KubeadmControlPlane controller VM will be deployed in vSphere

It doesn't seem to be possible to create the VSphereMachine CRDs directly and tell the KubeadmControlPlane to use those objects. Instead, you have to provide a VSphereMachineTemplate as infrastructureTemplate in the KubeadmControlPlane, which comes with the limitations mentioned above.
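For reference, a minimal sketch of such a VSphereMachineTemplate with DHCP disabled (the network name is an example and other clone fields such as template, datastore etc. are omitted); it's then referenced as infrastructureTemplate in the KubeadmControlPlane as usual:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: VSphereMachineTemplate
metadata:
  name: cls01
spec:
  template:
    spec:
      network:
        devices:
          - networkName: lan   # example network name
            dhcp4: false       # no DHCP: each VSphereMachine waits until ipAddrs is patched in
            dhcp6: false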

So for now I'm using the following to somewhat automate patching the CRDs to my needs:

# Declare the cluster name
declare cluster=cls01
# Declare clusterip and node ips
declare clusterip="172.16.0.30/32"
declare -a ips=("172.16.0.26/24" "172.16.0.27/24" "172.16.0.28/24" "172.16.0.29/24")

# Select controller VM
cvm=$(kubectl get VSphereMachine -l cluster.x-k8s.io/cluster-name=$cluster -o json | jq -r '.items[] | select(.metadata.ownerReferences[] | select(.kind=="KubeadmControlPlane")) | .metadata.name')
uid=$(kubectl get VSphereMachine $cvm --template '{{.metadata.uid}}{{"\n"}}')
echo $cvm

# Patch the controller VM to have a node ip and the clusterip
kubectl patch VSphereMachine $cvm --type=merge -p '{"spec":{"network":{"devices":[{"networkName": "lan", "gateway4": "172.16.0.1", "nameservers":["172.16.0.1"], "ipAddrs": ["'${ips[0]}'","'$clusterip'"]}]}}}'

# Get a list of all worker VMs
readarray -t vms < <(kubectl get VSphereMachine -l cluster.x-k8s.io/cluster-name=$cluster -o json | jq -r --arg uid "$uid" '.items[] | select(.metadata.uid!=$uid) | .metadata.name')

# Patch the worker VMs to have a node IP each
count=1
for i in "${vms[@]}"
do
   kubectl patch VSphereMachine $i --type=merge -p '{"spec":{"network":{"devices":[{"networkName": "lan","gateway4": "172.16.0.1", "nameservers":["172.16.0.1"], "ipAddrs": ["'${ips[$count]}'"]}]}}}'
   (( count++ ))
done

I'd still leave this flagged as a bug since the VSphereMachineTemplate definition clearly states that ipAddrs is a valid attribute but, as shown in the linked code above, the current vSphere Provider for Cluster API simply rejects it with an error.

omniproc avatar Jul 16 '21 20:07 omniproc

@yastij I am also following this thread with interest and noticed you had assigned the issue to yourself for follow-up. Are you able to confirm/comment on the findings of @omniproc? Is there any method we are missing, or is static IP support for workload clusters not present at this time? Thanks!

nwoodmsft avatar Aug 05 '21 00:08 nwoodmsft

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Nov 03 '21 00:11 k8s-triage-robot

/lifecycle frozen

omniproc avatar Nov 03 '21 11:11 omniproc

/help Now that the bug has been fixed, this can be something that can be picked up. @omniproc something you would wanna take a stab at?

srm09 avatar Jan 31 '22 00:01 srm09

@srm09: This request has been marked as needing help from a contributor.

Guidelines

Please ensure that the issue body includes answers to the following questions:

  • Why are we solving this issue?
  • To address this issue, are there any code changes? If there are code changes, what needs to be done in the code and what places can the assignee treat as reference points?
  • Does this issue have zero to low barrier of entry?
  • How can the assignee reach out to you for help?

For more details on the requirements of such an issue, please see here and ensure that they are met.

If this request no longer meets these requirements, the label can be removed by commenting with the /remove-help command.

In response to this:

/help Now that the bug has been fixed, this can be something that can be picked up. @omniproc something you would wanna take a stab at?

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Jan 31 '22 00:01 k8s-ci-robot

/kind documentation /remove-kind bug

srm09 avatar Jan 31 '22 00:01 srm09

@srm09 I can write a docs draft for the quickstart.md on how to assign static IPs if that's what you're asking.

omniproc avatar Jan 31 '22 09:01 omniproc

That would be very helpful.😃😃

srm09 avatar Jan 31 '22 19:01 srm09

Sure. I'll schedule some time for it this weekend. Will link the PR to this issue when done.

omniproc avatar Jan 31 '22 21:01 omniproc

/unassign @yastij /assign @omniproc

srm09 avatar Jan 31 '22 22:01 srm09

Before writing a PR doc, could someone please provide a quick example here? :pray:

EDIT:

  • https://github.com/kubernetes-sigs/cluster-api-provider-vsphere/issues/1215#issuecomment-1025266432 said the bug is fixed, but which bug?
  • is there another way to do this than the workaround in https://github.com/kubernetes-sigs/cluster-api-provider-vsphere/issues/1215#issuecomment-881689524?

EDIT2:

Could preKubeadmCommands be used?

sathieu avatar Apr 02 '22 06:04 sathieu

@sathieu I'm not sure, but I doubt it, since preKubeadmCommands are executed after the node is reachable (and thus already has an IP). As far as I know, the management cluster communicates the IPs to vSphere before preKubeadmCommands are executed on the workload cluster (but as I said, I'm not sure - I would need to test).

You might want to take a look at https://github.com/spectrocloud/cluster-api-provider-vsphere-static-ip or https://github.com/telekom/das-schiff/tree/ipam/ipam, which provide custom controllers for an IPAM solution. It's the better approach and only requires you to deploy a custom controller on your management cluster. Under the hood pretty much the same thing happens as shown in the bash snippet above.

omniproc avatar Apr 02 '22 12:04 omniproc

Thanks @omniproc, I'll use the hack short term, then set up a DHCP server.

sathieu avatar Apr 06 '22 09:04 sathieu