cloud-provider-vsphere

Windows Node SystemUUID

Open rhockenbury opened this issue 5 years ago • 12 comments

Is this a BUG REPORT or FEATURE REQUEST?:

/kind feature

The out-of-tree VCP depends on having systemUUID set on the node object. The systemUUID property does not get set on Windows - https://github.com/kubernetes/kubernetes/issues/75978

This may prevent the Windows kubelet from starting when VCP is enabled. Although the vSphere CSI won't work on Windows, it would be nice to be able to run the out-of-tree VCP on Windows to have it set the topology tags.
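For anyone wanting to check whether their nodes are affected, here is a minimal sketch in Python. It assumes you have saved the output of `kubectl get nodes -o json` and only reads the `status.nodeInfo.systemUUID` field of each node. The function name and the sample data are made up for illustration and are not part of the VCP.

```python
import json


def nodes_missing_system_uuid(node_list):
    """Return the names of nodes whose status.nodeInfo.systemUUID is empty or absent.

    `node_list` is a dict shaped like the output of `kubectl get nodes -o json`.
    """
    missing = []
    for node in node_list.get("items", []):
        name = node.get("metadata", {}).get("name", "<unnamed>")
        node_info = node.get("status", {}).get("nodeInfo", {})
        system_uuid = node_info.get("systemUUID") or ""
        if not system_uuid.strip():
            missing.append(name)
    return missing


# Hypothetical sample data: node names and UUID are invented for illustration.
SAMPLE = json.loads("""
{
  "items": [
    {"metadata": {"name": "linux-node-1"},
     "status": {"nodeInfo": {"systemUUID": "420F2B6A-0000-0000-0000-000000000001"}}},
    {"metadata": {"name": "win-node-1"},
     "status": {"nodeInfo": {"systemUUID": ""}}}
  ]
}
""")

if __name__ == "__main__":
    for name in nodes_missing_system_uuid(SAMPLE):
        print(f"{name}: systemUUID is not set")
```

Against a real cluster, the same function could be fed `json.load()` on the saved `kubectl` output instead of `SAMPLE`.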

rhockenbury avatar Jun 15 '19 20:06 rhockenbury

@rhockenbury As soon as that is implemented in k/k for Windows, the vSphere CCM should automatically work without any code modification.

dvonthenen avatar Jun 18 '19 18:06 dvonthenen

Should be resolved with the 1.16 release - https://github.com/kubernetes/kubernetes/pull/80486

rhockenbury avatar Jul 24 '19 04:07 rhockenbury

/kind feature /lifecycle frozen

frapposelli avatar Aug 05 '19 14:08 frapposelli

Just a general FYI. The PR that implements Windows UUID was merged upstream... In theory, this should just work using the latest release of the CPI. Unfortunately, I have no means of testing this.

dvonthenen avatar Aug 29 '19 20:08 dvonthenen

This looks like a testing effort and currently we have no CI or testing environment with Windows.

@rhockenbury do you know if there's any Windows testing infra we can leverage for this?

frapposelli avatar Sep 04 '19 16:09 frapposelli

I suspect there is - I believe @benmoss had set up verification tests for windows nodes on vsphere. Hopefully, he can chime in on the state of this, what could be leveraged, and the level of effort.

On a related note, there's also been a lot of recent work on the windows csi-proxy which would enable running the vsphere-csi-driver on windows. Unsure of what (if any) test infra has been proposed for testing that.

Certainly feels like it could be beneficial for both sig-windows and sig-vmware to discuss what's needed going forward to run tests on windows for vsphere cloud providers (in-tree, out-of-tree), the vsphere-csi-driver, and the windows csi-proxy. @PatrickLang and @michmike are probably the best to help coordinate this effort.

rhockenbury avatar Sep 04 '19 17:09 rhockenbury

I have a CI system that deploys a cluster with a Windows node and runs conformance tests against it; the results are posted here. The CI isn't accessible from the public internet and isn't easily portable: it's a rather complicated setup that uses BOSH and kubo to deploy the cluster.

I'm open to ideas on how we could make something more portable, but I don't know much about how you currently test vSphere functionality.

benmoss avatar Sep 04 '19 19:09 benmoss

I have no idea what I am looking at other than a lot of red 🙃

A couple of things I would look for in the logs: that the node name and the internal/external hostnames and IPs are being set correctly and, if zones are being used, that the zone/region labels are being populated properly.
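The same checks can also be made against the node object rather than the logs. Below is a rough sketch in Python, assuming a single node dict taken from `kubectl get node <name> -o json`. The address types (`Hostname`, `InternalIP`, `ExternalIP`) and the zone/region label keys are the standard Kubernetes ones; `check_node` and the sample node are hypothetical, and whether an `ExternalIP` is expected depends on your environment.

```python
# Both the current and the older beta label keys are accepted.
ZONE_LABELS = (
    "topology.kubernetes.io/zone",
    "failure-domain.beta.kubernetes.io/zone",
)
REGION_LABELS = (
    "topology.kubernetes.io/region",
    "failure-domain.beta.kubernetes.io/region",
)


def check_node(node, expect_zones=False):
    """Return a list of problems found on one node object (empty list if none)."""
    problems = []
    metadata = node.get("metadata", {})
    if not metadata.get("name"):
        problems.append("node name is empty")

    # Collect the address types that carry a non-empty value.
    addresses = node.get("status", {}).get("addresses", [])
    present = {a.get("type") for a in addresses if a.get("address")}
    for wanted in ("Hostname", "InternalIP", "ExternalIP"):
        if wanted not in present:
            problems.append(f"missing {wanted} address")

    # Zone/region labels only matter when zones are configured.
    if expect_zones:
        labels = metadata.get("labels", {})
        if not any(labels.get(key) for key in ZONE_LABELS):
            problems.append("zone label not set")
        if not any(labels.get(key) for key in REGION_LABELS):
            problems.append("region label not set")
    return problems


# Hypothetical node: all values are invented for illustration.
SAMPLE_NODE = {
    "metadata": {
        "name": "win-node-1",
        "labels": {"failure-domain.beta.kubernetes.io/zone": "zone-a"},
    },
    "status": {
        "addresses": [
            {"type": "Hostname", "address": "win-node-1"},
            {"type": "InternalIP", "address": "10.0.0.12"},
            {"type": "ExternalIP", "address": "192.0.2.12"},
        ]
    },
}

if __name__ == "__main__":
    for problem in check_node(SAMPLE_NODE, expect_zones=True):
        print(problem)
```

Passing `expect_zones=False` skips the label checks for clusters that don't use zones.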

dvonthenen avatar Sep 06 '19 14:09 dvonthenen

@benmoss thanks for chiming in. Our current testing rig runs on vSphere on AWS (a.k.a. VMC), which VMware is sponsoring. It is triggered by Prow, and we have a set of presubmits and postsubmits that test the cloud provider functionality end-to-end.

If there is a way for us to trigger presubmit and postsubmit jobs on your infra, we would have a way to test changes on Windows. Do you think that's doable?

frapposelli avatar Sep 06 '19 14:09 frapposelli

Yeah, the Windows tests tend to be quite flaky in my experience. I haven't had time to debug the problems. I know others have recommended not running tests in parallel to avoid this flakiness, but that's never sat well with me. Even with that, the other builds from Microsoft and Google are still pretty flaky.

My setup isn't through Prow, it's a custom CI pipeline. It'd take some work to get it to build/deploy arbitrary branches/forks. I don't think that it's very sustainable to run it this way.

Kubeadm Windows support is going to be in alpha with Kubernetes 1.16; maybe we can figure out a way to use that as part of the Prow deploys.

benmoss avatar Sep 06 '19 15:09 benmoss

@benmoss was there any progress on getting the tests running through prow?

frapposelli avatar Oct 02 '19 16:10 frapposelli

No, I don't have the bandwidth for this right now. I know there is some talk of getting kubeadm test signal for Windows clusters, maybe we can piggyback on their work when it is complete: https://github.com/kubernetes-sigs/sig-windows-tools/issues/14

benmoss avatar Oct 02 '19 17:10 benmoss