packer-plugin-vsphere
Add support for using unique identifiers to select a network connection in environments where names can be ambiguous.
Community Note
Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request. Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request. If you are interested in working on this issue or have submitted a pull request, please leave a comment.
Description
NSX allows for creating port groups with the same name, even on the same virtual distributed switch. This plugin has a long history of issues with trying to create VMs on a vSphere cluster with such virtual distributed port groups managed by NSX overlays. VMware resolved this in govmomi 0.27 by allowing finder to use other unique identifiers to select a network:
// Network finds a NetworkReference using a Name, Inventory Path, ManagedObject ID, Logical Switch UUID or Segment ID.
// With standard vSphere networking, Portgroups cannot have the same name within the same network folder.
// With NSX, Portgroups can have the same name, even within the same Switch. In this case, using an inventory path
// results in a MultipleFoundError. A MOID, switch UUID or segment ID can be used instead, as both are unique.
// See also: https://kb.vmware.com/s/article/79872#Duplicate_names
// Examples:
// - Name: "dvpg-1"
// - Inventory Path: "vds-1/dvpg-1"
// - ManagedObject ID: "DistributedVirtualPortgroup:dvportgroup-53"
// - Logical Switch UUID: "da2a59b8-2450-4cb2-b5cc-79c4c1d2144c"
// - Segment ID: "/infra/segments/vnet_ce50e69b-1784-4a14-9206-ffd7f1f146f7"
To leverage this, I request the following:
- Update the minimum version of govmomi used by this plugin to at least 0.27.
- Possibly write additional network finder logic to allow users to use one or more of the other unique identifiers for the network, in addition to (or in place of) the network name or inventory path. Segment ID and UUID seem like reasonable choices here, as they are both fairly readily available from the vCenter UI.
I'm not a GoLang developer, so I'm probably not a great judge of how heavy a lift this would be, but it appears that this may be as simple as just changing the version of govmomi this plugin is built with. The plugin itself looks like it just passes the context and argument straight through to govmomi.Finder.
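To make the identifier forms from the govmomi comment above concrete, here is a minimal sketch that guesses which form a given network string takes. `classifyNetworkRef` and its labels are hypothetical and exist in neither govmomi nor this plugin; the patterns are heuristics inferred only from the examples quoted above.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// uuidPattern matches the canonical 8-4-4-4-12 hexadecimal UUID layout.
var uuidPattern = regexp.MustCompile(`^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$`)

// moidPattern matches "<Type>:<prefix>-<number>", for example
// "DistributedVirtualPortgroup:dvportgroup-53".
var moidPattern = regexp.MustCompile(`^[A-Za-z]+:[A-Za-z]+-\d+$`)

// classifyNetworkRef is a hypothetical helper (an assumption for
// illustration, not a govmomi or plugin API). It returns a label for
// the form the reference appears to take.
func classifyNetworkRef(ref string) string {
	switch {
	case strings.HasPrefix(ref, "/infra/segments/"):
		return "segment-id"
	case uuidPattern.MatchString(ref):
		return "logical-switch-uuid"
	case moidPattern.MatchString(ref):
		return "moid"
	case strings.Contains(ref, "/"):
		return "inventory-path"
	default:
		return "name"
	}
}

func main() {
	refs := []string{
		"dvpg-1",
		"vds-1/dvpg-1",
		"DistributedVirtualPortgroup:dvportgroup-53",
		"da2a59b8-2450-4cb2-b5cc-79c4c1d2144c",
		"/infra/segments/vnet_ce50e69b-1784-4a14-9206-ffd7f1f146f7",
	}
	for _, ref := range refs {
		fmt.Printf("%s => %s\n", ref, classifyNetworkRef(ref))
	}
}
```

A classifier like this is only useful for validation or error messages. The lookup itself would still be delegated to the govmomi finder, which decides how each string is resolved.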
Use Case(s)
Allow builder to be used in large vSphere environments that provide networking with NSX for scalability and mobility between clusters.
Note: The latest release of vmware/govmomi is 0.29.0.
Note: The aforementioned enhancement was released in vmware/govmomi v0.27.1, which included https://github.com/vmware/govmomi/commit/6209be5b5c0bd5d81078fdc82eb4001f202f90e7.
FWIW: I can't seem to hit on the right set of arguments to make any of these alternate unique identifiers work with govc either. My attempts to test by building this plugin using the latest version of govmomi have also failed (though, admittedly I don't really know what I'm doing when it comes to GoLang).
At a minimum, you'll need to download the source, and from the source tree run:
go get github.com/vmware/govmomi
go mod tidy
go build
Then copy the binary to your packer.d/plugins and then run your tests.
I've spoken with the maintainer about us updating to v0.29.0. Ideally, a dependency update like this should be done as an isolated chore(deps) pull request.
PR #240 for vmware/[email protected].
Note: hashicorp/[email protected] is now released and includes vmware/[email protected].
@tenthirtyam I saw that earlier today. Unfortunately, this plugin still doesn't seem to be able to find a network by a unique identifier other than its name.
Using the Segment ID:
2022/12/07 02:24:03 [INFO] (telemetry) Starting builder vsphere-iso.linux
2022/12/07 02:24:03 packer-plugin-vsphere_v1.1.1_x5.0_linux_amd64 plugin: 2022/12/07 02:24:03 No URLs were provided to Step Download. Continuing...
2022/12/07 02:24:03 packer-plugin-vsphere_v1.1.1_x5.0_linux_amd64 plugin: 2022/12/07 02:24:03 No CD files specified. CD disk will not be made.
2022/12/07 02:24:03 packer-plugin-vsphere_v1.1.1_x5.0_linux_amd64 plugin: 2022/12/07 02:24:03 No URLs were provided to Step Download. Continuing...
2022/12/07 02:24:03 packer-plugin-vsphere_v1.1.1_x5.0_linux_amd64 plugin: 2022/12/07 02:24:03 No CD files specified. CD disk will not be made.
2022/12/07 02:24:03 ui: ESC[1;32m==> vsphere-iso.linux: Creating VM...ESC[0m
2022/12/07 02:24:03 [INFO] (telemetry) ending vsphere-iso.linux
2022/12/07 02:24:03 ui error: ESC[1;31mBuild 'vsphere-iso.linux' errored after 432 milliseconds 468 microseconds: error creating vm: network '/infra/segments/977eab1d-1670-4b4e-9072-f71038385359' not foundESC[0m
2022/12/07 02:24:03 ui:
==> Wait completed after 435 milliseconds 270 microseconds
Using the MOID:
2022/12/07 02:38:58 [INFO] (telemetry) ending vsphere-iso.linux
2022/12/07 02:38:58 ui error: ESC[1;31mBuild 'vsphere-iso.linux' errored after 350 milliseconds 544 microseconds: error creating vm: network 'DistributedVirtualPortgroup:dvportgroup-16495' not foundESC[0m
I imagine the issue is somewhere in here:
https://github.com/hashicorp/packer-plugin-vsphere/blob/a0992c7396605b33492e7b9447569110e1bb7033/builder/vsphere/driver/network.go#L23-L47
Does this plugin need to pass some additional information to govmomi Finder.Network for this to work correctly?
What kind of additional information can I provide that will help running this down?
Based on a quick review, it looks like the plugin should call finder.networkByID when the network input is an ID rather than a name.
https://github.com/vmware/govmomi/blob/d99e99542ffe1e054b2da68fac48ee5ce2bd4987/find/finder.go#L823-L856
It looks to me like the finder.Network method already falls back to calling the finder.networkByID method:
https://github.com/vmware/govmomi/blob/17e669d84193839acdbebe6aed5aea26b1c65d48/find/finder.go#L804-L821
This raises some additional questions:
- Why isn't this working in my case?
- How can this project test it?
- Is this an issue with this plugin, or the underlying govmomi library?
That last question comes up because I can't get the search to work with govc either.
It may be a good idea to open a GitHub Discussion item on vmware/govmomi if it appears to also be an upstream concern. It can be converted to an issue if it is a bug.
Note: I pinged one of the vmware/govmomi maintainers, who has kindly commented below. 👇
Are you able to find the network with govc using:
% govc find / -type g -config.segmentId /infra/segments/seg_6e9bdde0-f9bf-4ee6-ac36-493627b6db32_0
/folder-WCP_DC/WCP_DC/network/seg-domain-c9:a97676f3-cf6d-42d7-875b-ae0bd0016e32-test-gc-e2e-demo-ns-0
If so and you add the -i flag, it will print the ManagedObject ID:
% govc find -i / -type g -config.segmentId /infra/segments/seg_6e9bdde0-f9bf-4ee6-ac36-493627b6db32_0
DistributedVirtualPortgroup:dvportgroup-71
Does using the MOID work with the plugin?
@dougm this query has the same issue as searching by name, that is to say it returns multiple results.
govc find / -type g -config.segmentId /infra/segments/b8f015a1-c281-4dfd-abbc-df0c88c5b2a4
/dsc1-w1-dc/network/dsc1-w1-a1-gcib-ix-10.109.248.24_29
/dsc1-w1-dc/network/dsc1-w1-a1-gcib-ix-10.109.248.24_29
/dsc1-w1-dc/network/dsc1-w1-a1-gcib-ix-10.109.248.24_29
With the -i flag, we can see that these each have different MOID values:
govc find -i / -type g -config.segmentId /infra/segments/b8f015a1-c281-4dfd-abbc-df0c88c5b2a4
DistributedVirtualPortgroup:dvportgroup-16348
DistributedVirtualPortgroup:dvportgroup-8278
DistributedVirtualPortgroup:dvportgroup-16476
My understanding based on the KB was that segmentId is unique; this is the first case I've seen where it isn't. I wonder what is unique (other than the MOID). I can take a look if you can share the output of:
% govc find -i / -type g -config.segmentId /infra/segments/b8f015a1-c281-4dfd-abbc-df0c88c5b2a4 | xargs -n1 govc object.collect -o -json
The error message in this comment is "not found":
network '/infra/segments/977eab1d-1670-4b4e-9072-f71038385359' not found
Based on your govc output, I'd expect the error to be "multiple" found. So I also wonder if the plugin here has govmomi with the networkByID fallback. You should be able to confirm by using one of the MOIDs (e.g. DistributedVirtualPortgroup:dvportgroup-16348).
The error message observed in the previous comment when using MOID DistributedVirtualPortgroup:dvportgroup-16348 was also "not found":
2022/12/07 02:38:58 [INFO] (telemetry) ending vsphere-iso.linux
2022/12/07 02:38:58 ui error: ESC[1;31mBuild 'vsphere-iso.linux' errored after 350 milliseconds 544 microseconds: error creating vm: network 'DistributedVirtualPortgroup:dvportgroup-16495' not foundESC[0m
I may be incorrect, but it might be because addNetwork is using findNetwork - which in turn calls FindNetworks that uses NetworkList
https://github.com/hashicorp/packer-plugin-vsphere/blob/324d9eb8b74d778bc3a97f9aff3931d69f5ab604/builder/vsphere/driver/vm.go#L948-L953
https://github.com/hashicorp/packer-plugin-vsphere/blob/324d9eb8b74d778bc3a97f9aff3931d69f5ab604/builder/vsphere/driver/vm.go#L977-L987
https://github.com/hashicorp/packer-plugin-vsphere/blob/324d9eb8b74d778bc3a97f9aff3931d69f5ab604/builder/vsphere/driver/network.go#L34-L47
I may be incorrect, but it might be because addNetwork is using findNetwork - which in turn calls FindNetworks that uses NetworkList
Yes, looks like that is the issue. We can change govmomi's NetworkList to do the networkByID fallback. Or the plugin could fall back to calling Network if the list fails.
Thanks Doug - appreciate the assist here. I'll work with the maintainer and get a fix in for this in the plugin to use the networkByID fallback.
I'm set up to test new plugin builds, if you guys can get me some PoC code.
I take it this is still backlogged?
I revisited this one this evening and ran some tests on the latest release (v1.2.7). I didn't have any issues using the MOIDs for port groups (e.g. "Network:network-18085") or distributed port groups (e.g. "DistributedVirtualPortgroup:dvportgroup-22077"), both of which had the same name and would error if just the name was used.
==> vsphere-iso.linux-photon: error creating virtual machine: path 'DHCP' resolves to multiple networks. please provide a host to match or the network full path
When using the MOIDs, the build is placed on the correct port group or distributed port groups without issue. I've not verified this with an NSX segment yet, but it should have the same results.
I was going to add the fallback, as seen below, but it appears not to be needed...
func (d *VCenterDriver) FindNetworks(name string) ([]*Network, error) {
	ns, err := d.finder.NetworkList(d.ctx, name)
	if err != nil || len(ns) == 0 {
		// The list lookup failed or was empty: fall back to a single
		// lookup, which can also resolve unique identifiers.
		n, err := d.finder.Network(d.ctx, name)
		if err != nil {
			return nil, err
		}
		return []*Network{
			{
				network: n,
				driver:  d,
			},
		}, nil
	}
	// Wrap every network returned by the list lookup.
	var networks []*Network
	for _, n := range ns {
		networks = append(networks, &Network{
			network: n,
			driver:  d,
		})
	}
	return networks, nil
}
Why? Because https://github.com/vmware/govmomi/pull/2626 (@dougm is awesome! 🎉) added the fallback (see https://github.com/vmware/govmomi/pull/2626/commits/bb4f739b451eefa1261f5c20df1ec7dc14621e8c#), which was included in v0.31.0 of vmware/govmomi and picked up in v1.2.3 of the plugin.
I'm going to close this issue. However, I will open a PR to update the duplicate-networks error message so that it suggests using the ID or path of the network, rather than only "a host to match or full path".
Ryan
@tenthirtyam I'd feel a lot better about this if it was tested with an NSX segment before closing this. I'll see if I can get a test in later today or on Monday.
To clarify: I'm @taylor-madeak, just created a separate GitHub account for work stuff (which this issue relates to).
I've successfully tested this with both the segment ID and the logical switch UUID using release v1.2.7 on VMware Cloud Foundation 5.1.1 BOM.
Ryan Johnson Distinguished Engineer, VMware by Broadcom
@tenthirtyam I'm still having some trouble getting a successful test for this in our VCF environment, where I'm not guaranteed to land on any one specific VM host in the cluster. Can you share which vsphere-iso source properties you're specifying when you test this feature? I'd like to verify that it's not just a template configuration issue on my part.
Is your use case to always use the same host and a specific network on that host?
The opposite, actually. My current template specifies server, datacenter, and cluster. I'd like to continue not caring which host I end up on and still be able to get a network. I'm not an expert with NSX, but it appears that the overlays end up being associated with VM hosts in vCenter. So, by not specifying a host to build on, the distributed portgroup MOID or segment ID I specify isn't found by Packer.
Hey! If you'd like to take a look at this live let me know. You can email me [email protected] and we can schedule some time to look at this.
Ryan Johnson VMware by Broadcom
@taylor-madeak - wanted to check in and see if you've had an opportunity to test with the latest. Please feel free to reach out at ryan.johnson [at] broadcom [dot] com if you would like to look at this live.
Ryan