eks-anywhere
govc tasks should be internally optional
When running eksctl anywhere create on a system where the tags (e.g. os:bottlerocket) or the tag categories (e.g. os) already exist, the whole installation errors out. Rerunning it will succeed, though only after the user manually assigns the tags to the template so that the installer can proceed. Either the workflow should explicitly test whether the tags and categories exist and act accordingly, or the error handling should be improved so that the workflow can continue.
I think there will be some resistance to this. Are you asking that, if the OVA is already uploaded, EKS-A should just assume it is the correct OS and k8s type? I'm not sure how it can make that assumption.
An image upload command that also creates the tags might be handy, though.
Sorry if I didn't manage to be clear enough. I'm not talking about the OVA, I'm talking about the tags and tag categories themselves. On the vSphere side, these are independent objects that are created by themselves and then associated with e.g. templates like in this case. The problem at hand is that eksctl anywhere goes into this with the assumption that the tags (and categories) don't already exist. If they do, you get something like:
✅ Connected to server
✅ Authenticated to vSphere
✅ Datacenter validated
✅ Network validated
✅ Datastore validated
✅ Folder validated
✅ Resource pool validated
✅ Datastore validated
✅ Folder validated
✅ Resource pool validated
✅ Datastore validated
✅ Folder validated
✅ Resource pool validated
Creating template. This might take a while.
❌ Validation failed {"validation": "vsphere Provider setup is valid", "error": "failed creating category for tags: govc returned error when creating category eksdRelease: govc: 400 Bad Request: {"type":"com.vmware.vapi.std.errors.already_exists","value":{"error_type":"ALREADY_EXISTS","messages":[]}}\n", "remediation": ""}
Error: failed to create cluster: validations failed
The whole process then exits. The template itself has been imported, but it has no tags. The command isn't idempotent: if you run it again, the next run detects that the template is there but fails template validation because the template doesn't have any tags:
✅ Connected to server
✅ Authenticated to vSphere
✅ Datacenter validated
✅ Network validated
✅ Datastore validated
✅ Folder validated
✅ Resource pool validated
✅ Datastore validated
✅ Folder validated
✅ Resource pool validated
✅ Datastore validated
✅ Folder validated
✅ Resource pool validated
❌ Validation failed {"validation": "vsphere Provider setup is valid", "error": "template /Homelab/vm/Templates/bottlerocket-v1.21.2-kubernetes-1-21-eks-4-amd64-a440064 is missing tag os:bottlerocket", "remediation": ""}
Error: failed to create cluster: validations failed
or
✅ Connected to server
✅ Authenticated to vSphere
✅ Datacenter validated
✅ Network validated
✅ Datastore validated
✅ Folder validated
✅ Resource pool validated
✅ Datastore validated
✅ Folder validated
✅ Resource pool validated
✅ Datastore validated
✅ Folder validated
✅ Resource pool validated
❌ Validation failed {"validation": "vsphere Provider setup is valid", "error": "template /Homelab/vm/Templates/bottlerocket-v1.21.2-kubernetes-1-21-eks-4-amd64-a440064 is missing tag eksdRelease:kubernetes-1-21-eks-4", "remediation": ""}
Error: failed to create cluster: validations failed
The only way around this seems to be that once the first template creation has failed, you go to vCenter and manually assign the two (existing) tags, and then run the create cluster again.
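To illustrate the kind of error handling being asked for, here is a minimal sketch in Python (not how eks-anywhere is implemented). It treats the ALREADY_EXISTS response from vSphere as "nothing to do" and carries on to attach the tag. The helper names (govc, create_if_missing, ensure_tagged) are hypothetical; the govc subcommands tags.category.create, tags.create and tags.attach are real, but the exact arguments EKS-A passes to them are an assumption.

```python
import subprocess
from typing import Callable, List, Tuple

# A runner takes govc arguments and returns (exit code, combined output).
Runner = Callable[[List[str]], Tuple[int, str]]


def govc(args: List[str]) -> Tuple[int, str]:
    """Default runner: shell out to the govc binary on PATH."""
    proc = subprocess.run(["govc", *args], capture_output=True, text=True)
    return proc.returncode, proc.stdout + proc.stderr


def create_if_missing(run: Runner, args: List[str]) -> bool:
    """Run a govc create command, treating ALREADY_EXISTS as success.

    Returns True if the object was created, False if it already existed.
    Any other failure is raised to the caller.
    """
    code, out = run(args)
    if code == 0:
        return True
    if "ALREADY_EXISTS" in out:
        return False
    raise RuntimeError(f"govc {' '.join(args)} failed: {out.strip()}")


def ensure_tagged(run: Runner, template: str, category: str, tag: str) -> None:
    """Make sure the category and tag exist, then attach the tag to the template."""
    create_if_missing(run, ["tags.category.create", category])
    create_if_missing(run, ["tags.create", "-c", category, tag])
    code, out = run(["tags.attach", "-c", category, tag, template])
    if code != 0:
        raise RuntimeError(f"attaching {category}:{tag} failed: {out.strip()}")
```

With something along these lines, ensure_tagged(govc, template_path, "os", "bottlerocket") would behave the same whether or not the os category and the bottlerocket tag were already present in vCenter.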
This is actually a valid issue. Thanks for reporting this. We are looking into making govc calls more robust and easing the user experience.
The govc calls that eks-a makes should be more transactional; at the very least, sectioning them into waypoints during the auto-import/tagging process would help users re-run the command if any step fails. In this case, for example, the auto-import process failed while trying to create tags. We should either roll back by deleting the template that was imported, or tag the right template by referring to the manifest, instead of failing validation on the missing tag on subsequent runs.
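The rollback option could look roughly like the sketch below. It is a generic Python illustration of the idea, not eks-anywhere code: each step is paired with an undo action, and a failure unwinds the steps that already completed in reverse order. All names (Step, run_with_rollback, the step names in the usage note) are hypothetical.

```python
from typing import Callable, List, Tuple

# (name, action, undo): all three are placeholders supplied by the caller.
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_with_rollback(steps: List[Step]) -> List[str]:
    """Run steps in order; if one fails, undo the completed ones in reverse.

    Returns the names of the completed steps on success. On failure, performs
    a best-effort rollback and then re-raises the original error.
    """
    completed: List[Step] = []
    for step in steps:
        name, action, _ = step
        try:
            action()
        except Exception:
            for done_name, _, undo in reversed(completed):
                try:
                    undo()
                except Exception as undo_err:
                    # Best effort: report and keep unwinding the rest.
                    print(f"rollback of {done_name} failed: {undo_err}")
            raise
        completed.append(step)
    return [name for name, _, _ in completed]
```

For the failure in this issue, the steps would be something like ("import-template", import_fn, delete_template_fn) followed by ("create-tags", tag_fn, noop). If tagging fails, the imported template is removed again, so the next run starts from a clean state instead of finding an untagged template.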
This should be resolved when #344 merges, and the fix should be available shortly in the next release.
I'll leave this one open to represent adding a flag for power users to skip certain "preflights", similar to kubeadm.
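For reference, kubeadm exposes this as --ignore-preflight-errors, which takes a list of check names (or all) and downgrades their failures to warnings. A rough Python sketch of the same behaviour follows; the function name, the check names and the error format are all hypothetical, and no such flag is implied to exist in eksctl anywhere today.

```python
from typing import Callable, Dict, Iterable, List, Optional

# A check returns an error message on failure, or None on success.
Check = Callable[[], Optional[str]]


def run_preflights(checks: Dict[str, Check], ignore: Iterable[str] = ()) -> List[str]:
    """Run every check; ignored failures become warnings, the rest are fatal.

    Returns the list of warnings. Raises RuntimeError if any check that is
    not ignored fails. The special name "all" ignores every failure.
    """
    ignored = {name.lower() for name in ignore}
    warnings: List[str] = []
    errors: List[str] = []
    for name, check in checks.items():
        message = check()
        if message is None:
            continue
        if "all" in ignored or name.lower() in ignored:
            warnings.append(f"{name}: {message}")
        else:
            errors.append(f"{name}: {message}")
    if errors:
        raise RuntimeError("validations failed: " + "; ".join(errors))
    return warnings
```

A power user who knows the template is correct could then skip only the tag validation while keeping the connectivity and datastore checks enforced.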
❌ Validation failed {"validation": "vsphere Provider setup is valid", "error": "failed importing template into library: error importing template: govc: The import of library item ca7883df-361a-490a-bdd5-5c72b3c3ebe4 has failed. Reason: Error transferring file bottlerocket-v1.21.2-eks-d-1-21-6-eks-a-4-amd64.ova to ds:///vmfs/volumes/618a3174-5914274c-d16f-ac1f6b1048b8//contentlib-f8ba9922-cec0-402d-95fe-192510114fb7/ca7883df-361a-490a-bdd5-5c72b3c3ebe4/bottlerocket-v1.21.2-eks-d-1-21-6-eks-a-4-amd64_fbb35db4-9090-4bdc-b786-6a000830c64d.ova?serverId=9133c167-f659-49bb-beca-35bcf5c160ed. Reason: Error during transfer of ds:///vmfs/volumes/618a3174-5914274c-d16f-ac1f6b1048b8//contentlib-f8ba9922-cec0-402d-95fe-192510114fb7/ca7883df-361a-490a-bdd5-5c72b3c3ebe4/bottlerocket-v1.21.2-eks-d-1-21-6-eks-a-4-amd64_fbb35db4-9090-4bdc-b786-6a000830c64d.ova?serverId=9133c167-f659-49bb-beca-35bcf5c160ed: IO error during transfer of ds:/vmfs/volumes/618a3174-5914274c-d16f-ac1f6b1048b8/contentlib-f8ba9922-cec0-402d-95fe-192510114fb7/ca7883df-361a-490a-bdd5-5c72b3c3ebe4/bottlerocket-vmware-k8s-1.21-x86_64-1.3.0-395b459c-data_fbb35db4-9090-4bdc-b786-6a000830c64d.vmdk: Pipe closed.\n", "remediation": ""}
This error didn't go away until I manually created a VM folder named Templates.
❯ eksctl anywhere version
v0.6.0
vSphere version 7.0.3
There has been no activity on this issue for 60 days. Labeling as stale and closing in 7 days if no further activity.