vsphere ops image create is unreliable
I've discovered that ops image create for vsphere can be unreliable. The FileUpload and Copy steps in CreateImage fail some 30-50% of the time. They are currently not checked for errors, but since they can fail, it would be nice to have error messages for them. /var/log/hostd.log doesn't give much clarity on the errors either. I added error checks, and these are the errors I get (rather randomly):
- The Copy step can fail with "Invalid datastore path '/vmfs/volumes/5fa1863a-10f1d078-3add-0010e0236ce4/webg/webg2.vmdk'" even though that appears to be a valid destination path (webg2.vmdk doesn't exist yet, though)
- The Upload step for the flat file can fail with "404 Not Found" (only the directory gets created)
- Occasionally I see a third error about the file being too large, but I wasn't able to reproduce it to capture the exact message.
I'm seeing this on a standalone ESXi server, 6.7.0 Update 3 (Build 15160138), no vCenter. The errors seem to be ESXi's fault, but it would be nice to make this more robust if possible.
I have noticed similar errors. I think the problem could be with multiple hosts via vCenter. For example, when it starts an instance, console.log could be on any one of the available hosts, and when you then run the logs command, it errors with a 500 because it might be checking a different host. As a workaround, I can access the logs by using govc datastore.tail and exporting GOVC_HOST. Also, it doesn't delete instance_1/console.log from the datastore when we delete the instance, so the next launch creates instance_2/console.log. There is some work needed here to have clean