terraform-provider-vcd
Destroy of VM is powering off vApp needlessly (causes instability with multiple VMs)
Terraform Version
Terraform v0.11.7
Affected Resource(s)
- vcd_vapp_vm
- vcd_vapp
Terraform Configuration Files
Define any configuration with a vApp and two VMs.
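For example, a minimal sketch (Terraform 0.11 syntax; the catalog and template names are placeholders, and the exact vcd_vapp_vm arguments may differ by provider version):

resource "vcd_vapp" "TerraformVApp" {
  name = "TerraformVApp"
}

resource "vcd_vapp_vm" "TerraformVM7" {
  vapp_name     = "${vcd_vapp.TerraformVApp.name}"
  name          = "TerraformVM7"
  catalog_name  = "my-catalog"   # placeholder
  template_name = "my-template"  # placeholder
  memory        = 1024
  cpus          = 1
}

resource "vcd_vapp_vm" "TerraformVM9" {
  vapp_name     = "${vcd_vapp.TerraformVApp.name}"
  name          = "TerraformVM9"
  catalog_name  = "my-catalog"   # placeholder
  template_name = "my-template"  # placeholder
  memory        = 1024
  cpus          = 1
}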
Error Output
vcd_vapp_vm.TerraformVM9: Still destroying... (ID: TerraformVM9, 8m10s elapsed)
Error: Error applying plan:
2 error(s) occurred:
* vcd_vapp_vm.TerraformVM7 (destroy): 1 error(s) occurred:
* vcd_vapp_vm.TerraformVM7: error Undeploying vApp: &errors.errorString{s:"error Undeploying: &errors.errorString{s:\"error undeploy vApp: API Error: 400: [ 1b7736da-b79b-4835-b798-7ab95f526778 ] The requested operation could not be executed since vApp \\\"TerraformVApp\\\" is not running.\"}"}
* vcd_vapp_vm.TerraformVM9 (destroy): 1 error(s) occurred:
Steps to Reproduce
- Apply a configuration with a vApp containing two VMs.
- Open the HTML5 tenant UI for the given org.
- Destroy the configuration and watch the Recent Tasks section of the tenant UI.
Expected Behavior
When destroying VMs, the vApp's power state should be left as is.
Actual Behavior
As can be seen in the HTML5 tenant UI, the destruction of each VM in a vApp first powers off the vApp and then powers it back on. First of all, this is needless: a VM can be deleted from a running vApp with no issues. Secondly, it leads to all kinds of instability, such as the error in the output above.
Important Factoids
Though we have ideas about making all operations on a vApp run serially, since the underlying vApp apparently struggles to handle parallel operations, I feel we should start by removing this obviously needless power off/on cycling during VM deletion, as it may be an easy fix that improves stability substantially.
CC: @Didainius
Another example, where destroy failed with what seems to be an out-of-place error:
vcd_vapp_vm.TerraformVM8: Still destroying... (ID: TerraformVM8, 1m20s elapsed)
vcd_vapp_vm.TerraformVM7: Still destroying... (ID: TerraformVM7, 1m10s elapsed)
vcd_vapp_vm.TerraformVM7: Destruction complete after 1m12s
vcd_independent_disk.TerraformDisk: Destroying... (ID: tf-disk)
vcd_vapp_vm.TerraformVM8: Still destroying... (ID: TerraformVM8, 1m30s elapsed)
vcd_independent_disk.TerraformDisk: Destruction complete after 5s
vcd_vapp_vm.TerraformVM8: Still destroying... (ID: TerraformVM8, 1m40s elapsed)
vcd_vapp_vm.TerraformVM8: Still destroying... (ID: TerraformVM8, 1m50s elapsed)
...
vcd_vapp_vm.TerraformVM8: Still destroying... (ID: TerraformVM8, 6m0s elapsed)
vcd_vapp_vm.TerraformVM8: Still destroying... (ID: TerraformVM8, 6m10s elapsed)
vcd_vapp_vm.TerraformVM8: Still destroying... (ID: TerraformVM8, 6m20s elapsed)
Error: Error applying plan:
1 error(s) occurred:
* vcd_vapp_vm.TerraformVM8 (destroy): 1 error(s) occurred:
* vcd_vapp_vm.TerraformVM8: error deleting: &errors.errorString{s:"error instantiating a new vApp: API Error: 400: [ 4fe5a8a3-380c-4bc7-b5d2-cc7ed2c373d0 ] The requested operation could not be executed on VM \"TerraformVM8\". Stop the VM and try again."}
Looking at this issue from the VM standpoint:
(1) VMs deployed.
(2) Removal starts.
(3) All VMs are powered off.
(4) One VM removed. The other two are needlessly powered on.
(5) All VMs are powered off again.
(6) One VM removed.
(7) The last VM is powered off.
(8) The last VM is powered on again before being removed.
With today's improvement to handling parallel calls with mutexes (#255), the speed impact of these power on/off calls during destroy becomes really obvious. In effect, VM creation may even be faster than destruction! See the snippet below.
apply speed:
vcd_vapp_vm.tf_vm7: Creation complete after 1m37s [id=TerraformVM7]
vcd_vapp_vm.tf_vm9: Creation complete after 3m26s [id=TerraformVM9]
destroy speed:
vcd_vapp_vm.tf_vm7: Destruction complete after 5m31s
vcd_vapp_vm.tf_vm9: Destruction complete after 3m8s