terraform-provider-libvirt
Remove artifacts when pool or domain creation fails
When pool or domain creation fails, artifacts may be left behind that cause rerunning the Terraform plan against the same host to fail. This fixes a problem similar to the one described in https://github.com/dmacvicar/terraform-provider-libvirt/issues/739
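For illustration, here is a minimal sketch of the rollback pattern, written against the libvirt.org/go/libvirt bindings rather than the provider's actual internals; the defineAndStartDomain helper is hypothetical:

package cleanup

import (
	libvirt "libvirt.org/go/libvirt"
)

// defineAndStartDomain defines a domain from its XML and then starts it.
// If starting fails (for example with a qemu "Permission denied" error),
// the freshly defined domain is undefined again, so no "shut off"
// artifact is left behind to break the next terraform apply.
func defineAndStartDomain(conn *libvirt.Connect, xmlDef string) (*libvirt.Domain, error) {
	dom, err := conn.DomainDefineXML(xmlDef)
	if err != nil {
		return nil, err // nothing was defined, so there is nothing to clean up
	}
	if err := dom.Create(); err != nil {
		// Roll back the definition we just created; the Create error
		// is what the caller needs to see, so ignore cleanup errors.
		_ = dom.Undefine()
		_ = dom.Free()
		return nil, err
	}
	return dom, nil
}

The same pattern would apply to storage pools: undefine the pool if building or starting it fails partway through.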
System Information
Linux distribution
Ubuntu
Terraform version
$ terraform -v
Terraform v1.0.9
on linux_amd64
Provider and libvirt versions
$ terraform-provider-libvirt -version
0.6.11
What artifacts are you referring to?
Do you have an example or sample output of this situation?
The default qemu configuration on my laptop causes domain creation to fail with a Permission denied error. After fixing the configuration, terraform apply fails with:
╷
│ Error: Error defining libvirt domain: operation failed: domain 'consul_node_2' already exists with uuid 4f3dc245-706e-4065-b075-2a25a9383ee6
│
│ with module.consul-server.libvirt_domain.consul_node[2],
│ on ../modules/consul-libvirt/consul-server.tf line 68, in resource "libvirt_domain" "consul_node":
│ 68: resource "libvirt_domain" "consul_node" {
│
╵
╷
│ Error: Error defining libvirt domain: operation failed: domain 'consul_node_0' already exists with uuid cb944a87-3d95-415e-ac8c-f5a70cf4cb12
│
│ with module.consul-server.libvirt_domain.consul_node[0],
│ on ../modules/consul-libvirt/consul-server.tf line 68, in resource "libvirt_domain" "consul_node":
│ 68: resource "libvirt_domain" "consul_node" {
│
╵
╷
│ Error: Error defining libvirt domain: operation failed: domain 'consul_node_1' already exists with uuid 3c5c5033-d4af-4d2c-92dd-7a55b7b4c21c
│
│ with module.consul-server.libvirt_domain.consul_node[1],
│ on ../modules/consul-libvirt/consul-server.tf line 68, in resource "libvirt_domain" "consul_node":
│ 68: resource "libvirt_domain" "consul_node" {
│
╵
This is because the libvirt provider didn't clean up the domains that encountered permission errors during creation.
$ virsh list --all
Id Name State
--------------------------------
- consul_node_0 shut off
- consul_node_1 shut off
- consul_node_2 shut off
Running terraform destroy does not delete the domains because they were never added to the terraform state. Undefining the domains using virsh and then running terraform apply works as expected.
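For anyone else hitting this before the fix lands, the manual workaround is the one just described, using the domain names from the listing above:

$ virsh undefine consul_node_0
$ virsh undefine consul_node_1
$ virsh undefine consul_node_2
$ terraform apply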
So, with this logic, if somebody accidentally sets the same name as a running workload, creation will fail and we will both destroy and undefine that workload?
Name collision is detected earlier in the resource creation flow, when the XML is defined. This change only cleans up when creation fails, not definition, so I don't think there is any danger of destroying or undefining something that is not managed by the current Terraform config.
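To make the define-time guard concrete, here is a sketch using the same libvirt.org/go/libvirt bindings as above (again illustrative, not the provider's actual code):

package cleanup

import (
	libvirt "libvirt.org/go/libvirt"
)

// defineDomain shows where a name collision surfaces: DomainDefineXML
// itself fails (e.g. 'domain "consul_node_2" already exists with uuid
// ...'), so the rollback path never runs and a pre-existing workload
// can never be destroyed or undefined by the cleanup.
func defineDomain(conn *libvirt.Connect, xmlDef string) (*libvirt.Domain, error) {
	dom, err := conn.DomainDefineXML(xmlDef)
	if err != nil {
		return nil, err // collision detected here; nothing was created
	}
	return dom, nil
}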
I've just been hit by this as well: on a failure to attach my libvirt guest to a network, terraform bailed out but left behind the definition of the virtual machine, so when I ran terraform apply again it resulted in a name collision.