terraform-provider-vsphere
`vsphere_vm_storage_policy` not deleted after `terraform destroy`
Terraform Version
v1.0.0
vSphere Provider Version
v2.0.1
Affected Resource(s)
vsphere_vm_storage_policy
Terraform Configuration Files
provider "vsphere" {
user = var.vsphere_user
password = var.vsphere_password
vsphere_server = var.vsphere_server
allow_unverified_ssl = true
}
data "vsphere_tag_category" "policy_category" {
name = var.storage_policy_tag_category
}
data "vsphere_tag" "policy_tag_include" {
name = var.policy_tag
category_id = data.vsphere_tag_category.policy_category.id
}
resource "vsphere_vm_storage_policy" "policy_tag_based_placement" {
name = "kube_test"
description = "This storage policy is managed by Terraform. It's used for the vSphere CSI StorageClass (in Kubernetes) for Persistent Volumes"
tag_rules {
tag_category = data.vsphere_tag_category.policy_category.name
tags = [ data.vsphere_tag.policy_tag_include.name ]
include_datastores_with_tags = true
}
}
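For reference, the configuration above references five input variables. A minimal set of declarations inferred from those references would look like this (the types and the `sensitive` flag are assumptions, not taken from the original report):

```hcl
# Illustrative declarations for the variables referenced above.
variable "vsphere_user" {
  type = string
}

variable "vsphere_password" {
  type      = string
  sensitive = true
}

variable "vsphere_server" {
  type = string
}

variable "storage_policy_tag_category" {
  type = string
}

variable "policy_tag" {
  type = string
}
```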
Debug Output
https://gist.github.com/Calvinaud/2ab056d3f8a102b585b403ee419b8450
Panic Output
No panic output.
Expected Behavior
The terraform destroy should fail with an error since the storage policy is still in use (or it should delete the policy).
When you try to remove the policy manually in vCenter, you get the error message:
Delete VM Storage Policy failed!
The resource 'xxxx-xxx-...' is in use.
Actual Behavior
The terraform destroy executes without errors but does not delete the storage policy.
It also removes the storage policy from the tfstate even though the resource still exists.
Steps to Reproduce
- `terraform apply` to create the storage policy
- In a Kubernetes cluster, install vSphere CPI and vSphere CSI
- Create a StorageClass in Kubernetes that uses the storage policy (see the sketch after this list)
- Create a PVC in Kubernetes with the StorageClass
- `terraform destroy` to try destroying the storage policy
(It is also possible to reproduce without Kubernetes by creating a resource outside of Terraform that uses the storage policy created here.)
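To make the StorageClass and PVC steps concrete, here is a minimal sketch using the Terraform kubernetes provider (plain YAML manifests work the same way). It assumes a configured `kubernetes` provider, a cluster running the vSphere CSI driver under its default provisioner name `csi.vsphere.vmware.com`, and the driver's `storagepolicyname` parameter to select the policy created above; the resource names and sizes are placeholders.

```hcl
# Sketch only: a StorageClass that points the vSphere CSI driver at the
# Terraform-managed storage policy, plus a PVC that consumes it.
resource "kubernetes_storage_class" "vsphere_policy" {
  metadata {
    name = "vsphere-kube-test"
  }

  storage_provisioner = "csi.vsphere.vmware.com"
  reclaim_policy      = "Delete"

  parameters = {
    # Must match vsphere_vm_storage_policy.policy_tag_based_placement.name.
    storagepolicyname = "kube_test"
  }
}

resource "kubernetes_persistent_volume_claim" "example" {
  metadata {
    name = "kube-test-pvc"
  }

  spec {
    access_modes       = ["ReadWriteOnce"]
    storage_class_name = kubernetes_storage_class.vsphere_policy.metadata[0].name

    resources {
      requests = {
        storage = "1Gi"
      }
    }
  }
}
```

Once the PVC is bound, the CSI driver associates the provisioned volume with the storage policy, which is the out-of-band usage that later blocks the policy from being deleted.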
Important Factoids
- vSphere CSI version: v2.1.1
- Kubernetes version: 1.19.7 and 1.20.7 (I was able to reproduce in both versions)
- vSphere version: 6.7u3
Community Note
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
- Please do not leave "+1" or other comments that do not add relevant new information or questions; they generate extra noise for issue followers and do not help prioritize the request
- If you are interested in working on this issue or have submitted a pull request, please leave a comment
Hi @Calvinaud, I suspect that this would be the expected behavior of the provider.
When you run a destroy on the plan, you are requesting that the resource be removed from vSphere (if possible) and therefore removed from the state. In this instance, since the storage policy is in use by another object (perhaps outside of this plan, in another plan, or elsewhere), the removal of the remaining resources proceeds, along with the removal of the policy from the state, to reach the desired state. However, it may be better for Terraform to log a warning when the object is in use by external entities.
@appilon, could you confirm if this is the expected behavior?
Ryan
Naively, I would expect the provider to error if it was unable to delete something upstream (I would expect to get an error back from the vSphere API/govmomi) and have that bubble up on destroy; and yes, it would then not be a successful destroy and the resource would remain in state. I will have to look into the resource code.
However, @Calvinaud, creating something with Terraform and then attaching it to something out of band that prevents Terraform from having authority over it going forward isn't an ideal practice. I suspect that if I did fix a silent failure, what will happen is that you will have to delete the object outside of Terraform's scope and then run terraform destroy again, and possibly run terraform state rm to manually remove it from state (depending on the resource code, which I need to look into). Worth a follow-up.
Hi @appilon and @tenthirtyam, thanks for the response.
I agree that Terraform losing authority over a resource is not ideal. But in our case the binding is done by something well outside the scope of Terraform/the infrastructure, and we can do nothing about it. (If you have an idea to work around it, I will gladly take it, but that's not the subject of this issue.)
That is why, in our procedure, we normally need to uninstall the Kubernetes cluster before running the terraform destroy.
The main problem is the possible divergence between reality and the state. The most blocking problem in our use case is when we try to recreate the infrastructure: it crashes because it tries to recreate a resource that already exists.
A question out of pure curiosity: Do you think it should panic during the planning phase of the destroy or still destroy the other resources and just keep this one in the state?
Have a nice day
re:
Do you think it should panic during the planning phase of the destroy or still destroy the other resources and just keep this one in the state?
I agree with Alex's points here in the thread.
It certainly should not panic during a terraform plan. It should certainly destroy what it is able to control based on state, provided the resource is not used by another resource (one defined by Terraform or natively in vSphere). Ideally, it would be great if the plan could detect that the resource is in use by "other forces" and offer the option to continue, removing it only from state.
Ryan
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.