Replaced resource does not create when old dependency fails to destroy
Terraform Version
$ terraform version
Terraform v1.9.2
on darwin_arm64
Terraform Configuration Files
resource "terraform_data" "new_root" { count = 1 }
resource "terraform_data" "old_root" {
count = 1
provisioner "local-exec" {
when = destroy
command = "false"
}
}
locals { target = try(terraform_data.old_root[0].id, null) }
resource "terraform_data" "child" {
input = local.target
triggers_replace = [local.target]
}
Debug Output
Terraform Apply 1 - Create necessary resource configuration
Changes to make:
- Change terraform_data.old_root count to 0 to simulate a resource destroy action
- Change the local.target value to terraform_data.new_root to simulate re-rooting the dependency
diff --git a/main.tf b/main.tf
index b36a2bf..59c41ad 100644
--- a/main.tf
+++ b/main.tf
@@ -1,14 +1,14 @@
resource "terraform_data" "new_root" { count = 1 }
resource "terraform_data" "old_root" {
- count = 1
+ count = 0
provisioner "local-exec" {
when = destroy
command = "false"
}
}
-locals { target = try(terraform_data.old_root[0].id, null) }
+locals { target = try(terraform_data.new_root[0].id, null) }
resource "terraform_data" "child" {
input = local.target
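For reference, the full configuration for the second apply, reassembled from the diff above (the diff is truncated after the input line; the trailing triggers_replace line is carried over unchanged from the original configuration):

resource "terraform_data" "new_root" { count = 1 }
resource "terraform_data" "old_root" {
  count = 0
  provisioner "local-exec" {
    when    = destroy
    command = "false"
  }
}
locals { target = try(terraform_data.new_root[0].id, null) }
resource "terraform_data" "child" {
  input            = local.target
  triggers_replace = [local.target]
}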
Terraform Apply 2 - Demonstrate no create action when old dependency fails to destroy
Terraform Plan(ish) output
$ terraform apply
terraform_data.old_root[0]: Refreshing state... [id=be057839-2c13-f80d-ebe9-2202ccea391a]
terraform_data.new_root[0]: Refreshing state... [id=3b777302-0856-c378-1d12-d852ca6734e4]
terraform_data.child: Refreshing state... [id=390942a4-b664-9947-142f-eb3260a413c3]
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
- destroy
-/+ destroy and then create replacement
Terraform will perform the following actions:
  # terraform_data.child must be replaced
-/+ resource "terraform_data" "child" {
      ~ id               = "390942a4-b664-9947-142f-eb3260a413c3" -> (known after apply)
      ~ input            = "be057839-2c13-f80d-ebe9-2202ccea391a" -> "3b777302-0856-c378-1d12-d852ca6734e4"
      ~ output           = "be057839-2c13-f80d-ebe9-2202ccea391a" -> (known after apply)
      ~ triggers_replace = [
          ~ "be057839-2c13-f80d-ebe9-2202ccea391a" -> "3b777302-0856-c378-1d12-d852ca6734e4",
        ]
    }

  # terraform_data.old_root[0] will be destroyed
  # (because index [0] is out of range for count)
  - resource "terraform_data" "old_root" {
      - id = "be057839-2c13-f80d-ebe9-2202ccea391a" -> null
    }
Plan: 1 to add, 0 to change, 2 to destroy.
Do you want to perform these actions?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.
Enter a value: yes
Terraform Apply output
terraform_data.child: Destroying... [id=390942a4-b664-9947-142f-eb3260a413c3]
terraform_data.child: Destruction complete after 0s
terraform_data.old_root[0]: Destroying... [id=be057839-2c13-f80d-ebe9-2202ccea391a]
terraform_data.old_root[0]: Provisioning with 'local-exec'...
terraform_data.old_root[0] (local-exec): Executing: ["/bin/sh" "-c" "false"]
╷
│ Error: local-exec provisioner error
│
│ with terraform_data.old_root[0],
│ on main.tf line 4, in resource "terraform_data" "old_root":
│ 4: provisioner "local-exec" {
│
│ Error running command 'false': exit status 1. Output:
╵
Expected Behavior
terraform_data.child depended on terraform_data.old_root, but is switching to depend on terraform_data.new_root.
To me, and from a DAG perspective, I would expect terraform_data.child to be destroyed and then created, regardless of what happens to terraform_data.old_root.
Actual Behavior
terraform_data.child is destroyed, which is fine, but it is never recreated because the destroy of its old dependency fails.
Steps to Reproduce
See the gists and diff deltas as outlined in the Debug Output section.
Additional Context
This is a dumbed-down example of what actually happened. In reality, we experienced this with AWS Route53 zones and records, and it caused a temporary DNS outage. The root cause of the deletion error was IAM permission limitations.
While the IAM limitations are the root issue and need to be solved, I think this also demonstrates a legitimate issue in Terraform's DAG dependency management. Or is there a specific reason for choosing not to (re)create terraform_data.child when its old dependency fails?
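For illustration only, a rough, hypothetical analog of the Route53 shape described above (all names and values here are made up, not our actual configuration):

resource "aws_route53_zone" "new_root" {
  name = "example.com"
}

resource "aws_route53_zone" "old_root" {
  # count was 1; set to 0 to remove the zone. The destroy then fails
  # due to missing IAM permissions (the real-world root cause).
  count = 0
  name  = "example.com"
}

locals {
  # Re-rooted from aws_route53_zone.old_root[0].zone_id to the new zone.
  zone_id = aws_route53_zone.new_root.zone_id
}

resource "aws_route53_record" "child" {
  zone_id = local.zone_id # a zone_id change forces a replace
  name    = "www.example.com"
  type    = "A"
  ttl     = 300
  records = ["192.0.2.1"]
}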
References
I looked for a while, but I don't know how to correctly express this situation, so I wasn't able to find a related issue, though one may be out there. If so, apologies for the duplicate.
Hi @KetchupBomb, thanks for filing this! And, thanks for the easy-to-reproduce example configuration.
One thing to note is the create_before_destroy lifecycle option:
resource "terraform_data" "child" {
// attributes...
lifecycle {
create_before_destroy = true
}
}
This will make Terraform create the new resource before destroying the old one. I just wanted to share the create_before_destroy attribute, as I think it will help you avoid outages in the future. I will investigate the reasoning behind the behaviour you've highlighted, but I suspect there is some technical reason why the ordering happens the way it does. I think it's likely that the create_before_destroy attribute was introduced because of the destroy ordering behaviour you've highlighted here.
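Applied to the reproduction config, that suggestion would look something like the sketch below (attribute values carried over from the original example; with create_before_destroy, the replacement child is created before the old child, and the failing old_root destroy, are attempted):

resource "terraform_data" "child" {
  input            = local.target
  triggers_replace = [local.target]

  lifecycle {
    # Create the replacement instance first, so a failure while
    # destroying old_root no longer leaves us without a child.
    create_before_destroy = true
  }
}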
Thanks again!
Thanks for the ACK, @liamcervante. I'm aware of create_before_destroy, and use it in places I know it's needed -- like certain operations on the AWS API which require something to always exist, etc.
I was more concerned with understanding whether the general-case behavior is purposeful. I generally want to follow default workflows (destroy-then-create, in Terraform's case), and using create_before_destroy deviates from that default. There are likely countless situations where Terraform's default seems like it should work, but it won't, given the behavior above.
But as long as you see the root argument I've outlined, I leave it to you guys to determine if it's purposeful or if it's a bug. I'd like to subscribe to the answer, though, so if you're willing to share when you find out, I would appreciate it. 🙏
Hi @KetchupBomb,
The order of operations you see here is working as designed, though it is a bit of an awkward case to handle. Destroy actions are still strictly ordered according to the dependencies recorded during the last apply operation, so as far as Terraform is concerned, it must try to delete old_root before replacing child. This is also meant to remain consistent with the ordering if child were updated rather than replaced: the action taken on the final resource is meant to happen after the destroy has completed.
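For contrast, a sketch of the update case mentioned above (a hypothetical variant of the reproduction config with triggers_replace removed, so re-rooting local.target becomes an in-place update rather than a replace; per the ordering described here, that update still runs after the old_root destroy):

resource "terraform_data" "child" {
  # Without triggers_replace, changing local.target only updates
  # input in place; the update is still ordered after the destroy
  # of the previously recorded dependency, old_root.
  input = local.target
}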
I do think a case could be made that the change to the config should break the old dependency (which it already will, but only after the first failed apply, once the new dependencies are stored). It is highly unlikely that two resource types are so tightly coupled that creating the new child could not proceed while the referenced root resource still exists. Where this does matter, though, is when resources downstream of the failed destroy are also depended on by child. In that case, however, the existing dependency rules would still block Terraform from proceeding all the way to the child create step, so I don't think we would need to worry about breaking configurations.
To simplify your example, going from this initial config:
resource "terraform_data" "old_root" {
}
resource "terraform_data" "child" {
triggers_replace = terraform_data.old_root.id
}
to this config
removed {
  from = terraform_data.old_root

  provisioner "local-exec" {
    when    = destroy
    command = "false"
  }
}

resource "terraform_data" "child" {
}
should not block creating the new child resource.
In order to do that, it would take a very specific set of conditions. The create side of a resource replacement, when create_before_destroy is not being used, would not need to depend on any destroy action other than its own. This could be tricky to implement, however, because the graph building process tries to be generalized for all combinations of operations, and detailed inspections like this are not always convenient. This is something we can look into, however, both to verify correctness and for feasibility. Thanks!
Use ignore_changes: add a lifecycle block with ignore_changes covering triggers_replace to your terraform_data.child resource definition. This tells Terraform to ignore the changes flowing in from old_root.id and treat the stored value as constant, so child is not scheduled for replacement in the first place, regardless of old_root's destruction status.
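A minimal sketch of that suggestion against the reproduction config (note this involves an assumption about intent: ignore_changes entries must name attributes of the resource itself, so it is applied here to the child's own triggers_replace argument):

resource "terraform_data" "child" {
  input            = local.target
  triggers_replace = [local.target]

  lifecycle {
    # Ignore changes to the stored trigger value, so re-rooting
    # local.target no longer schedules child for replacement.
    ignore_changes = [triggers_replace]
  }
}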
@Rishav-Roushan-Infrrd (and others in the future), ignore_changes is one such option to "fix" the problem, as is create_before_destroy. Indeed, the real "fix" (for us) is making sure Terraform has the appropriate IAM permissions.
The purpose of this issue was to raise the fact that, from a DAG perspective, there seemed to be a bug in refusing to create a resource when its previous parent failed to destroy. @jbardin outlined above that, while the DAG is important, Terraform also follows ordering based on what happened on the last Terraform apply:
The order of operations you see here is working as designed, though it is a bit of an awkward case to handle. Destroy actions are still strictly ordered according to the dependencies recorded during the last apply operation, so as far as Terraform is concerned, it must try to delete old_root before replacing child. This is also meant to remain consistent with the ordering if child were updated rather than replaced: the action taken on the final resource is meant to happen after the destroy has completed.
Using other Terraform features to mask the underlying issue only complicates the discussion. 😅 I was looking for an answer as to whether the behavior was a bug or intentional. It is intentional.
I leave it to Hashicorp to close/tag this issue appropriately. 👍
Thanks for that clarification, @KetchupBomb. I'll go ahead and close this issue.
Sorry @crw, I wasn't completely clear. While it is the designed behavior, I'm going to reevaluate this particular detail. Trying to change and replace dependencies at the same time is tricky, so smoothing out the process when possible can help make the changes easier for users without requiring a deep understanding of the details.