Circular dependency when using managed resources in a provider config
Creating a Kubernetes cluster, using that resource to configure a kubernetes provider, and then creating a pod appears to create a circular dependency.
The initial apply works, but when a change forces replacement of the pod, the forced destroy fails with a cycle error.
Terraform Version
Terraform v1.1.4
on linux_amd64
+ provider registry.terraform.io/hashicorp/azuread v2.16.0
+ provider registry.terraform.io/hashicorp/azurerm v2.91.0
+ provider registry.terraform.io/hashicorp/kubernetes v2.7.1
+ provider registry.terraform.io/hashicorp/random v3.1.0
and
Terraform v1.1.5
on linux_amd64
+ provider registry.terraform.io/hashicorp/azuread v2.16.0
+ provider registry.terraform.io/hashicorp/azurerm v2.91.0
+ provider registry.terraform.io/hashicorp/kubernetes v2.7.1
+ provider registry.terraform.io/hashicorp/random v3.1.0
Terraform Configuration Files
resource "azurerm_kubernetes_cluster" "services" {
name = "service-aks1"
location = data.azurerm_resource_group.resource_group.location
resource_group_name = data.azurerm_resource_group.resource_group.name
dns_prefix = "aks1"
default_node_pool {
name = "default"
node_count = 1
vm_size = "Standard_D2_v2"
}
identity {
type = "SystemAssigned"
user_assigned_identity_id = azurerm_user_assigned_identity.aks_identity.id
}
tags = {
Environment = var.environment_name
}
}
provider "kubernetes" {
host = azurerm_kubernetes_cluster.services.kube_config.0.host
username = azurerm_kubernetes_cluster.services.kube_config.0.username
password = azurerm_kubernetes_cluster.services.kube_config.0.password
client_certificate = base64decode(azurerm_kubernetes_cluster.services.kube_config.0.client_certificate)
client_key = base64decode(azurerm_kubernetes_cluster.services.kube_config.0.client_key)
cluster_ca_certificate = base64decode(azurerm_kubernetes_cluster.services.kube_config.0.cluster_ca_certificate)
}
resource "kubernetes_pod" "test" {
metadata {
name = "terraform-example2"
}
spec {
container {
image = "nginx:1.7.9"
name = "example"
}
}
}
...
Expected Behavior
The pod should be updated/replaced.
Actual Behavior
│ Error: Cycle: module.infrastructure.module.kubernetes.kubernetes_pod.test (destroy), module.infrastructure.module.kubernetes.azurerm_kubernetes_cluster.services, module.infrastructure.module.kubernetes.provider["registry.terraform.io/hashicorp/kubernetes"]
Steps to Reproduce
1. terraform init
2. terraform apply
3. make some changes to the pod that force replacement
4. terraform apply
Hi @fnordian,
Thanks for filing the issue. The configuration here isn't showing any relationship from the azurerm_kubernetes_cluster to the kubernetes provider. The cycle reported may have been reduced to the minimum size, but could still be due to resources in portions of the configuration which have been left out. If you are unsure where the cycle is arising, can you supply a more complete example of the configuration?
Thanks!
The relationship comes through the provider's parameters, doesn't it?
e.g.
host = azurerm_kubernetes_cluster.services.kube_config.0.host
The dependency in that direction should not result in a cycle on its own, though it could be something to investigate. I suspect, however, that there are other resources at play which may have contributed to the error.
I would like to note here that the configuration as shown is not a recommended pattern, as you are passing managed resources into a provider configuration. Operations on this type of configuration often cannot be completed in a single apply and require breaking up the process using -target, which means it's better to manage the individual configuration layers as separate, independent configurations.
The trace logs from the operation causing the error can also help diagnose the problem. The azurerm provider is quite verbose, but we are only interested in the core logs here, so setting TF_LOG_CORE=trace will get the graph-building details we need. The output is still fairly large though, so it's better to create a separate gist with the content.
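For example, one possible way to capture just the core logs into a file for a gist (the log file name here is only illustrative):

TF_LOG_CORE=trace terraform apply 2> trace.log   # Terraform writes its logs to stderr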
Here's the trace. I've cut away the state dumps containing secrets; hope it's still helpful. I've also attached the graph as an SVG.
https://gist.github.com/fnordian/903c683f7fbe86071fcb4995b680e7eb
@fnordian, you appear to have dropped a character when copying the gist URL.
sorry, fixed. (it was a b)
Thanks @fnordian. I wasn't counting on there being a change to the azurerm_kubernetes_cluster resource itself, but that of course is why the node is present in this cycle after all.
The problem here is a form of what I described earlier, with the managed resources being used in the provider configuration preventing the entire config from being applied in a single operation. Unfortunately this results in a cycle during apply rather than being detected in a way that could be better presented to the user during plan, which is one of the reasons this type of config is not recommended.
The cycle appears because the kubernetes_pod creation depends on the azurerm_kubernetes_cluster, hence the update to azurerm_kubernetes_cluster depends on the destruction of the old kubernetes_pod. Having the provider interposed between these two operations is what introduces the cycle.
I think a workaround here would be to apply a targeted change to azurerm_kubernetes_cluster to ensure it's not present in the dependency graph when the kubernetes resources need to be replaced.
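As a rough sketch of that workaround, using the full resource address shown in the cycle error (adjust the module path and address to your actual configuration):

# Apply only the cluster change first, then run the full apply separately.
terraform apply -target=module.infrastructure.module.kubernetes.azurerm_kubernetes_cluster.services
terraform apply

The targeted apply settles the cluster change on its own, so the subsequent full apply can replace the pod without the cluster update appearing in the same graph.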
Using a data source in between the resource and the provider works around that problem.
Although now I am running into what seems to be https://github.com/hashicorp/terraform-provider-kubernetes/issues/1028
data "azurerm_kubernetes_cluster" "services" {
name = var.cluster_name
resource_group_name = var.resource_group_name
depends_on = [azurerm_kubernetes_cluster.services]
}
provider "kubernetes" {
host = data.azurerm_kubernetes_cluster.services.kube_config.0.host
username = data.azurerm_kubernetes_cluster.services.kube_config.0.username
password = data.azurerm_kubernetes_cluster.services.kube_config.0.password
client_certificate = base64decode(data.azurerm_kubernetes_cluster.services.kube_config.0.client_certificate)
client_key = base64decode(data.azurerm_kubernetes_cluster.services.kube_config.0.client_key)
cluster_ca_certificate = base64decode(data.azurerm_kubernetes_cluster.services.kube_config.0.cluster_ca_certificate)
}
Using the data source does avoid the cycle by disconnecting the direct relationship between the resources, but as you can see, because that relationship is no longer present you are going to have ordering issues associated with having a managed resource and a data source representing the same logical resource in the configuration.
This is not solvable within a single Terraform configuration, so we can use this issue to represent the situation and work on better error reporting to help direct users to working configurations. The fact that Terraform doesn't fail until apply, and then only reports a hard-to-understand cycle, is definitely a usability concern. The recommended solution is still going to be to use multiple independent configurations so that the lifecycles of the resources are not so closely tied together.
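To illustrate the multi-configuration approach, the Kubernetes workloads could live in their own root module that only looks up the already-created cluster. This is a minimal sketch; the variable names are placeholders:

# Workload-only configuration: the AKS cluster is created by a separate
# configuration, so here it is only read, never managed.
data "azurerm_kubernetes_cluster" "services" {
  name                = var.cluster_name        # placeholder input variable
  resource_group_name = var.resource_group_name # placeholder input variable
}

# The provider now depends only on a data source, not on a managed resource.
provider "kubernetes" {
  host                   = data.azurerm_kubernetes_cluster.services.kube_config.0.host
  client_certificate     = base64decode(data.azurerm_kubernetes_cluster.services.kube_config.0.client_certificate)
  client_key             = base64decode(data.azurerm_kubernetes_cluster.services.kube_config.0.client_key)
  cluster_ca_certificate = base64decode(data.azurerm_kubernetes_cluster.services.kube_config.0.cluster_ca_certificate)
}

Because the cluster resource is not managed in this configuration, there is no depends_on tying the data source back to a managed resource, and the cluster's lifecycle is handled entirely by the other configuration.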
Please note that the failure doesn't show on the initial apply, but only on the update.
As provider declarations allow references to other resources, it's hard to understand where the limits of Terraform's dependency resolution lie. Could you point to documentation that explains it?
At least from a (my) user's perspective, it would be great if Terraform were able to handle these situations properly and not rely on the user for orchestration.
For reference, this is documented in Provider Configuration:
You can use expressions in the values of these configuration arguments, but can only reference values that are known before the configuration is applied. This means you can safely reference input variables, but not attributes exported by resources
Due to compatibility constraints we are not able to statically detect and error out on these types of references, but the management of these multi-layered configurations is something we're thinking about approaching via other means.
I have the same problem. How can it be solved? Thanks.
I'm running into the same issue.
It would be interesting to understand more about the internals and why providers can't support attributes exported by resources.
Also in reference to
the management of these multi-layered configurations is something we're thinking about approaching via other means.
What approach are you evaluating for this? Is there any roadmap / timeline?
The possibility to lazily load/configure providers that depend on dynamic information would greatly improve the user experience.
Hi @GiuseppeChiesa-TomTom,
Most instances of this type of cycle should be fixed in a current release (or an upcoming release if it has been triggered by v1.3-specific changes).
The underlying problem with this setup is when a provider requires configuration to create a plan (common with providers like kubernetes), and that configuration depends on a resource attribute which is unknown during the plan. The only way around this is to separately apply the resource changes, then plan again using the dependent provider. With the given design of Terraform, planning and applying these individually is currently best done with separate configurations.
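Operationally, a rough sketch of that split, assuming two hypothetical root modules named cluster/ and workloads/:

# Apply the cluster configuration first, so its outputs are fully known...
cd cluster && terraform apply
# ...then plan and apply the Kubernetes workloads against the existing cluster.
cd ../workloads && terraform apply

The directory names are only illustrative; the point is that each configuration is planned and applied on its own, so the kubernetes provider is never configured from values that are still unknown.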
Unfortunately we don't have a public roadmap, but considering the experimental nature of any new approaches, it would be hard to offer a timeline.
@fnordian I had a very similar issue with a managed resource in a provider config and v1.3.4 just fixed it
Closing since the cycle errors should be resolvable in current releases. The logical problems of using a managed resource in a provider configuration still stand, but that is outside of this issue.
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.