terraform-provider-kubernetes

kubernetes_manifest panics when dealing with a list of mismatched object shapes

Open KenFigueiredo opened this issue 2 years ago • 5 comments

Hi there. While testing kubernetes_manifest with a Cluster resource supported by https://github.com/kubernetes-sigs/cluster-api-provider-vsphere, I'm seeing the following failure. It seems to be a byproduct of the provider not properly handling mixed tuples, which are valid for the spec.

Terraform Version, Provider Version and Kubernetes Version

Terraform version: v1.5.2
Kubernetes provider version: v2.21.1
Kubernetes version: v1.24

Affected Resource(s)

  • kubernetes_manifest

Terraform Configuration Files

resource "kubernetes_manifest" "repro_cluster" {
  manifest = {
    apiVersion = "cluster.x-k8s.io/v1beta1"
    kind       = "Cluster"

    metadata = {
      name        = "repro"
      namespace   = var.cluster_namespace
    }

    spec = {
      clusterNetwork = {
        pods          = { cidrBlocks = var.default_pod_cidr_blocks }
        services      = { cidrBlocks = var.default_service_cidr_blocks }
        serviceDomain = "cluster.local"
      }

      topology = {
        class = "tanzukubernetescluster"
        version = "v1.24..."

        controlPlane = {
          replicas     = var.control_plane_replicas
        }

        workers = {
          machineDeployments = [{
            class = "node-pool"
            name = "repro_pool"
            replicas = 1

            variables = {
              overrides = [
                { name = "vmClass", value = "best-effort-large" },
                # Uncommenting this line results in an error:
                # AttributeName("value"): can't use tftypes.String as tftypes.Tuple[tftypes.Object["effect":tftypes.String, "key":tftypes.String, "value":tftypes.String]]
                # { name = "nodePoolTaints", value = [{ key = "foo", effect = "Equals", value = "bar" }] },
              ]
            }
          }]
        }

        variables = [
          # Uncommenting either of these lines results in a panic
          # { name = "nodePoolVolumes", value = [] },
          # { name = "nodePoolTaints", value = [] },
          { name = "storageClass", value = var.default_storage_class },
          { name = "vmClass", value = var.control_plane_vm_class } 
        ]
      }
    }
  }
}

Debug Output

Panic Output

Stack trace from the terraform-provider-kubernetes_v2.21.1_x5 plugin:

panic: interface conversion: tftypes.Type is tftypes.primitive, not tftypes.Tuple

goroutine 317 [running]:
github.com/hashicorp/terraform-provider-kubernetes/manifest/morph.DeepUnknown({0x10640bc28, 0x1400221f6b0?}, {{0x10640bc80?, 0x1400221fbf0?}, {0x106084f00?, 0x14002200b80?}}, 0x140010c19e0)
        github.com/hashicorp/terraform-provider-kubernetes/manifest/morph/scaffold.go:65 +0x16ec
github.com/hashicorp/terraform-provider-kubernetes/manifest/morph.DeepUnknown({0x10640bb78, 0x14002abdb30?}, {{0x10640bb78?, 0x14002be2360?}, {0x1061440a0?, 0x14002be2090?}}, 0x140010c19b0)
        github.com/hashicorp/terraform-provider-kubernetes/manifest/morph/scaffold.go:32 +0x18c4
github.com/hashicorp/terraform-provider-kubernetes/manifest/morph.DeepUnknown({0x10640bc28, 0x14002abdd10?}, {{0x10640bc28?, 0x14002be2d50?}, {0x1060322e0?, 0x140010c0eb8?}}, 0x140010c18c0)
        github.com/hashicorp/terraform-provider-kubernetes/manifest/morph/scaffold.go:81 +0x14e8
github.com/hashicorp/terraform-provider-kubernetes/manifest/morph.DeepUnknown({0x10640bb78, 0x14002b2e2a0?}, {{0x10640bb78?, 0x14002be73b0?}, {0x1061440a0?, 0x14002bdf140?}}, 0x140010c1710)
        github.com/hashicorp/terraform-provider-kubernetes/manifest/morph/scaffold.go:32 +0x18c4
github.com/hashicorp/terraform-provider-kubernetes/manifest/morph.DeepUnknown({0x10640bb78, 0x14002b2e2d0?}, {{0x10640bb78?, 0x14002be7a40?}, {0x1061440a0?, 0x14002bdda10?}}, 0x140010c1548)
        github.com/hashicorp/terraform-provider-kubernetes/manifest/morph/scaffold.go:32 +0x18c4
github.com/hashicorp/terraform-provider-kubernetes/manifest/morph.DeepUnknown({0x10640bb78, 0x14002bdcf30?}, {{0x10640bb78?, 0x14002be82a0?}, {0x1061440a0?, 0x14002bdd1a0?}}, 0x140010c11e8)
        github.com/hashicorp/terraform-provider-kubernetes/manifest/morph/scaffold.go:32 +0x18c4
github.com/hashicorp/terraform-provider-kubernetes/manifest/provider.(*RawProviderServer).PlanResourceChange(0x140011e2e00, {0x1064057a0, 0x140021e4240}, 0x14000f391d0)
        github.com/hashicorp/terraform-provider-kubernetes/manifest/provider/plan.go:369 +0x3144
github.com/hashicorp/terraform-plugin-mux/tf5muxserver.muxServer.PlanResourceChange({0x14001075b30, 0x14001075b90, {0x140012d6880, 0x2, 0x2}, 0x14001075b60, 0x14001064e90, 0x140012aa7e0, 0x14001075bc0}, {0x1064057a0?, ...}, ...)
        github.com/hashicorp/terraform-plugin-mux@[version]/tf5muxserver/mux_server_PlanResourceChange.go:27 +0x108
github.com/hashicorp/terraform-plugin-go/tfprotov5/tf5server.(*server).PlanResourceChange(0x14000212b40, {0x1064057a0?, 0x140021d77d0?}, 0x140040b7c00)
        github.com/hashicorp/terraform-plugin-go@[version]/tfprotov5/tf5server/server.go:783 +0x3b8
github.com/hashicorp/terraform-plugin-go/tfprotov5/internal/tfplugin5._Provider_PlanResourceChange_Handler({0x10630c8e0?, 0x14000212b40}, {0x1064057a0, 0x140021d77d0}, 0x140040b7b90, 0x0)
        github.com/hashicorp/terraform-plugin-go@[version]/tfprotov5/internal/tfplugin5/tfplugin5_grpc.pb.go:367 +0x170
google.golang.org/grpc.(*Server).processUnaryRPC(0x14000c983c0, {0x10640c1c0, 0x140002ee1a0}, 0x140021cd200, 0x140012fbd70, 0x1072a1be8, 0x0)
        google.golang.org/grpc@[version]/server.go:1340 +0xb7c
google.golang.org/grpc.(*Server).handleStream(0x14000c983c0, {0x10640c1c0, 0x140002ee1a0}, 0x140021cd200, 0x0)
        google.golang.org/grpc@[version]/server.go:1713 +0x82c
google.golang.org/grpc.(*Server).serveStreams.func1.2()
        google.golang.org/grpc@[version]/server.go:965 +0x84
created by google.golang.org/grpc.(*Server).serveStreams.func1
        google.golang.org/grpc@[version]/server.go:963 +0x290

Error: The terraform-provider-kubernetes_v2.21.1_x5 plugin crashed!
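
The panic message above ("interface conversion: ... is ..., not ...") is what the Go runtime reports when a single-value type assertion fails. The following is a minimal sketch of that failure mode and of the comma-ok alternative. The Type, Primitive, and Tuple definitions are hypothetical stand-ins written for this illustration; they are not the tftypes definitions, and this is not the code in morph/scaffold.go.

```go
package main

import "fmt"

// Type is a hypothetical stand-in for an interface such as tftypes.Type.
// It is NOT the real definition from terraform-plugin-go.
type Type interface{ Name() string }

// Primitive and Tuple are illustrative concrete types only.
type Primitive struct{ name string }
type Tuple struct{ Elems []Type }

func (p Primitive) Name() string { return p.name }
func (t Tuple) Name() string     { return "Tuple" }

// elemCountUnchecked uses a single-value type assertion.
// It panics at runtime if t does not hold a Tuple.
func elemCountUnchecked(t Type) int {
	return len(t.(Tuple).Elems)
}

// elemCountChecked uses the comma-ok form and returns an error
// instead of panicking when t is not a Tuple.
func elemCountChecked(t Type) (int, error) {
	tup, ok := t.(Tuple)
	if !ok {
		return 0, fmt.Errorf("expected Tuple, got %s", t.Name())
	}
	return len(tup.Elems), nil
}

func main() {
	var t Type = Primitive{name: "String"}

	// The checked variant reports the mismatch as an ordinary error.
	if _, err := elemCountChecked(t); err != nil {
		fmt.Println("error:", err)
	}

	// The unchecked variant panics; recover only to show the message.
	defer func() {
		if r := recover(); r != nil {
			fmt.Println("recovered:", r)
		}
	}()
	elemCountUnchecked(t)
}
```

Whether the fix in the provider takes this form is not stated in this thread; the sketch only shows why a mismatch between the expected and actual type surfaces as a crash rather than a diagnostic.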

Steps to Reproduce

  1. Set up the Terraform file and uncomment the line { name = "nodePoolTaints", value = [{ key = "foo", effect = "Equals", value = "bar" }] }.
  2. Run terraform plan.
  3. The plan errors out with AttributeName("value"): can't use tftypes.String as tftypes.Tuple[tftypes.Object["effect":tftypes.String, "key":tftypes.String, "value":tftypes.String]]
  4. Comment out the previous line and uncomment the line { name = "nodePoolVolumes", value = [] },
  5. Run terraform plan again.
  6. The provider panics.

Expected Behavior

What should have happened?

No panic, terraform plan runs correctly for the valid CRD spec.

Actual Behavior

What actually happened?

Error / Plugin Panic

Important Factoids

  • N/A

References

  • potentially: https://github.com/hashicorp/terraform/issues/22405

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

KenFigueiredo, Jun 29 '23 20:06

Thanks for reporting this!

The fix for this issue is likely in this PR: https://github.com/hashicorp/terraform-provider-kubernetes/pull/2164. The next provider release should include it.

I will try to confirm in advance if this issue falls under the above fix and let you know here.

alexsomesan, Jun 30 '23 14:06

I'm unable to reproduce this issue. I followed the instructions in "Steps to reproduce" and I always get a plan as expected. Here's what I tried:

➜  issue-2170 cat test.tf
variable "cluster_namespace" {
  default = "default"
}

variable "default_pod_cidr_blocks" {
  default = ["0.0.0.0/24"]
}

variable "default_service_cidr_blocks" {
  default = ["0.0.0.0/24"]
}

variable "control_plane_replicas" {
  default = 1
}

variable "control_plane_vm_class" {
  default = "default"
}

variable "default_storage_class" {
  default = "default"
}

resource "kubernetes_manifest" "repro_cluster" {
  manifest = {
    apiVersion = "cluster.x-k8s.io/v1beta1"
    kind       = "Cluster"

    metadata = {
      name        = "repro"
      namespace   = var.cluster_namespace
    }

    spec = {
      clusterNetwork = {
        pods          = { cidrBlocks = var.default_pod_cidr_blocks }
        services      = { cidrBlocks = var.default_service_cidr_blocks }
        serviceDomain = "cluster.local"
      }

      topology = {
        class = "tanzukubernetescluster"
        version = "v1.24..."

        controlPlane = {
          replicas     = var.control_plane_replicas
        }

        workers = {
          machineDeployments = [{
            class = "node-pool"
            name = "repro_pool"
            replicas = 1

            variables = {
              overrides = [
                { name = "vmClass", value = "best-effort-large" },
                # Uncommenting this line results in an error:
                # AttributeName("value"): can't use tftypes.String as tftypes.Tuple[tftypes.Object["effect":tftypes.String, "key":tftypes.String, "value":tftypes.String]]
                { name = "nodePoolTaints", value = [{ key = "foo", effect = "Equals", value = "bar" }] },
              ]
            }
          }]
        }

        variables = [
          # Uncommenting either of these lines results in a panic
          { name = "nodePoolVolumes", value = [] },
          { name = "nodePoolTaints", value = [] },
          { name = "storageClass", value = var.default_storage_class },
          { name = "vmClass", value = var.control_plane_vm_class }
        ]
      }
    }
  }
}

➜  issue-2170 terraform plan
╷
│ Warning: Provider development overrides are in effect
│
│ The following provider development overrides are set in the CLI configuration:
│  - hashicorp/kubernetes in /Users/alex/work/terraform-provider-kubernetes
│
│ The behavior may therefore not match any released version of the provider and applying changes may cause the state to become incompatible with published releases.
╵

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # kubernetes_manifest.repro_cluster will be created
  + resource "kubernetes_manifest" "repro_cluster" {
      + manifest = {
          + apiVersion = "cluster.x-k8s.io/v1beta1"
          + kind       = "Cluster"
          + metadata   = {
              + name      = "repro"
              + namespace = "default"
            }
          + spec       = {
              + clusterNetwork = {
                  + pods          = {
                      + cidrBlocks = [
                          + "0.0.0.0/24",
                        ]
                    }
                  + serviceDomain = "cluster.local"
                  + services      = {
                      + cidrBlocks = [
                          + "0.0.0.0/24",
                        ]
                    }
                }
              + topology       = {
                  + class        = "tanzukubernetescluster"
                  + controlPlane = {
                      + replicas = 1
                    }
                  + variables    = [
                      + {
                          + name  = "nodePoolVolumes"
                          + value = []
                        },
                      + {
                          + name  = "nodePoolTaints"
                          + value = []
                        },
                      + {
                          + name  = "storageClass"
                          + value = "default"
                        },
                      + {
                          + name  = "vmClass"
                          + value = "default"
                        },
                    ]
                  + version      = "v1.24..."
                  + workers      = {
                      + machineDeployments = [
                          + {
                              + class     = "node-pool"
                              + name      = "repro_pool"
                              + replicas  = 1
                              + variables = {
                                  + overrides = [
                                      + {
                                          + name  = "vmClass"
                                          + value = "best-effort-large"
                                        },
                                      + {
                                          + name  = "nodePoolTaints"
                                          + value = [
                                              + {
                                                  + effect = "Equals"
                                                  + key    = "foo"
                                                  + value  = "bar"
                                                },
                                            ]
                                        },
                                    ]
                                }
                            },
                        ]
                    }
                }
            }
        }
      + object   = {
          + apiVersion = "cluster.x-k8s.io/v1beta1"
          + kind       = "Cluster"
          + metadata   = {
              + annotations                = (known after apply)
              + creationTimestamp          = (known after apply)
              + deletionGracePeriodSeconds = (known after apply)
              + deletionTimestamp          = (known after apply)
              + finalizers                 = (known after apply)
              + generateName               = (known after apply)
              + generation                 = (known after apply)
              + labels                     = (known after apply)
              + managedFields              = (known after apply)
              + name                       = "repro"
              + namespace                  = "default"
              + ownerReferences            = (known after apply)
              + resourceVersion            = (known after apply)
              + selfLink                   = (known after apply)
              + uid                        = (known after apply)
            }
          + spec       = {
              + clusterNetwork       = {
                  + apiServerPort = (known after apply)
                  + pods          = {
                      + cidrBlocks = [
                          + "0.0.0.0/24",
                        ]
                    }
                  + serviceDomain = "cluster.local"
                  + services      = {
                      + cidrBlocks = [
                          + "0.0.0.0/24",
                        ]
                    }
                }
              + controlPlaneEndpoint = {
                  + host = (known after apply)
                  + port = (known after apply)
                }
              + controlPlaneRef      = {
                  + apiVersion      = (known after apply)
                  + fieldPath       = (known after apply)
                  + kind            = (known after apply)
                  + name            = (known after apply)
                  + namespace       = (known after apply)
                  + resourceVersion = (known after apply)
                  + uid             = (known after apply)
                }
              + infrastructureRef    = {
                  + apiVersion      = (known after apply)
                  + fieldPath       = (known after apply)
                  + kind            = (known after apply)
                  + name            = (known after apply)
                  + namespace       = (known after apply)
                  + resourceVersion = (known after apply)
                  + uid             = (known after apply)
                }
              + paused               = (known after apply)
              + topology             = {
                  + class        = "tanzukubernetescluster"
                  + controlPlane = {
                      + machineHealthCheck      = {
                          + enable              = (known after apply)
                          + maxUnhealthy        = (known after apply)
                          + nodeStartupTimeout  = (known after apply)
                          + remediationTemplate = {
                              + apiVersion      = (known after apply)
                              + fieldPath       = (known after apply)
                              + kind            = (known after apply)
                              + name            = (known after apply)
                              + namespace       = (known after apply)
                              + resourceVersion = (known after apply)
                              + uid             = (known after apply)
                            }
                          + unhealthyConditions = (known after apply)
                          + unhealthyRange      = (known after apply)
                        }
                      + metadata                = {
                          + annotations = (known after apply)
                          + labels      = (known after apply)
                        }
                      + nodeDeletionTimeout     = (known after apply)
                      + nodeDrainTimeout        = (known after apply)
                      + nodeVolumeDetachTimeout = (known after apply)
                      + replicas                = 1
                    }
                  + rolloutAfter = (known after apply)
                  + variables    = [
                      + {
                          + definitionFrom = (known after apply)
                          + name           = "nodePoolVolumes"
                          + value          = []
                        },
                      + {
                          + definitionFrom = (known after apply)
                          + name           = "nodePoolTaints"
                          + value          = []
                        },
                      + {
                          + definitionFrom = (known after apply)
                          + name           = "storageClass"
                          + value          = "default"
                        },
                      + {
                          + definitionFrom = (known after apply)
                          + name           = "vmClass"
                          + value          = "default"
                        },
                    ]
                  + version      = "v1.24..."
                  + workers      = {
                      + machineDeployments = [
                          + {
                              + class                   = "node-pool"
                              + failureDomain           = (known after apply)
                              + machineHealthCheck      = {
                                  + enable              = (known after apply)
                                  + maxUnhealthy        = (known after apply)
                                  + nodeStartupTimeout  = (known after apply)
                                  + remediationTemplate = {
                                      + apiVersion      = (known after apply)
                                      + fieldPath       = (known after apply)
                                      + kind            = (known after apply)
                                      + name            = (known after apply)
                                      + namespace       = (known after apply)
                                      + resourceVersion = (known after apply)
                                      + uid             = (known after apply)
                                    }
                                  + unhealthyConditions = (known after apply)
                                  + unhealthyRange      = (known after apply)
                                }
                              + metadata                = {
                                  + annotations = (known after apply)
                                  + labels      = (known after apply)
                                }
                              + minReadySeconds         = (known after apply)
                              + name                    = "repro_pool"
                              + nodeDeletionTimeout     = (known after apply)
                              + nodeDrainTimeout        = (known after apply)
                              + nodeVolumeDetachTimeout = (known after apply)
                              + replicas                = 1
                              + strategy                = {
                                  + rollingUpdate = {
                                      + deletePolicy   = (known after apply)
                                      + maxSurge       = (known after apply)
                                      + maxUnavailable = (known after apply)
                                    }
                                  + type          = (known after apply)
                                }
                              + variables               = {
                                  + overrides = [
                                      + {
                                          + definitionFrom = (known after apply)
                                          + name           = "vmClass"
                                          + value          = "best-effort-large"
                                        },
                                      + {
                                          + definitionFrom = (known after apply)
                                          + name           = "nodePoolTaints"
                                          + value          = [
                                              + {
                                                  + effect = "Equals"
                                                  + key    = "foo"
                                                  + value  = "bar"
                                                },
                                            ]
                                        },
                                    ]
                                }
                            },
                        ]
                    }
                }
            }
        }
    }

Plan: 1 to add, 0 to change, 0 to destroy.

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

Note: You didn't use the -out option to save this plan, so Terraform can't guarantee to take exactly these actions if you run "terraform apply" now.

➜  issue-2170 terraform version
Terraform v1.5.2
on darwin_arm64
+ provider registry.terraform.io/hashicorp/kubernetes v2.21.1

alexsomesan, Jun 30 '23 17:06

Hey @alexsomesan - thanks for taking a look at this. I'm realizing that in my steps to reproduce I left out that I'm working from an already applied state, so that could be the missing piece for the crash.

KenFigueiredo, Jul 06 '23 14:07

The panic seems to be resolved after upgrading to v2.22.0 of the provider, but I seem to be hitting another issue related to this one.

│ When applying changes to kubernetes_manifest.repro_cluster,
│ provider "provider[\"registry.terraform.io/hashicorp/kubernetes\"].vsphere_k8s" produced an unexpected new
│ value: .object: wrong final value type: attribute "spec": attribute "topology": attribute "variables": tuple required.
│ 
│ This is a bug in the provider, which should be reported in the provider's own issue tracker.
╵

It seems like this is happening because spec.topology.variables has items added in an unpredictable order. I tried to work around it by manually defining the items in the list in Terraform and marking the field as computed, e.g.:

        variables = [
          { name = "TKR_DATA", value = {} },
          { name = "ntp", value = "" },
          ...
        ]

  computed_fields = [
    "spec.topology.variables"
  ]

but then I run into a similar issue: the expected value type depends on the order of the items, and the Cluster API provider seems to reorder them as it populates the list with default values.

│ Error: Provider produced inconsistent result after apply
│ 
│ When applying changes to module.vsphere_cluster["kentest-cluster-delete-me"].kubernetes_manifest.tanzu_cluster[0],
│ provider "provider[\"registry.terraform.io/hashicorp/kubernetes\"].vsphere_sprocket_mgmt" produced an unexpected new
│ value: .object: wrong final value type: incorrect object attributes.
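
The order sensitivity described above follows from tuple types being positional: each position carries its own element type, so the same entries in a different order form a different type. Below is a minimal sketch of that idea. The Variable struct and tupleSignature function are hypothetical names invented for this illustration, and the model is deliberately simplified; it is not how the provider or Terraform compares types internally.

```go
package main

import (
	"fmt"
	"strings"
)

// Variable is a hypothetical, simplified model of one entry in
// spec.topology.variables: a name plus the type name of its value.
type Variable struct {
	Name      string
	ValueType string // e.g. "String", "Object", "Tuple"
}

// tupleSignature builds a positional type description. Because a tuple
// records one type per position, element order is part of the type.
func tupleSignature(vars []Variable) string {
	parts := make([]string, len(vars))
	for i, v := range vars {
		parts[i] = v.ValueType
	}
	return "Tuple[" + strings.Join(parts, ", ") + "]"
}

func main() {
	// Order as written in the Terraform configuration.
	planned := []Variable{
		{Name: "TKR_DATA", ValueType: "Object"},
		{Name: "ntp", ValueType: "String"},
	}
	// The same entries, assumed to come back from the API reordered.
	returned := []Variable{
		{Name: "ntp", ValueType: "String"},
		{Name: "TKR_DATA", ValueType: "Object"},
	}

	fmt.Println(tupleSignature(planned))  // Tuple[Object, String]
	fmt.Println(tupleSignature(returned)) // Tuple[String, Object]
	fmt.Println("same type:", tupleSignature(planned) == tupleSignature(returned)) // same type: false
}
```

If every value had the same type, reordering would not change the signature, which is why the problem only shows up for lists whose items have mismatched shapes.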

Reference for the cluster variable spec: https://docs.vmware.com/en/VMware-vSphere/8.0/vsphere-with-tanzu-tkg/GUID-69E52B31-6DEC-412D-B60E-FE733156F708.html#default-clusterclass-tanzukubernetescluster-1

Is this something that can be solved at this provider layer?

KenFigueiredo, Aug 07 '23 13:08

Marking this issue as stale due to inactivity. If this issue receives no comments in the next 30 days it will automatically be closed. If this issue was automatically closed and you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. This helps our maintainers find and focus on the active issues. Maintainers may also remove the stale label at their discretion. Thank you!

github-actions[bot], Aug 07 '24 00:08