terraform-provider-kubernetes

Non-deterministic failure to morph when creating a number of CRDs

Open toddgardner opened this issue 3 years ago • 11 comments

Terraform Version, Provider Version and Kubernetes Version

Terraform version: 1.0.3
Kubernetes provider version: 2.4.1
Kubernetes version:
Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.4", GitCommit:"d360454c9bcd1634cf4cc52d1867af5491dc9c5f", GitTreeState:"clean", BuildDate:"2020-11-14T14:49:35Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"20+", GitVersion:"v1.20.7-eks-d88609", GitCommit:"d886092805d5cc3a47ed5cf0c43de38ce442dfcb", GitTreeState:"clean", BuildDate:"2021-07-31T00:29:12Z", GoVersion:"go1.15.12", Compiler:"gc", Platform:"linux/amd64"}

Affected Resource(s)

  • kubernetes_manifest

Terraform Configuration Files

First file in: https://gist.github.com/toddgardner/7ccdb15fa7c82382587324b7e3f29fc9

(It requires >3 CRDs in my testing to be consistent, so it's quite long.)

Full reproducer repo here: https://github.com/toddgardner/terraform_k8s_failure_to_morph

Debug Output

Second file in: https://gist.github.com/toddgardner/7ccdb15fa7c82382587324b7e3f29fc9

Panic Output

Does not panic

Steps to Reproduce

  1. git clone https://github.com/toddgardner/terraform_k8s_failure_to_morph.git
  2. cd terraform_k8s_failure_to_morph
  3. terraform init
  4. terraform apply

Expected Behavior

It should have created the CRDs

Actual Behavior

It produced an error

Important Factoids

This was a difficult one to reduce to something simple because each CRD creates fine on its own. It starts failing with around any 3 of the CRDs, and then fails consistently with higher numbers; I chose to put 10 CRDs in the reproducer so it was more consistent.

In the debug output you can see it failed on 7 of the 10. For example, "customresourcedefinition_backendpolicies_networking_x_k8s_io" failed. However, if we limit the file to just that resource, it succeeds; here's a debug run that shows that: https://gist.github.com/toddgardner/56d8bd140f23099f07e1634d6c7fb766

So it does not appear to be related to the content of the CRDs but the number.

This is on a stock EKS cluster with very little on it. The CRDs are from cert-manager and contour which I generated with tfk8s on their respective default installs.

References

  • Filed this under the other repo: https://github.com/hashicorp/terraform-provider-kubernetes-alpha/issues/261

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

toddgardner avatar Aug 04 '21 18:08 toddgardner

I too have encountered this while trying to terraform the knative CRDs (if you need another test case).

curtbushko avatar Aug 24 '21 18:08 curtbushko

Has anyone got a work around for this?

mcfedr avatar Feb 14 '22 07:02 mcfedr

This is a regression for me in v2.8; the resource was working fine with 2.7.1. I guess it is related to https://github.com/hashicorp/terraform-provider-kubernetes/pull/1590

I have a cert manager issuer config like this

resource "kubernetes_manifest" "issuer_prod" {
  manifest = {
    apiVersion = "cert-manager.io/v1"
    kind       = "ClusterIssuer"
    metadata = {
      name = "letsencrypt-prod"
    }
    spec = {
      acme = {
        server = "https://acme-v02.api.letsencrypt.org/directory"
        email  = var.le_email
        privateKeySecretRef = {
          name = "letsencrypt-prod"
        }
        solvers = [
          {
            dns01 = {
              digitalocean = {
                tokenSecretRef = {
                  name = kubernetes_secret.do.metadata[0].name
                  key  = "access-token"
                }
              }
            }
            selector = {
              dnsZones = [
                "example.com",
                "example1.com",
              ]
            }
          },
          {
            http01 = {
              ingress = {
                class = var.ingress_class
              }
            }
          },
        ]
      }
    }
  }
}

the full error

 Error: Failed to morph manifest to OAPI type
│
│   with module.cert_manager.kubernetes_manifest.issuer_prod,
│   on cert_manager/issuers.tf line 49, in resource "kubernetes_manifest" "issuer_prod":
│   49: resource "kubernetes_manifest" "issuer_prod" {
│
│ AttributeName("spec"): [AttributeName("spec")] failed to morph object element into object element: AttributeName("spec").AttributeName("acme"): [AttributeName("spec").AttributeName("acme")] failed to morph object
│ element into object element: AttributeName("spec").AttributeName("acme").AttributeName("solvers"): [AttributeName("spec").AttributeName("acme").AttributeName("solvers")] failed to morph object element into object
│ element: AttributeName("spec").AttributeName("acme").AttributeName("solvers"): [AttributeName("spec").AttributeName("acme").AttributeName("solvers")] failed to morph list into tuple (length mismatch)
╵

It seems to be trying to convert the solvers list from a tuple to a list, so I tried wrapping it in tolist, but that fails with "cannot convert tuple to list of any single type". It's also not clear from the error whether it comes from reading the HCL or from reading the state from k8s. The resource is already there; I'm trying to edit a different resource in my module.

mcfedr avatar Feb 14 '22 07:02 mcfedr

@mcfedr We had the same issue come up yesterday with 2.8.0 and opened a GitHub issue: #1603, although it could have a different underlying cause.

We ran into the error when our Kubernetes provider jumped from 2.6.1 (our last successful run) to 2.8.0. Interestingly, the issue was resolved by upgrading the provider sequentially: 2.6.1 -> 2.7.1 -> 2.8.0. Not sure if this solution is applicable, as you are moving from 2.7.1 to 2.8.0.

hyha0310 avatar Feb 14 '22 19:02 hyha0310

We just had the same problem on a different manifest resource, but also on a list and with the same error message: "failed to morph list into tuple (length mismatch)".

│  142: resource "kubernetes_manifest" "datadog_agent" {
│ 
│ AttributeName("spec"): [AttributeName("spec")] failed to morph object
│ element into object element:
│ AttributeName("spec").AttributeName("clusterChecksRunner"):
│ [AttributeName("spec").AttributeName("clusterChecksRunner")] failed to
│ morph object element into object element:
│ AttributeName("spec").AttributeName("clusterChecksRunner").AttributeName("config"):
│ [AttributeName("spec").AttributeName("clusterChecksRunner").AttributeName("config")]
│ failed to morph object element into object element:
│ AttributeName("spec").AttributeName("clusterChecksRunner").AttributeName("config").AttributeName("volumes"):
│ [AttributeName("spec").AttributeName("clusterChecksRunner").AttributeName("config").AttributeName("volumes")]
│ failed to morph object element into object element:
│ AttributeName("spec").AttributeName("clusterChecksRunner").AttributeName("config").AttributeName("volumes"):
│ [AttributeName("spec").AttributeName("clusterChecksRunner").AttributeName("config").AttributeName("volumes")]
│ failed to morph list into tuple (length mismatch)

The issue came up after upgrading the provider from 2.7.1 to 2.8.0. Going back to 2.7.1 solves it, but we cannot upgrade.
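
A minimal sketch of holding the provider at the last version reported as working in this thread (2.7.1). The `hashicorp/kubernetes` source address is the provider's registry address; the version to pin is an assumption taken from the comments above, not a confirmed fix.

```hcl
terraform {
  required_providers {
    kubernetes = {
      source = "hashicorp/kubernetes"
      # Pinned to the version reported as working in this thread;
      # relax this constraint once the regression is resolved.
      version = "2.7.1"
    }
  }
}
```

After changing the constraint, `terraform init -upgrade` lets Terraform re-select the provider version instead of reusing the one recorded in the lock file.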

miguelaferreira avatar Feb 15 '22 20:02 miguelaferreira

I've run into this issue as well, and this was the workaround for me. I still don't know what's really causing it; most likely the bugs mentioned above. It was working fine with 2.8.0, then I believe a plan was accidentally run with an older version of the provider, and the error appeared. I rolled back to 2.7.1, planned and applied, then went back to 2.8.0 and it worked fine.

Error: Failed to morph manifest to OAPI type

AttributeName("spec"): [AttributeName("spec")] failed to morph object element
into object element: AttributeName("spec").AttributeName("http"):
[AttributeName("spec").AttributeName("http")] failed to morph object element
into object element:
AttributeName("spec").AttributeName("http").ElementKeyInt(0):
[AttributeName("spec").AttributeName("http").ElementKeyInt(0)] failed to morph
list element into list element:
AttributeName("spec").AttributeName("http").ElementKeyInt(0).AttributeName("redirect"):
[AttributeName("spec").AttributeName("http").ElementKeyInt(0).AttributeName("redirect")]
failed to morph object element into object element:
AttributeName("spec").AttributeName("http").ElementKeyInt(0).AttributeName("redirect").AttributeName("derivePort"):
[AttributeName("spec").AttributeName("http").ElementKeyInt(0).AttributeName("redirect").AttributeName("derivePort")]
failed to morph object element into object element:
AttributeName("spec").AttributeName("http").ElementKeyInt(0).AttributeName("redirect").AttributeName("derivePort"):
type is nil

hilyas avatar Feb 25 '22 12:02 hilyas

Also seeing this when upgrading from 2.7.1 to 2.8.0

╷
│ Error: Failed to morph manifest to OAPI type
│
│   with module.elasticsearch["staging"].kubernetes_manifest.elasticsearch_elasticsearch,
│   on modules/elasticsearch/elasticsearch.tf line 76, in resource "kubernetes_manifest" "elasticsearch_elasticsearch":
│   76: resource "kubernetes_manifest" "elasticsearch_elasticsearch" {
│
│ AttributeName("spec"): [AttributeName("spec")] failed to morph object element into object element: AttributeName("spec").AttributeName("nodeSets"): [AttributeName("spec").AttributeName("nodeSets")] failed to morph object element into object element: AttributeName("spec").AttributeName("nodeSets"): [AttributeName("spec").AttributeName("nodeSets")] failed to morph list into tuple (length mismatch)
╵
╷
│ Error: Failed to morph manifest to OAPI type
│
│   with module.kafka_cluster.kubernetes_manifest.kafka_kafka_strimzi,
│   on modules/kafka/kafka.tf line 1, in resource "kubernetes_manifest" "kafka_kafka_strimzi":
│    1: resource "kubernetes_manifest" "kafka_kafka_strimzi" {
│
│ AttributeName("spec"): [AttributeName("spec")] failed to morph object element into object element: AttributeName("spec").AttributeName("kafka"): [AttributeName("spec").AttributeName("kafka")] failed to morph object element into object element: AttributeName("spec").AttributeName("kafka").AttributeName("listeners"): [AttributeName("spec").AttributeName("kafka").AttributeName("listeners")] failed to morph object element into object element: AttributeName("spec").AttributeName("kafka").AttributeName("listeners"):
│ [AttributeName("spec").AttributeName("kafka").AttributeName("listeners")] failed to morph list into tuple (length mismatch)
╵

warwick-mitchell1 avatar Feb 28 '22 12:02 warwick-mitchell1

Found an only slightly annoying workaround: remove the resource from state and then re-import it.

terraform state rm module.kafka_cluster.kubernetes_manifest.kafka_kafka_strimzi
terraform import module.kafka_cluster.kubernetes_manifest.kafka_kafka_strimzi 'apiVersion=kafka.strimzi.io/v1beta2,kind=Kafka,namespace=kafka,name=kafka-strimzi'
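
On Terraform 1.5 or later (newer than the versions used in this thread), the import half of this workaround can probably also be written declaratively with an `import` block. This is a sketch under that assumption: the resource address and import ID are copied from the commands above, and the `terraform state rm` step would still be run by hand first.

```hcl
# Assumes Terraform >= 1.5; import blocks are only allowed in the root module.
import {
  # Address of the resource that was removed from state.
  to = module.kafka_cluster.kubernetes_manifest.kafka_kafka_strimzi

  # Same ID string as in the `terraform import` command above.
  id = "apiVersion=kafka.strimzi.io/v1beta2,kind=Kafka,namespace=kafka,name=kafka-strimzi"
}
```

With the block in place, the next `terraform plan` should show the resource as being imported, and the block can be deleted after a successful apply.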

warwick-mitchell1 avatar Feb 28 '22 12:02 warwick-mitchell1

I ran into this issue and upon digging deeper, the issue was that there was a typo in my manifest file which made it invalid.

It's possible your manifest is wrong and kubernetes is refusing it; the Failed to morph manifest to OAPI type error might just be an obscure way that this issue is being bubbled up.

huguesalary avatar May 20 '22 19:05 huguesalary

@toddgardner The configuration originally reported in the issue description works as expected with the latest provider version (2.11.0) as long as the preserveUnknownFields attribute is removed (deprecated in apiextensions.k8s.io/v1 CRD).

Alternatively, this field can be marked as computed, but it in fact serves no purpose.

resource "kubernetes_manifest" "customresourcedefinition_..." {
  computed_fields = [ "spec.preserveUnknownFields" ]
  manifest = {
...
Plan: 10 to add, 0 to change, 0 to destroy.
kubernetes_manifest.customresourcedefinition_httpproxies_projectcontour_io: Creating...
kubernetes_manifest.customresourcedefinition_extensionservices_projectcontour_io: Creating...
kubernetes_manifest.customresourcedefinition_tlscertificatedelegations_projectcontour_io: Creating...
kubernetes_manifest.customresourcedefinition_backendpolicies_networking_x_k8s_io: Creating...
kubernetes_manifest.customresourcedefinition_gatewayclasses_networking_x_k8s_io: Creating...
kubernetes_manifest.customresourcedefinition_certificaterequests_cert_manager_io: Creating...
kubernetes_manifest.customresourcedefinition_contours_operator_projectcontour_io: Creating...
kubernetes_manifest.customresourcedefinition_gateways_networking_x_k8s_io: Creating...
kubernetes_manifest.customresourcedefinition_httproutes_networking_x_k8s_io: Creating...
kubernetes_manifest.customresourcedefinition_tcproutes_networking_x_k8s_io: Creating...
kubernetes_manifest.customresourcedefinition_httpproxies_projectcontour_io: Creation complete after 4s
kubernetes_manifest.customresourcedefinition_tlscertificatedelegations_projectcontour_io: Creation complete after 5s
kubernetes_manifest.customresourcedefinition_extensionservices_projectcontour_io: Creation complete after 6s
kubernetes_manifest.customresourcedefinition_backendpolicies_networking_x_k8s_io: Creation complete after 6s
kubernetes_manifest.customresourcedefinition_gatewayclasses_networking_x_k8s_io: Creation complete after 6s
kubernetes_manifest.customresourcedefinition_certificaterequests_cert_manager_io: Creation complete after 6s
kubernetes_manifest.customresourcedefinition_contours_operator_projectcontour_io: Creation complete after 7s
kubernetes_manifest.customresourcedefinition_gateways_networking_x_k8s_io: Creation complete after 7s
kubernetes_manifest.customresourcedefinition_httproutes_networking_x_k8s_io: Creation complete after 5s
kubernetes_manifest.customresourcedefinition_tcproutes_networking_x_k8s_io: Creation complete after 5s

Apply complete! Resources: 10 added, 0 changed, 0 destroyed.
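
For illustration, a fuller sketch of the `computed_fields` alternative mentioned above. The CRD itself (`widgets.example.com`, kind `Widget`) is hypothetical and not taken from the reproducer; only the `computed_fields` entry mirrors the fragment in this comment. Removing `preserveUnknownFields` from the manifest altogether remains the simpler option.

```hcl
# Hypothetical CRD used only to show where computed_fields goes.
resource "kubernetes_manifest" "customresourcedefinition_widgets_example_com" {
  # Setting computed_fields replaces the provider's default list, so
  # only the field named here is treated as computed.
  computed_fields = ["spec.preserveUnknownFields"]

  manifest = {
    apiVersion = "apiextensions.k8s.io/v1"
    kind       = "CustomResourceDefinition"
    metadata = {
      name = "widgets.example.com"
    }
    spec = {
      group = "example.com"
      names = {
        kind   = "Widget"
        plural = "widgets"
      }
      # Deprecated in apiextensions.k8s.io/v1; kept here only to mirror
      # the kind of configuration reported in this issue.
      preserveUnknownFields = false
      scope                 = "Namespaced"
      versions = [
        {
          name    = "v1"
          served  = true
          storage = true
          schema = {
            openAPIV3Schema = {
              type = "object"
            }
          }
        },
      ]
    }
  }
}
```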

alexsomesan avatar May 21 '22 06:05 alexsomesan

For the other issues mentioned in the comments, could I bother you to open individual GH issues and more importantly, provide some sample configuration that reproduces the issue?

The presumed causes of some of the issues in the comments are not related to the OG report. I'd like to look into each case in its own context.

Thanks!

alexsomesan avatar May 21 '22 06:05 alexsomesan

Marking this issue as stale due to inactivity. If this issue receives no comments in the next 30 days it will automatically be closed. If this issue was automatically closed and you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. This helps our maintainers find and focus on the active issues. Maintainers may also remove the stale label at their discretion. Thank you!

github-actions[bot] avatar May 22 '23 00:05 github-actions[bot]