terraform-provider-kubernetes

Type casting/type translation errors cause apply to error out for `kubernetes_manifest` resource

Open roshbhatia opened this issue 2 years ago • 9 comments

Terraform Version, Provider Version and Kubernetes Version

Terraform version: v1.4.5
Kubernetes provider version: v2.19.0
Kubernetes version: v1.24

Affected Resource(s)

  • kubernetes_manifest

Terraform Configuration Files

# Resource in the underlying module, beneath a few layers of abstraction.
resource "kubernetes_manifest" "scaled_object" {
  count = var.enable_horizantal_scaling ? 1 : 0

  manifest = {
    apiVersion = "keda.sh/v1alpha1"
    kind       = "ScaledObject"

    metadata = {
      name      = local.hpa_name
      namespace = var.namespace
    }

    spec = {
      pollingInterval = 30
      cooldownPeriod  = 300
      minReplicaCount = var.replicas
      maxReplicaCount = var.replicas_max

      scaleTargetRef = {
        apiVersion = "apps/v1"
        kind       = "Deployment"
        name       = kubernetes_deployment.app.metadata[0].name
      }

      triggers = local.keda_triggers
    }
  }

  depends_on = [
    resource.kubernetes_deployment.app
  ]
}

Debug Output

https://gist.github.com/roshbhatia/e925f4f7fd53beb4e8cf9e4cea0a715c

Steps to Reproduce

  1. terraform apply

Expected Behavior

Manifest should have been created.

Actual Behavior

Type casting/translation errors cause apply to error out.

References

  • GH-1234 (Fix crash in manifest when configuration contains DynamicPseudotyped unknown values.)

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

roshbhatia, Apr 19 '23 16:04

Hi @roshbhatia, could you please share how you are configuring the triggers local variable? Also, the queue_url variable is not set, but an error in the log messages states that it cannot be dynamic. Could you confirm whether this is the actual configuration that produced that log, or are some parts missing?

sheneska, Apr 26 '23 14:04

Hi there, so sorry for the delay.

The triggers are configured as follows, in the same module as the resource I noted above:

  sqs_keda_triggers = var.keda_sqs_config == null ? [] : [
    {
      type = "aws-sqs-queue"

      metadata = {
        queueURL      = try(var.keda_sqs_config.queue_url, "")
        queueLength   = try(var.keda_sqs_config.messages_per_replica, "")
        awsRegion     = data.aws_region.current.name
        identityOwner = "operator"
      }
    }
  ]

queue_url should be set from the output of another module instance for an SQS queue. Here's the block of code for keda_sqs_config, in an instance of the same module that wraps kubernetes_manifest.scaled_object above:

keda_sqs_config = {
    queue_url            = module.assistant_captured_activities_queue.queue_url
    messages_per_replica = 10
}
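
One untested variant that may be worth trying is sketched below. KEDA trigger metadata values are strings, while messages_per_replica is passed here as the number 10, so the sketch wraps each value in tostring() to make the metadata map uniformly string-typed. The local name sqs_keda_triggers_strings is hypothetical, and whether this avoids the type conversion error is an assumption, not a confirmed fix.

```hcl
locals {
  # Hypothetical variant of sqs_keda_triggers (name is illustrative only).
  # Every value taken from keda_sqs_config is converted with tostring(), so
  # the number 10 becomes "10" and all metadata values share the string type.
  sqs_keda_triggers_strings = var.keda_sqs_config == null ? [] : [
    {
      type = "aws-sqs-queue"

      metadata = {
        queueURL      = tostring(try(var.keda_sqs_config.queue_url, ""))
        queueLength   = tostring(try(var.keda_sqs_config.messages_per_replica, ""))
        awsRegion     = data.aws_region.current.name
        identityOwner = "operator"
      }
    }
  ]
}
```

If queue_url is still unknown at plan time, tostring() keeps it unknown but constrains it to a string, which may or may not be enough for the provider.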

I had removed some parts of the log, but they were all lines prior to the one-liner that states the number of resources that changed. No other issues appeared in the logs before what I pasted in the diff; I can grab them sometime tomorrow.

Any ideas what else I could do to debug? Will make sure to grab the unedited log output in the morning.

roshbhatia, Apr 29 '23 04:04

Sorry, I had misunderstood your initial message. I see that you meant the full Terraform configuration, not the full log.

Regarding the missing configuration, we call the module that's throwing the error through a few layers of abstraction.

# Instance of the module wrapping the module containing the problematic resource. Other files in the folder have been omitted.
module "deployment_iad" {
  source           = "../modules/v2-service-deployment"
  container_cpu    = local.container_cpu
  container_memory = local.container_memory

  docker_image_repo = var.docker_image_repo
  docker_image_tag  = var.app_version

  environment_variables = local.common_environment_variables
  environment_code      = local.environment_code
  region_code           = local.region_code_iad

  cluster_identity_oidc_issuer = data.aws_eks_cluster.cluster_iad.identity[0].oidc[0].issuer
  acm_certificate_arn          = data.aws_acm_certificate.environment_certificate_iad.arn

  providers = {
    aws        = aws.iad
    kubernetes = kubernetes.iad
  }

  depends_on = [
    data.aws_eks_cluster.cluster_iad,
    data.aws_eks_cluster_auth.cluster_iad
  ]

  auth0_client_id             = local.auth0_client_id
  auth0_client_secret         = local.auth0_client_secret
  auth0_authentication_domain = local.auth0_authentication_domain
  auth0_management_domain     = local.auth0_management_domain

  auth0_audience = local.auth0_audience
  auth0_issuer   = local.auth0_issuer

  otel_debug = false

  mongo_uri = module.atlas_user.connection_strings["${local.environment_code}-${local.region_code_iad}-regional"]
}

# In a different folder/file, this is the definition for the module being instantiated above in "../modules/v2-service-deployment", which is the module instance setting the queue_url. Other files in the folder have been omitted.
module "worker_assistant_captured_activities" {
  # This is instantiating the module that contains the kubernetes_manifest.scaled_object resource I noted above, which is having type conversion issues.
  source = "[email protected]:pinginc/infra.git//modules/v2-aws-eks-app?ref=v1.99.1" 

  # aca - assistant captured activities (abbreviated due to service_name length limit)
  service_name          = "${local.service_name}-worker-aca"
  region_code           = var.region_code
  environment_code      = var.environment_code
  environment_variables = local.environment_variables

  docker_image_repo       = var.docker_image_repo
  docker_image_tag        = var.docker_image_tag
  docker_image_entrypoint = ["yarn", "worker:assistant-captured-activity"]

  container_cpu    = var.container_cpu
  container_memory = var.container_memory

  cluster_identity_oidc_issuer = var.cluster_identity_oidc_issuer

  port = var.port

  namespace = kubernetes_namespace.service_namespace.metadata[0].name

  replicas = 1

  non_interactive = true
  # empty string viable because this is non interactive
  acm_certificate_arn = ""

  keda_sqs_config = {
    queue_url            = module.assistant_captured_activities_queue.queue_url
    messages_per_replica = 10
  }

  additional_permissions = local.application_permissions
}

roshbhatia, Apr 30 '23 05:04

cc @sheneska ^^ RE: the additional context

roshbhatia, May 02 '23 20:05

Is this related to GH-1234?

roshbhatia, May 02 '23 20:05

This is still happening with 2.20.

icco, May 09 '23 18:05

I have an issue:

│ When applying changes to
│ kubernetes_manifest.cert_manager["/apis/rbac.authorization.k8s.io/v1/namespaces/kube-system/rolebindings/cert-manager-webhook:dynamic-serving"], provider
│ "provider[\"registry.terraform.io/hashicorp/kubernetes\"]" produced an unexpected new value: .object.subjects[0].apiGroup: was cty.StringVal(""), but now null.
│
│ This is a bug in the provider, which should be reported in the provider's own issue tracker.

I'm not sure whether this is the same problem described in this issue.

yongzhang, May 12 '23 08:05

That's a different issue; you should file a separate ticket.

icco, May 17 '23 15:05

Marking this issue as stale due to inactivity. If this issue receives no comments in the next 30 days it will automatically be closed. If this issue was automatically closed and you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. This helps our maintainers find and focus on the active issues. Maintainers may also remove the stale label at their discretion. Thank you!

github-actions[bot], May 17 '24 00:05