terraform-provider-helm

Kubernetes auth expires during long terraform run

Open ddvdozuki opened this issue 3 years ago • 1 comments

Terraform, Provider, Kubernetes and Helm Versions

Terraform version: 1.0.6
Provider version: 2.3.0
Kubernetes version: 1.20

Affected Resource(s)

  • helm_release

Terraform Configuration Files

provider "helm" {
  kubernetes {
    host                   = data.aws_eks_cluster.main.endpoint
    cluster_ca_certificate = base64decode(data.aws_eks_cluster.main.certificate_authority.0.data)
    token                  = data.aws_eks_cluster_auth.main.token
    exec {
      api_version = "client.authentication.k8s.io/v1alpha1"
      args        = ["eks", "get-token", "--cluster-name", data.aws_eks_cluster.main.name]
      command     = "aws"
    }
  }
}
resource "aws_msk_cluster" "this" {
  count = var.enable_webhooks ? 1 : 0

  cluster_name           = "${local.identifier}-kafka"
  kafka_version          = "2.7.0"
  number_of_broker_nodes = var.azs_count

  broker_node_group_info {
    instance_type   = "kafka.t3.small"
    ebs_volume_size = 50
    client_subnets  = data.aws_subnets.private.ids
    security_groups = [join("", aws_security_group.this.*.id)]
  }

  configuration_info {
    arn      = aws_msk_configuration.this[0].arn
    revision = aws_msk_configuration.this[0].latest_revision
  }
  encryption_info {
    encryption_in_transit {
      client_broker = "TLS_PLAINTEXT"
    }
  }

  open_monitoring {
    prometheus {
      jmx_exporter {
        enabled_in_broker = true
      }
      node_exporter {
        enabled_in_broker = true
      }
    }
  }

  tags = local.tags
}
resource "helm_release" "frontegg" {
  count = var.enable_webhooks ? 1 : 0

  depends_on = [
    helm_release.mongodb,
    helm_release.redis
  ]

  name  = "frontegg"
  chart = "charts/connectivity"

  namespace = "default"

  reuse_values = true

  values = [
    file("static/webhooks_values.yml")
  ]
  // - Kafka - //
  set {
    name  = "webhook-service.messageBroker.brokerList"
    value = replace(aws_msk_cluster.this[0].bootstrap_brokers,",","\\,")
  }
  set {
    name  = "event-service.messageBroker.brokerList"
    value = replace(aws_msk_cluster.this[0].bootstrap_brokers,",","\\,")
  }
  set {
    name  = "integrations-service.messageBroker.brokerList"
    value = replace(aws_msk_cluster.this[0].bootstrap_brokers,",","\\,")
  }
  set {
    name  = "connectors-worker.messageBroker.brokerList"
    value = replace(aws_msk_cluster.this[0].bootstrap_brokers,",","\\,")
  }
}

Debug Output

https://gist.github.com/ddvdozuki/f39aed0657761a768d13e7684054b859

Steps to Reproduce

  1. terraform apply

Expected Behavior

The Helm release should be installed successfully.

Actual Behavior

The install fails, presumably because the Kubernetes auth token has expired.

Important Factoids

Running the apply a second time succeeds without problems, but on that run it no longer has to wait 30+ minutes for the MSK cluster to deploy.

I think the main issue is the ordering: some Helm releases are installed at the beginning of the run, the MSK cluster then takes about 30 minutes to provision, and only after that is the remaining Helm release installed. My guess is that the auth expires somewhere in those 30 minutes, so the last release fails.
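
One possible workaround, sketched here and not tested: the provider block above sets both a static `token` (from `data.aws_eks_cluster_auth`) and an `exec` block. Tokens of that kind are short-lived (typically valid for about 15 minutes), so if the static token is the one being used, it would be stale by the time the MSK cluster finishes. Assuming that is the cause, dropping `token` and keeping only the `exec` block might help, since the credential would then be obtained through `aws eks get-token` when the provider needs it. The data source names are the ones from the configuration above; which of the two credentials the provider actually prefers is an assumption, not something confirmed here.

```hcl
provider "helm" {
  kubernetes {
    host                   = data.aws_eks_cluster.main.endpoint
    cluster_ca_certificate = base64decode(data.aws_eks_cluster.main.certificate_authority.0.data)

    # No static `token` argument here. Assumption: the token read from
    # data.aws_eks_cluster_auth early in the run is what expires.
    # Credentials are requested through the exec plugin instead.
    exec {
      api_version = "client.authentication.k8s.io/v1alpha1"
      args        = ["eks", "get-token", "--cluster-name", data.aws_eks_cluster.main.name]
      command     = "aws"
    }
  }
}
```

This requires the `aws` CLI to be installed and authenticated on the machine that runs `terraform apply`.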

References

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

ddvdozuki, Sep 07 '21 21:09

Marking this issue as stale due to inactivity. If this issue receives no comments in the next 30 days it will automatically be closed. If this issue was automatically closed and you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. This helps our maintainers find and focus on the active issues. Maintainers may also remove the stale label at their discretion. Thank you!

github-actions[bot], Sep 08 '22 00:09

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions[bot], Nov 07 '22 02:11