
External-dns pod crashes and is stuck in a restart loop due to a fatal error related to the ManagedIdentityCredential

Open Shadikho opened this issue 1 year ago • 3 comments

I have an AKS cluster that has been running the Bitnami external-dns Helm chart (based on this project) successfully for quite some time. I don't know exactly when the issue started, but the pod is now stuck in a restart loop and the logs show the following error:

msg="ManagedIdentityCredential: ManagedIdentityCredential: Get \"http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&client_id=..............&resource=https%3A%2F%2Fmanagement.core.windows.net%2F\": context deadline exceeded"
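For anyone trying to reproduce this outside of external-dns: the URL in the log is a token request to the Azure Instance Metadata Service (IMDS), which expects a `Metadata: true` header. The following is only a rough diagnostic sketch, not anything taken from external-dns itself. The helper names `build_imds_token_url` and `probe_imds` are made up for illustration, and the client ID has to be supplied by whoever runs it.

```python
import urllib.error
import urllib.parse
import urllib.request

# Endpoint, api-version and resource are taken from the error message in the log.
IMDS_TOKEN_ENDPOINT = "http://169.254.169.254/metadata/identity/oauth2/token"


def build_imds_token_url(client_id,
                         resource="https://management.core.windows.net/",
                         api_version="2018-02-01"):
    """Rebuild the token URL that appears in the external-dns log (hypothetical helper)."""
    query = urllib.parse.urlencode({
        "api-version": api_version,
        "client_id": client_id,
        "resource": resource,
    })
    return f"{IMDS_TOKEN_ENDPOINT}?{query}"


def probe_imds(url, timeout=5.0, opener=urllib.request.urlopen):
    """Send the request once and return a short status string (hypothetical helper).

    `opener` is injectable only so the function can be exercised without a network.
    """
    # IMDS rejects requests that do not carry this header.
    request = urllib.request.Request(url, headers={"Metadata": "true"})
    try:
        with opener(request, timeout=timeout) as response:
            return f"ok: HTTP {response.status}"
    except urllib.error.HTTPError as err:
        # IMDS answered, but refused the request (for example an unknown identity).
        return f"http error: {err.code}"
    except OSError as err:
        # Covers URLError and timeouts: the endpoint was not reachable in time.
        return f"unreachable: {err}"
```

Calling `probe_imds(build_imds_token_url("<client id>"))` from a pod on the affected node would show whether the endpoint answers at all. An "unreachable" result would be consistent with the "context deadline exceeded" in the log, while an HTTP error would point at the identity rather than the network.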

I am using the following Terraform code to deploy the chart:

resource "helm_release" "external_dns" {
  name       = "external-dns"
  chart      = "external-dns"
  repository = "https://charts.bitnami.com/bitnami"
  namespace  = kubernetes_namespace.main["external-dns"].metadata[0].name
  version    = "6.35.0"
  values     = [file("${path.module}/external_dns_value.yaml")]

  set {
    name  = "provider"
    value = "azure-private-dns"
  }

  set {
    name  = "policy"
    value = "sync"
  }

  set {
    name  = "azure.tenantId"
    value = data.azurerm_client_config.current.tenant_id
  }

  set {
    name  = "azure.subscriptionId"
    value = data.azurerm_client_config.current.subscription_id
  }

  set {
    name  = "azure.resourceGroup"
    value = azurerm_resource_group.main.name
  }

  set {
    name  = "azure.useManagedIdentityExtension"
    value = true
  }

  set {
    name  = "azure.userAssignedIdentityID"
    value = module.aks.kubelet_identity_id
  }

  set {
    name  = "logLevel"
    value = "info"
  }

  set {
    name  = "logFormat"
    value = "text"
  }

  set {
    name  = "sources[0]"
    value = "service"
  }

  set {
    name  = "sources[1]"
    value = "ingress"
  }

}

The file external_dns_value.yaml contains the following:

fqdnTemplates:
- '{{.Name}}.{{.Namespace}}.dns-test.loadbalancer.com'
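To make the template concrete: external-dns evaluates `fqdnTemplates` as Go templates against each source object, so `.Name` and `.Namespace` are the object's name and namespace. The snippet below is only a Python stand-in for that substitution, not the Go template engine; `render_fqdn` and the sample service name are made up for illustration.

```python
import re

# Template copied from external_dns_value.yaml above.
FQDN_TEMPLATE = "{{.Name}}.{{.Namespace}}.dns-test.loadbalancer.com"


def render_fqdn(template, name, namespace):
    """Substitute {{.Name}} and {{.Namespace}} in the template (hypothetical helper)."""
    fields = {"Name": name, "Namespace": namespace}
    # Replace each {{.Field}} placeholder with the matching value.
    return re.sub(r"\{\{\s*\.(\w+)\s*\}\}", lambda m: fields[m.group(1)], template)


# A hypothetical Service named "web" in the "default" namespace.
print(render_fqdn(FQDN_TEMPLATE, "web", "default"))
# prints "web.default.dns-test.loadbalancer.com"
```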

I would appreciate it if anybody could help me with this issue.

Regards,
Shadi


Shadikho, Apr 09 '24 16:04

I'm experiencing the exact same error when deploying via kubectl apply.

WaitingForGuacamole, May 29 '24 17:05

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot, Aug 27 '24 18:08