
AWS ALB Controller Frontend NLB and external-dns

Open CajuCLC opened this issue 9 months ago • 12 comments

What happened: When creating a frontend NLB using an Ingress, external-dns associates the A record with the ALB DNS name rather than the NLB DNS name.

What you expected to happen: external-dns should create the A record using the NLB DNS name.

How to reproduce it (as minimally and precisely as possible): Enable Frontend NLB on Ingress using AWS ALB Controller: https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.13/guide/ingress/annotations/#enable-frontend-nlb

Make sure the ALB scheme is internal:

alb.ingress.kubernetes.io/scheme: internal

Environment:

  • External-DNS version (use external-dns --version): v0.18.0
  • DNS provider: Route 53

Additional notes

After some debugging I think I figured out the issue. It only happens when the ALB is internal: the internal- prefix is always added to internal ALB hostnames but not to NLB hostnames.

Here is the external-dns log when the ALB is configured as internal:

time="2025-07-18T23:54:54Z" level=debug msg="Modifying endpoint: SUBDOMAIN.MYDOMAIN.COM 0 IN CNAME  internal-k8s-THIS-IS-MY-ALB.us-east-1.elb.amazonaws.com;k8s-THIS-IS-MY-nlb-0000000000.elb.us-east-1.amazonaws.com [], setting alias=true"

Notice both ALB;NLB domains: internal-k8s-THIS-IS-MY-ALB.us-east-1.elb.amazonaws.com;k8s-THIS-IS-MY-nlb-0000000000.elb.us-east-1.amazonaws.com

And here is the log when the ALB is internet-facing:

time="2025-07-18T23:54:54Z" level=debug msg="Modifying endpoint: SUBDOMAIN.MYDOMAIN.COM 0 IN CNAME  k8s-THIS-IS-MY-nlb-0000000000.elb.us-east-1.amazonaws.com;k8s-THIS-IS-MY-ALB.us-east-1.elb.amazonaws.com [], setting alias=true"

Notice both NLB;ALB domains: k8s-THIS-IS-MY-nlb-0000000000.elb.us-east-1.amazonaws.com;k8s-THIS-IS-MY-ALB.us-east-1.elb.amazonaws.com

It seems like external-dns tries to create the record using BOTH DNS names, but Route 53 only accepts one: the first. So when the ALB is internal, its hostname appears first because of alphabetical ordering (internal- sorts before k8s-).

Ideally external-dns should know that the A record should point to the NLB when the frontend NLB is enabled.
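The ordering effect described above can be reproduced in isolation. This is a minimal sketch, not external-dns's actual code: it only assumes that targets end up sorted lexically and that Route 53 honours the first entry of the resulting alias record.

```go
package main

import (
	"fmt"
	"sort"
)

// pickAliasTarget mirrors (approximately) how a single target wins:
// hostnames from the ingress status are sorted lexically, and Route 53
// only honours one alias target per record set.
func pickAliasTarget(hostnames []string) string {
	sorted := append([]string(nil), hostnames...)
	sort.Strings(sorted)
	return sorted[0]
}

func main() {
	// Hostnames as reported in the ingress status when the ALB is internal.
	internal := []string{
		"internal-k8s-THIS-IS-MY-ALB.us-east-1.elb.amazonaws.com",
		"k8s-THIS-IS-MY-nlb-0000000000.elb.us-east-1.amazonaws.com",
	}
	// "i" sorts before "k", so the internal ALB wins and the NLB is dropped.
	fmt.Println(pickAliasTarget(internal))
}
```

With an internal ALB, internal- sorts before k8s-, so the ALB hostname always wins under this assumption.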

CajuCLC avatar Jul 18 '25 22:07 CajuCLC

Share kubernetes manifests with spec.status and all external-dns flags (not helm)

ivankatliarchuk avatar Jul 28 '25 08:07 ivankatliarchuk

> Share kubernetes manifests with spec.status and all external-dns flags (not helm)

I can't share much information. But for the ingress these are the annotations:

    annotations:
      alb.ingress.kubernetes.io/enable-frontend-nlb: "true"
      alb.ingress.kubernetes.io/frontend-nlb-healthcheck-success-codes: "200,404"
      alb.ingress.kubernetes.io/frontend-nlb-healthcheck-protocol: HTTPS
      alb.ingress.kubernetes.io/frontend-nlb-healthcheck-port: "443"
      alb.ingress.kubernetes.io/frontend-nlb-scheme: internal
      alb.ingress.kubernetes.io/scheme: internal
      alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443}]'
      alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:us-east-1:111122223333:certificate/XXXXXXXX-XXX-XXXX-XXXX-XXXXXXXXXXX
      alb.ingress.kubernetes.io/target-type: ip
      alb.ingress.kubernetes.io/backend-protocol: HTTPS
      alb.ingress.kubernetes.io/success-codes: 200,404

I used the Terraform EKS Addon, so these are the args by default:

    Args:
      --log-level=debug
      --log-format=text
      --interval=1m
      --source=service
      --source=ingress
      --policy=upsert-only
      --registry=txt
      --provider=aws

CajuCLC avatar Jul 28 '25 14:07 CajuCLC

Without spec.status it's just guessing. Have you tried the staging Docker image?

ivankatliarchuk avatar Jul 28 '25 18:07 ivankatliarchuk

What spec.status do you need?

CajuCLC avatar Jul 28 '25 19:07 CajuCLC

When a manifest is applied, a controller (like the AWS Load Balancer Controller) populates its status (https://stackoverflow.com/questions/48134304/what-is-the-meaning-of-status-value-in-kubernetes-manifest-file). ExternalDNS reads annotations and these status values to make its decisions.

How ExternalDNS Works & What I Need

Ideally, you could share your manifests as-is (with sensitive values redacted), similar to the output of kubectl get svc <name> -o yaml > alb.yaml, and the same for the NLB.

Alternatively, you could review the Go code at https://github.com/kubernetes-sigs/external-dns/blob/master/source/service.go and suggest a fix.

I'm currently unsure whether this is a bug or a new feature request; it could be that external-dns already supports this scenario. If it's a new feature, unfortunately none of us have AWS access right now to implement it.

ivankatliarchuk avatar Jul 29 '25 08:07 ivankatliarchuk

I know what spec.status is. I was wondering why you need the status for a Service when I mentioned I am using an Ingress. The loadBalancer status will be empty:

apiVersion: v1
kind: Service
metadata:
  annotations:
    meta.helm.sh/release-name: XXXX
    meta.helm.sh/release-namespace: XXXX
  creationTimestamp: "2025-07-08T04:42:07Z"
  labels:
    app.kubernetes.io/managed-by: Helm
  name: XXXXXXX
  namespace: XXXX
  resourceVersion: "23843058"
  uid: XXXXXXXXXXXXXXXXXXXXXXXXX
spec:
  clusterIP: 172.20.112.207
  clusterIPs:
  - 172.20.112.207
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: http
  selector:
    app.kubernetes.io/instance: XXXX
    app.kubernetes.io/name: XXXX-XXXX
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}

Now, when I check my ingress:

kubectl get ingress some-name -o jsonpath='{.status.loadBalancer.ingress[*].hostname}'
internal-k8s-some-domain.us-east-1.elb.amazonaws.com k8s-another-domain-nlb-123456789.elb.us-east-1.amazonaws.com

I am not a Go developer, but maybe this part of the code is responsible: https://github.com/kubernetes-sigs/external-dns/blob/master/source/ingress.go#L343-L356

And then the hostnames get sorted into alphabetical order here: https://github.com/kubernetes-sigs/external-dns/blob/master/source/ingress.go#L181-L186

Again, I might be wrong because I never coded with Go. Soooo. :)
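For readers without the repo open, the linked extraction logic boils down to collecting every address in the ingress status. This is a simplified approximation, with the relevant type redefined locally rather than imported from k8s.io/api:

```go
package main

import "fmt"

// LoadBalancerIngress mirrors the relevant fields of the Kubernetes
// networking.k8s.io/v1 IngressLoadBalancerIngress type.
type LoadBalancerIngress struct {
	IP       string
	Hostname string
}

// targetsFromStatus approximates the linked code: every IP and hostname in
// status.loadBalancer.ingress becomes a target, which is why BOTH the ALB
// and the frontend NLB end up attached to the same endpoint.
func targetsFromStatus(lbs []LoadBalancerIngress) []string {
	var targets []string
	for _, lb := range lbs {
		if lb.IP != "" {
			targets = append(targets, lb.IP)
		}
		if lb.Hostname != "" {
			targets = append(targets, lb.Hostname)
		}
	}
	return targets
}

func main() {
	status := []LoadBalancerIngress{
		{Hostname: "internal-k8s-some-domain.us-east-1.elb.amazonaws.com"},
		{Hostname: "k8s-another-domain-nlb-123456789.elb.us-east-1.amazonaws.com"},
	}
	// Both hostnames are collected; nothing here prefers the NLB.
	fmt.Println(targetsFromStatus(status))
}
```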

CajuCLC avatar Jul 29 '25 18:07 CajuCLC

Here is the ingress yaml:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    alb.ingress.kubernetes.io/backend-protocol: HTTP
    alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:us-east-1:111122223333:certificate/YYYYY-YYYY-YYYY-YYYY-YYYYYYYYYYYY
    alb.ingress.kubernetes.io/enable-frontend-nlb: "true"
    alb.ingress.kubernetes.io/frontend-nlb-healthcheck-port: "443"
    alb.ingress.kubernetes.io/frontend-nlb-healthcheck-protocol: HTTPS
    alb.ingress.kubernetes.io/frontend-nlb-healthcheck-success-codes: 200,404
    alb.ingress.kubernetes.io/frontend-nlb-scheme: internal
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443}]'
    alb.ingress.kubernetes.io/scheme: internal
    alb.ingress.kubernetes.io/success-codes: 200,404
    alb.ingress.kubernetes.io/target-type: ip
    meta.helm.sh/release-name: yyyy
    meta.helm.sh/release-namespace: YYYYYY
  creationTimestamp: "2025-07-21T21:37:51Z"
  finalizers:
  - ingress.k8s.aws/resources
  generation: 1
  labels:
    app.kubernetes.io/instance: yyyy
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: yyyyyyyyyy-yyyyyyy
    app.kubernetes.io/version: 1.12.4
    helm.sh/chart: yyyyyyyyyy-yyyyyyy-0.1.0
  name: yyyyy-yyyyyyyyyy-yyyyyyy
  namespace: yyyyy-yyyy
  resourceVersion: "23777942"
  uid: yyyyyyyyy-yyyyy-yyyyy-yyyyy-yyyyyyyyyyyy
spec:
  ingressClassName: alb
  rules:
  - host: some.subdomain.mydomain.com
    http:
      paths:
      - backend:
          service:
            name: yyyyy-yyyyyyyyyy-yyyyyyy
            port:
              number: 80
        path: /somepath
        pathType: Exact
status:
  loadBalancer:
    ingress:
    - hostname: internal-k8s-some-domain.us-east-1.elb.amazonaws.com
    - hostname: k8s-another-domain-nlb-123456789.elb.us-east-1.amazonaws.com

CajuCLC avatar Jul 29 '25 18:07 CajuCLC

I'm currently facing the same issue. Most likely the problem is that external-dns wants to create an alias record with two entries, and that fails because an alias can only point to a single load balancer at a time.

When frontend NLB is enabled, it seems to me that either only the NLB should be reported in the ingress .status.loadBalancer.ingress, or External DNS would need to be enhanced to deal with this and allow a filter to select which LB to pick when updating the DNS. This filter could be something like a regular-expression filter that matches the NLB or ALB base domain name.
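A sketch of such a filter, to make the proposal concrete. This is entirely hypothetical: external-dns has no such option today, and the function name and pattern are illustrative only.

```go
package main

import (
	"fmt"
	"regexp"
)

// filterTargets keeps only the load balancer hostnames matching a
// user-supplied regular expression, as the proposed filter would.
func filterTargets(hostnames []string, pattern string) []string {
	re := regexp.MustCompile(pattern)
	var out []string
	for _, h := range hostnames {
		if re.MatchString(h) {
			out = append(out, h)
		}
	}
	return out
}

func main() {
	status := []string{
		"internal-k8s-some-domain.us-east-1.elb.amazonaws.com",
		"k8s-another-domain-nlb-123456789.elb.us-east-1.amazonaws.com",
	}
	// Select the NLB by matching its distinctive "-nlb-" name fragment.
	fmt.Println(filterTargets(status, `-nlb-`))
}
```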

cristicalin avatar Jul 30 '25 12:07 cristicalin

> I'm currently facing the same issue. Most likely the problem is that external-dns wants to create an alias record with two entries, and that fails because an alias can only point to a single load balancer at a time.
>
> When frontend NLB is enabled, it seems to me that either only the NLB should be reported in the ingress .status.loadBalancer.ingress, or External DNS would need to be enhanced to deal with this and allow a filter to select which LB to pick when updating the DNS. This filter could be something like a regular-expression filter that matches the NLB or ALB base domain name.

Yep, I mentioned the logs and all that in the original post. I don't think it should only report the NLB. I also think that by default external-dns should use the NLB when enable-frontend-nlb is true. And maybe add another annotation to specify a separate domain for the NLB while the host is used for the ALB, in which case two records would be created. Not sure if that makes sense.

Anyway, one way to work around the issue for now is to name the LBs:

alb.ingress.kubernetes.io/load-balancer-name: "ak8s-rename-my-lbs"

As long as you use a name starting with a letter that sorts before i (from internal-), the NLB DNS name will come first.

CajuCLC avatar Jul 30 '25 13:07 CajuCLC

It's not feasible to tie the ingress source to a specific implementation like the AWS Load Balancer Controller, given the large number of ingress controllers available.

ivankatliarchuk avatar Aug 09 '25 09:08 ivankatliarchuk

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Nov 07 '25 09:11 k8s-triage-robot

When the enable-frontend-nlb annotation is used on an Ingress created by the ALB Controller, two hostnames are visible in status.loadBalancer.ingress: item [0] is the ALB hostname and item [1] is the NLB hostname. The default behavior of external-dns is to always use the first item, so I had to write a custom cronjob that reads the last item of the status and manually updates the external-dns target annotation.

scila1996 avatar Nov 25 '25 08:11 scila1996