
A record is deleted and recreated on each run

Open aravindhkudiyarasan opened this issue 1 year ago • 8 comments

What happened:

external-dns is constantly deleting and re-adding A records of the ingress objects, even though there are no changes made. Only one instance of external-dns is running on EKS and managing the DNS records.

2023-10-05T09:54:31+05:30 time="2023-10-05T04:24:31Z" level=debug msg="Refreshing zones list cache"
2023-10-05T09:54:31+05:30 time="2023-10-05T04:24:31Z" level=debug msg="Considering zone: /hostedzone/Z1MVV30E1X1NPE (domain: example.com.)"
2023-10-05T09:54:31+05:30 time="2023-10-05T04:24:31Z" level=debug msg="Considering zone: /hostedzone/XXXX90XXXXX8WRFMCDKD (domain: dev.example.com.)"
2023-10-05T09:54:31+05:30 time="2023-10-05T04:24:31Z" level=debug msg="Adding route53-example-record.com. to zone dev.example.com. [Id: /hostedzone/XXXX90XXXXX8WRFMCDKD]"
2023-10-05T09:54:31+05:30 time="2023-10-05T04:24:31Z" level=debug msg="Adding prefix.route53-example-record.com. to zone dev.example.com. [Id: /hostedzone/XXXX90XXXXX8WRFMCDKD]"
2023-10-05T09:54:31+05:30 time="2023-10-05T04:24:31Z" level=debug msg="Adding prefix.cname-route53-example-record.com. to zone dev.example.com. [Id: /hostedzone/XXXX90XXXXX8WRFMCDKD]"
2023-10-05T09:54:31+05:30 time="2023-10-05T04:24:31Z" level=debug msg="Adding route53-example-record.com. to zone dev.example.com. [Id: /hostedzone/XXXX90XXXXX8WRFMCDKD]"
2023-10-05T09:54:31+05:30 time="2023-10-05T04:24:31Z" level=debug msg="Adding prefix.route53-example-record.com. to zone dev.example.com. [Id: /hostedzone/XXXX90XXXXX8WRFMCDKD]"
2023-10-05T09:54:31+05:30 time="2023-10-05T04:24:31Z" level=debug msg="Adding prefix.cname-route53-example-record.com. to zone dev.example.com. [Id: /hostedzone/XXXX90XXXXX8WRFMCDKD]"
2023-10-05T09:54:31+05:30 time="2023-10-05T04:24:31Z" level=info msg="Desired change: DELETE route53-example-record.com A [Id: /hostedzone/XXXX90XXXXX8WRFMCDKD]"
2023-10-05T09:54:31+05:30 time="2023-10-05T04:24:31Z" level=info msg="Desired change: DELETE prefix.route53-example-record.com TXT [Id: /hostedzone/XXXX90XXXXX8WRFMCDKD]"
2023-10-05T09:54:31+05:30 time="2023-10-05T04:24:31Z" level=info msg="Desired change: DELETE prefix.cname-route53-example-record.com TXT [Id: /hostedzone/XXXX90XXXXX8WRFMCDKD]"
2023-10-05T09:54:31+05:30 time="2023-10-05T04:24:31Z" level=info msg="Desired change: CREATE route53-example-record.com A [Id: /hostedzone/XXXX90XXXXX8WRFMCDKD]"
2023-10-05T09:54:31+05:30 time="2023-10-05T04:24:31Z" level=info msg="Desired change: CREATE prefix.route53-example-record.com TXT [Id: /hostedzone/XXXX90XXXXX8WRFMCDKD]"
2023-10-05T09:54:31+05:30 time="2023-10-05T04:24:31Z" level=info msg="Desired change: CREATE prefix.cname-route53-example-record.com TXT [Id: /hostedzone/XXXX90XXXXX8WRFMCDKD]"
2023-10-05T09:54:31+05:30 time="2023-10-05T04:24:31Z" level=info msg="6 record(s) in zone dev.example.com. [Id: /hostedzone/XXXX90XXXXX8WRFMCDKD] were successfully updated"

What you expected to happen:

No changes are made to the existing DNS records.

How to reproduce it (as minimally and precisely as possible):

Add the following annotations to an EKS Service of type LoadBalancer:

  annotations:
    external-dns.alpha.kubernetes.io/alias: 'true'
    external-dns.alpha.kubernetes.io/aws-region: eu-central-1
    external-dns.alpha.kubernetes.io/hostname: route53-example-record.com
    service.beta.kubernetes.io/aws-load-balancer-additional-resource-tags: Environment=dev,Service=test
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-healthy-threshold: '2'
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-interval: '5'
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-path: /healthcheck
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-port: traffic-port
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-protocol: http
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-success-codes: '200'
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-timeout: '3'
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-unhealthy-threshold: '2'
    service.beta.kubernetes.io/aws-load-balancer-ip-address-type: ipv4
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
    service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
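
For context, these annotations would sit on a Service like the following minimal sketch; the name, selector, and ports are hypothetical (not from the report), and the remaining health-check annotations from the list above would be added alongside:

```yaml
# Hypothetical minimal Service for reproducing the issue.
# Name, namespace, selector, and ports are illustrative only.
apiVersion: v1
kind: Service
metadata:
  name: example-service
  annotations:
    external-dns.alpha.kubernetes.io/alias: 'true'
    external-dns.alpha.kubernetes.io/aws-region: eu-central-1
    external-dns.alpha.kubernetes.io/hostname: route53-example-record.com
    service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
spec:
  type: LoadBalancer
  selector:
    app: example
  ports:
    - port: 80
      targetPort: 8080
```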

Anything else we need to know?:

Environment:

External-DNS version (use external-dns --version): registry.k8s.io/external-dns/external-dns:v0.13.6 (Helm chart 1.13.1)
DNS provider: Route 53

aravindhkudiyarasan avatar Oct 05 '23 04:10 aravindhkudiyarasan

Same issue.

lukma99 avatar Oct 06 '23 10:10 lukma99

Any update on this bug? We are using external-dns, and in the newer version it is constantly deleting and re-adding the A records of the ingress objects.

aravindhkudiyarasan avatar Oct 26 '23 13:10 aravindhkudiyarasan

We're experiencing a similar issue after updating from 0.13.4 to 0.14.0, but only with a subset of records. A common thread among the domains external-dns misbehaves on is that they all have NS records that are not managed by external-dns.

Log excerpt with example values:

time="2023-11-20T15:58:29Z" level=info msg="Change zone: domain-one-com batch #0"
time="2023-11-20T15:58:29Z" level=info msg="Del records: domain-one.com. A [70.203.58.243] 300"
time="2023-11-20T15:58:29Z" level=info msg="Del records: domain-one.com. TXT [\"heritage=external-dns,external-dns/owner=dns-frontend-prod-428b8056,external-dns/resource=ingress/traefik-ingresses/traefik-apps-domain-one.com\"] 300"
time="2023-11-20T15:58:29Z" level=info msg="Add records: domain-one.com. A [70.203.58.243] 300"
time="2023-11-20T15:58:29Z" level=info msg="Add records: domain-one.com. TXT [\"heritage=external-dns,external-dns/owner=dns-frontend-prod-428b8056,external-dns/resource=ingress/traefik-ingresses/traefik-apps-domain-one.com\"] 300"
time="2023-11-20T15:58:31Z" level=info msg="Change zone: domain-two-com batch #0"
time="2023-11-20T15:58:31Z" level=info msg="Del records: domain-two.com. A [70.203.58.243] 300"
time="2023-11-20T15:58:31Z" level=info msg="Del records: domain-two.com. TXT [\"heritage=external-dns,external-dns/owner=dns-frontend-prod-428b8056,external-dns/resource=ingress/traefik-ingresses/traefik-apps-domain-two\"] 300"
time="2023-11-20T15:58:31Z" level=info msg="Add records: domain-two.com. A [70.203.58.243] 300"
time="2023-11-20T15:58:31Z" level=info msg="Add records: domain-two.com. TXT [\"heritage=external-dns,external-dns/owner=dns-frontend-prod-428b8056,external-dns/resource=ingress/traefik-ingresses/traefik-apps-domain-two\"] 300"
time="2023-11-20T15:58:32Z" level=info msg="Change zone: domain-three-com batch #0"
time="2023-11-20T15:58:32Z" level=info msg="Del records: domain-three.com. A [70.203.58.243] 300"
time="2023-11-20T15:58:32Z" level=info msg="Del records: domain-three.com. TXT [\"heritage=external-dns,external-dns/owner=dns-frontend-prod-428b8056,external-dns/resource=ingress/traefik-ingresses/traefik-apps-domain-three\"] 300"
time="2023-11-20T15:58:32Z" level=info msg="Add records: domain-three.com. A [70.203.58.243] 300"
time="2023-11-20T15:58:32Z" level=info msg="Add records: domain-three.com. TXT [\"heritage=external-dns,external-dns/owner=dns-frontend-prod-428b8056,external-dns/resource=ingress/traefik-ingresses/traefik-apps-domain-three\"] 300"

paul-at-cybr avatar Nov 20 '23 16:11 paul-at-cybr

I had a similar issue, and adding the flag --txt-cache-interval=1h fixed it. Give it a try and see?
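
If you deploy via the Helm chart, the flag can typically be passed through the chart's extraArgs value. A sketch assuming the kubernetes-sigs external-dns chart; verify the field name against your chart version:

```yaml
# values.yaml fragment (assumes the chart exposes an extraArgs list;
# check your chart version's values schema before applying)
extraArgs:
  - --txt-cache-interval=1h
```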

ckt114 avatar Jan 15 '24 22:01 ckt114

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Apr 14 '24 23:04 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar May 14 '24 23:05 k8s-triage-robot

This is still an issue.

/remove-lifecycle rotten

thecmdradama avatar May 15 '24 04:05 thecmdradama

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Aug 13 '24 04:08 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Sep 12 '24 04:09 k8s-triage-robot