external-dns

TXT records created for aliases in AWS Route 53 have wrong record type prefix

Open seh opened this issue 3 years ago • 35 comments

What happened:

When using the "aws" provider to create DNS records for hostnames that point at AWS ELBs (such as for endpoints extracted from a Kubernetes Service or Ingress), the hostnames don't parse as IP addresses, so ExternalDNS decides that the endpoints warrant a record of type CNAME. As the target hostname discovered from the Ingress's status sits within a canonical hosted zone, ExternalDNS decides that the record should be an alias to the target ELB's DNS record. Later, when composing the changes to send to the Route 53 service, ExternalDNS changes its mind and decides to use an A record instead. At that point, ExternalDNS leaves the endpoint.Endpoint's "RecordType" field's value as the original endpoint.RecordTypeCNAME ("CNAME").

That sets us up to create an A record for an endpoint.Endpoint that still represents a CNAME record. ExternalDNS then goes on to add the TXT ownership records to the change batch, and consults the endpoint.Endpoint's "RecordType" field, finding it to be "CNAME." This leads to a TXT record prefix of "cname-" even though it should probably be "a-" instead, if the goal is to have the TXT records indicate which of several primary records they describe.
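The mismatch described above can be modeled in a few lines. This is a minimal sketch, not ExternalDNS's actual code: the `Endpoint` type, `providerRecordType`, and `txtOwnershipName` are hypothetical stand-ins, and the ELB suffix check is an assumption used only to imitate the provider's alias decision.

```go
package main

import (
	"fmt"
	"strings"
)

// Endpoint is a simplified, hypothetical stand-in for endpoint.Endpoint.
type Endpoint struct {
	DNSName    string
	Target     string
	RecordType string
}

// providerRecordType imitates the provider's late decision: a CNAME whose
// target looks like an ELB hostname is written as an alias A record.
// The endpoint itself is left unmodified. The suffix check is an
// assumption for illustration only.
func providerRecordType(ep Endpoint) string {
	if ep.RecordType == "CNAME" && strings.HasSuffix(ep.Target, ".elb.amazonaws.com") {
		return "A"
	}
	return ep.RecordType
}

// txtOwnershipName imitates the registry: it derives the TXT record name
// from the endpoint's RecordType field, not from what the provider writes.
func txtOwnershipName(ep Endpoint) string {
	return strings.ToLower(ep.RecordType) + "-" + ep.DNSName
}

func main() {
	ep := Endpoint{
		DNSName:    "app.example.com",
		Target:     "internal-example.us-east-1.elb.amazonaws.com",
		RecordType: "CNAME",
	}
	fmt.Println("primary record type:", providerRecordType(ep))
	fmt.Println("TXT record name:", txtOwnershipName(ep))
}
```

Because the two functions consult different sources of truth, the primary record comes out as type A while the TXT name keeps the "cname-" prefix.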

What you expected to happen:

ExternalDNS will create a TXT record with a prefix indicating the same primary record type that the TXT record describes. In this case, since the primary record type created in Route 53 turns out to be A, I expect the TXT record's prefix to be "a-" instead of "cname-."

How to reproduce it (as minimally and precisely as possible):

In a Kubernetes cluster running within AWS EC2, create a Service of type "LoadBalancer," and allow ExternalDNS to discover the endpoint and its target by using either the "service" or "ingress" source.

Inspect the Route 53 service to see that ExternalDNS creates a primary record of type A, as an alias to the target AWS-hosted load balancer. Note too that ExternalDNS creates a TXT record with a prefix of "cname-" instead of "a-."

Anything else we need to know?:

In order to align the record type mentioned by these primary and TXT records, we need to make the TXT registry portion of ExternalDNS aware of the late decision that the AWS provider makes to use an A record instead. I am not sure whether other providers make similar overriding decisions when composing changes.

Environment:

  • External-DNS version: 0.12.0
  • DNS provider: aws (AWS Route 53)
  • Others: Source is Kubernetes Ingress

seh avatar Jul 21 '22 18:07 seh

I also faced this issue. Thank you, @seh, for the report.

doctornkz avatar Aug 10 '22 17:08 doctornkz

What I see is two guard records being produced: one with the same name as the 'A' record and one with a 'cname-' prefix.

chonton avatar Sep 01 '22 00:09 chonton

That's odd. Does your "A" record's name happen to begin with "a-," inducing false aliasing?

seh avatar Sep 01 '22 12:09 seh

Version v0.12.2

Args

      containers:
      - args:
        - --log-level=info
        - --namespace=mis-feature
        - --publish-host-ip
        - --aws-batch-change-size=20
        - --domain-filter=mis.example.com
        - --interval=2m
        - --policy=upsert-only
        - --provider=aws
        - --source=ingress
        - --source=service
        - --registry=txt
        - --txt-owner-id=use-feature

Redacted Kubernetes Resources

---
apiVersion: v1
kind: Service
metadata:
  name: unified-theatre
  annotations:
    external-dns.alpha.kubernetes.io/alias: "true"
    external-dns.alpha.kubernetes.io/hostname: us.example.com
    external-dns.alpha.kubernetes.io/ingress-hostname-source: annotation-only
    external-dns.alpha.kubernetes.io/aws-weight: "255"
    external-dns.alpha.kubernetes.io/set-identifier: us-east-1
spec:
  type: ExternalName
  externalName: use.example.com
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: unified-region
  annotations:
    external-dns.alpha.kubernetes.io/alias: "true"
    external-dns.alpha.kubernetes.io/hostname: use.example.com
    external-dns.alpha.kubernetes.io/ingress-hostname-source: annotation-only

Redacted Route53 records

| Record name | Type | Policy | Weight | Value/Route traffic to |
|---|---|---|---|---|
| cname-us-feature.mis.example.com | TXT | Weighted | 255 | "heritage=external-dns,external-dns/owner=use-feature,external-dns/resource=service/mis-feature/unified-theatre" |
| cname-use-feature.mis.example.com | TXT | Simple | - | "heritage=external-dns,external-dns/owner=use-feature,external-dns/resource=ingress/mis-feature/unified-region" |
| use.feature.mis.example.com | A | Simple | - | 10.93.177.118 |
| us-feature.mis.example.com | A | Weighted | 255 | use-feature.mis.example.com. |
| us-feature.mis.example.com | TXT | Weighted | 255 | "heritage=external-dns,external-dns/owner=use-feature,external-dns/resource=service/mis-feature/unified-theatre" |
| use-feature.mis.example.com | A | Simple | - | internal-k8s-misfeatu-unifiedr-201835383a-1018808261.us-east-1.elb.amazonaws.com. |
| use-feature.mis.example.com | TXT | Simple | - | "heritage=external-dns,external-dns/owner=use-feature,external-dns/resource=ingress/mis-feature/unified-region" |

chonton avatar Sep 01 '22 16:09 chonton

I'm also seeing this, and an additional problem is that when the k8s resource is deleted, the TXT record with prefix "cname-" is not deleted from route53. We have a zone with a large churn of resources and this resulted in reaching the limit on the number of records in the zone.
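For anyone auditing a zone for such leftovers, a rough sketch of the check might look like the following. It assumes the zone's records have already been listed into a slice by some other means. The `Record` type and `orphanedOwnershipRecords` are hypothetical names, not part of ExternalDNS or the AWS SDK.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Record is a hypothetical, minimal view of one DNS record in a zone.
type Record struct {
	Name string
	Type string
}

// orphanedOwnershipRecords returns the names of TXT records that carry the
// "cname-" prefix but whose primary record (any non-TXT record with the
// unprefixed name) no longer exists in the given list.
func orphanedOwnershipRecords(records []Record) []string {
	primaries := make(map[string]bool)
	for _, r := range records {
		if r.Type != "TXT" {
			primaries[r.Name] = true
		}
	}
	var orphans []string
	for _, r := range records {
		if r.Type != "TXT" || !strings.HasPrefix(r.Name, "cname-") {
			continue
		}
		if !primaries[strings.TrimPrefix(r.Name, "cname-")] {
			orphans = append(orphans, r.Name)
		}
	}
	sort.Strings(orphans)
	return orphans
}

func main() {
	zone := []Record{
		{Name: "app.example.com", Type: "A"},
		{Name: "cname-app.example.com", Type: "TXT"},
		{Name: "cname-gone.example.com", Type: "TXT"},
	}
	fmt.Println(orphanedOwnershipRecords(zone))
}
```

The check is purely name-based, so a legitimate hostname that itself begins with "cname-" would be misreported. The result should be reviewed before anything is deleted.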

jwilf avatar Sep 27 '22 13:09 jwilf

I have the same problem. TXT records with prefix "cname-" are not deleted and cause an issue when I try to recreate k8s resources.

nicocout avatar Oct 14 '22 10:10 nicocout

We're seeing similar, but subtly different behavior: external-dns tries to delete cname- prefixed TXT records that were never created, failing the entire change batch and preventing all future updates until we manually intervene (by creating the record it wants to delete).

erikdeweerdt avatar Oct 20 '22 10:10 erikdeweerdt

+1 with the same problem in AWS. External DNS created a lot of entries in Route53 that start with CNAME-{{name}}.local TXT

dalvarezquiroga avatar Nov 14 '22 16:11 dalvarezquiroga

I recently came across this issue. It appears that part of the problem with the creation of erroneous cname- prefixed TXT records has to do with how the plan struct is constructed and then passed to the registry and onward to the cloud provider. The plan comprises create/update/delete arrays, so the actual records have no association with each other as far as the registry or the provider is aware. The changes to the endpoints are read and executed in order, so AWS (or another supporting cloud provider) correctly recognizes a record identified as CNAME as an alias, but the cname- TXT record that the TXTRegistry generated in the prior stage of execution is still created. The registry is aware of the provider, because it has to call the ApplyChanges function as part of its own. Barring a total overhaul of how aliases are handled, I wonder whether it would be possible to call a function from the registry level down to the provider to check for an alias, e.g. AWSProvider's useAlias function?
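One shape such a registry-to-provider call could take is an optional interface that alias-capable providers implement. This is only a sketch of the idea: `AliasAware`, `EffectiveRecordType`, `txtNameFor`, and the `Alias` flag are all hypothetical, and none of them exist in ExternalDNS.

```go
package main

import (
	"fmt"
	"strings"
)

// Endpoint is a simplified, hypothetical stand-in for endpoint.Endpoint.
type Endpoint struct {
	DNSName    string
	RecordType string
	Alias      bool // hypothetical flag standing in for the provider's alias check
}

// AliasAware is a hypothetical optional interface: a provider that may
// rewrite the record type reports the type it will actually create.
type AliasAware interface {
	EffectiveRecordType(ep Endpoint) string
}

// awsLikeProvider imitates a provider that turns alias CNAMEs into A records.
type awsLikeProvider struct{}

func (awsLikeProvider) EffectiveRecordType(ep Endpoint) string {
	if ep.RecordType == "CNAME" && ep.Alias {
		return "A"
	}
	return ep.RecordType
}

// plainProvider imitates a provider that never rewrites record types.
type plainProvider struct{}

// txtNameFor builds the ownership record name, asking the provider for the
// effective record type when the provider supports it.
func txtNameFor(provider interface{}, ep Endpoint) string {
	recordType := ep.RecordType
	if p, ok := provider.(AliasAware); ok {
		recordType = p.EffectiveRecordType(ep)
	}
	return strings.ToLower(recordType) + "-" + ep.DNSName
}

func main() {
	ep := Endpoint{DNSName: "app.example.com", RecordType: "CNAME", Alias: true}
	fmt.Println(txtNameFor(awsLikeProvider{}, ep))
	fmt.Println(txtNameFor(plainProvider{}, ep))
}
```

Making the interface optional would leave providers that never rewrite record types untouched. It does not address ownership records that already exist under the "cname-" prefix.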

Gladdstone avatar Jan 11 '23 21:01 Gladdstone

+1, experiencing the same issue: when creating A (alias) records, the TXT record uses the incorrect prefix (cname instead of a).

Same, highly annoying. I'm having to delete Route53 records on a daily basis for dozens of clusters in order for the controller to properly create all the relevant records and go healthy with "all records are up to date".

jbilliau-rcd avatar Mar 28 '23 17:03 jbilliau-rcd

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Jun 26 '23 17:06 k8s-triage-robot

/remove-lifecycle stale

aaroniscode avatar Jun 26 '23 17:06 aaroniscode

External-dns represents ALIAS records of type A to the planner as Endpoints of type CNAME and a ProviderSpecific attribute with key alias and value true. So it is an expected quirk that the new-format txt registry ownership records have a prefix of cname-. As the installed base has such ownership records, this would take an unreasonable amount of effort to change.
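As a rough illustration of that representation, here is a minimal sketch using simplified stand-in types. These are not the real endpoint package: the type layout and the `isAlias` helper are assumptions, and only the "alias"/"true" key and value come from the description above.

```go
package main

import "fmt"

// ProviderSpecificProperty is a simplified, hypothetical key/value pair
// attached to an endpoint for provider-specific behaviour.
type ProviderSpecificProperty struct {
	Name  string
	Value string
}

// Endpoint is a simplified, hypothetical stand-in for endpoint.Endpoint.
type Endpoint struct {
	DNSName          string
	RecordType       string
	Targets          []string
	ProviderSpecific []ProviderSpecificProperty
}

// isAlias reports whether the endpoint carries the alias=true property.
func isAlias(ep Endpoint) bool {
	for _, p := range ep.ProviderSpecific {
		if p.Name == "alias" && p.Value == "true" {
			return true
		}
	}
	return false
}

func main() {
	ep := Endpoint{
		DNSName:    "app.example.com",
		RecordType: "CNAME", // what the planner and the TXT registry see
		Targets:    []string{"internal-example.us-east-1.elb.amazonaws.com"},
		ProviderSpecific: []ProviderSpecificProperty{
			{Name: "alias", Value: "true"},
		},
	}
	fmt.Printf("planner type=%s alias=%t\n", ep.RecordType, isAlias(ep))
}
```

In this model the record type never changes on the endpoint, so any component that keys on it, including the ownership record name, sees CNAME.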

Problems with deletion would be separate bugs.

johngmyers avatar Jun 28 '23 05:06 johngmyers

So, will there be any fix for that behaviour? I need to pin the tag version (0.11.1-debian-10-r27) because of this.

stefkkkk avatar Dec 09 '23 07:12 stefkkkk

/lifecycle stale

k8s-triage-robot avatar Mar 08 '24 08:03 k8s-triage-robot

/lifecycle rotten

k8s-triage-robot avatar Apr 07 '24 08:04 k8s-triage-robot

/remove-lifecycle rotten

jcogilvie avatar Apr 08 '24 14:04 jcogilvie

any updates?!

stefkkkk avatar Apr 09 '24 09:04 stefkkkk