External DNS tries to create invalid TXT records for wildcard domains

Open OvervCW opened this issue 2 years ago • 9 comments

What happened:

I have a few ingresses with wildcard TLS entries and ExternalDNS seems to create the wrong TXT entries for these:

time="2022-08-01T13:10:44Z" level=info msg="Updating TXT record named 'a-*.example' to '\"heritage=external-dns,external-dns/owner=default,external-dns/resource=ingress/xxx/yyy\"' for Azure DNS zone 'example.com'."
time="2022-08-01T13:10:44Z" level=error msg="Failed to update TXT record named 'a-*.example' to '\"heritage=external-dns,external-dns/owner=default,external-dns/resource=ingress/xxx/yyy\"' for DNS zone 'example.com': dns.RecordSetsClient#CreateOrUpdate: Failure responding to request: StatusCode=400 -- Original Error: autorest/azure: Service returned an error. Status=400 Code=\"BadRequest\" Message=\"The domain name 'a-*.example.example.com' is invalid. The provided record set relative name 'a-*.example' is invalid.\""

What you expected to happen:

Not to see this error.

How to reproduce it (as minimally and precisely as possible):

I have ingresses that look like this:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-dns
    external-dns.alpha.kubernetes.io/ttl: "600"
    kubernetes.io/ingress.class: nginx
  name: yyy
  namespace: xxx
spec:
  rules:
  - host: yyy.example.example.com
    http:
      paths:
      - backend:
          service:
            name: yyy
            port:
              name: http
        path: /
        pathType: Prefix
  tls:
  - hosts:
    - '*.example.example.com'
    secretName: example.example.com-wildcard

Anything else we need to know?:

I can reproduce this problem in version 0.12.2, but not in 0.12.0.

Environment: Kubernetes 1.23.8

  • External-DNS version (use external-dns --version): 0.12.2
  • DNS provider: Azure
  • Others:

OvervCW avatar Aug 01 '22 13:08 OvervCW

My guess is that Azure DNS is simply the stricter provider: it doesn't allow the wildcard character anywhere except the leftmost position. I don't see the problem with a-* TXT records on AWS or GCP. The reason you see it in 0.12.2 and not in 0.12.0 is (again, my guess) that 0.12.2 adds the missing TXT records in the new TXT format. @OvervCW: Did the ingress that gave the error already exist before the migration to 0.12.2 took place?

alebedev87 avatar Aug 01 '22 15:08 alebedev87

@alebedev87 external-dns does some magic for wildcard records. I guess the conversion should include that "magic".

k0da avatar Aug 01 '22 15:08 k0da

@k0da: Yes, there is a flag for this: --txt-wildcard-replacement. It works well with 0.12.2: the * gets replaced with the value given in the flag. As a matter of fact, I thought of suggesting this flag, but first I wanted to confirm that it's just Azure DNS being stricter.
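
For readers unfamiliar with the flag, the renaming can be sketched roughly as follows. This is a minimal Python illustration, not external-dns's actual code (which is written in Go): the function name new_txt_name is hypothetical, and the sketch assumes that no --txt-prefix or --txt-suffix is configured and that only a leftmost label that is exactly * gets replaced.

```python
def new_txt_name(dns_name: str, record_type: str, wildcard_replacement: str = "") -> str:
    """Sketch of the new-format TXT record name for a DNS record.

    Hypothetical helper; assumes no TXT prefix/suffix is configured.
    """
    # Split off the leftmost label, e.g. "*" from "*.example.example.com".
    first, sep, rest = dns_name.partition(".")
    # Assumption: only a leftmost label that is exactly "*" is replaced.
    if wildcard_replacement and first == "*":
        first = wildcard_replacement
    # The new format puts the lowercased record type in front of the first label.
    return f"{record_type.lower()}-{first}{sep}{rest}"


# Without a replacement the asterisk ends up in the middle of a label.
print(new_txt_name("*.example.example.com", "A"))
# With a replacement value the generated name contains no asterisk.
print(new_txt_name("*.example.example.com", "A", "wildcard"))
```

Under these assumptions the first call yields a-*.example.example.com, which matches the record set name in the error log above, while the second yields a-wildcard.example.example.com.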

alebedev87 avatar Aug 01 '22 15:08 alebedev87

@alebedev87 Yeah, the ingress already existed.

> My guess is that Azure DNS is just the stricter provider which doesn't allow the wildcard character to be anywhere except for the leftmost position. I don't see the problem with a-* TXT records on AWS or GCP.

Does the spec actually allow such names or are AWS/GCP overly permissive?

OvervCW avatar Aug 01 '22 16:08 OvervCW

> @k0da : Yes, there is this flag: --txt-wildcard-replacement. It works well with 0.12.2 - * gets replaced with the value given in the flag. As a matter of fact, I thought of suggesting this flag but first I wanted to confirm the fact that it's just Azure DNS being stricter.

makes sense

k0da avatar Aug 01 '22 16:08 k0da

I can confirm that using --txt-wildcard-replacement solved the problem for us. We set it to "wildcard" but I suppose the specific value does not really matter.

I think this should be mentioned as a breaking change for those who use the Azure provider in the changelog for 0.12.0 or 0.12.2.

OvervCW avatar Aug 02 '22 09:08 OvervCW

@OvervCW: nice to see that it worked out for you!

> I think this should be mentioned as a breaking change for those who use the Azure provider in the changelog for 0.12.0 or 0.12.2.

That's not really specific to version 0.12.2; it's a consequence of the new TXT record format introduced in 0.12.0. You could see the same problem with 0.12.0 if external-dns tried to create a new DNS record: it would have to be accompanied by two TXT records, one of which would have the a-* form.
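
To illustrate why only one of the two TXT names is rejected, here is a minimal Python sketch. The helper names ownership_txt_names and has_misplaced_wildcard are hypothetical, no --txt-prefix or --txt-suffix is assumed, and the validity rule (an asterisk is accepted only as the entire leftmost label) is an assumption based on the guess about Azure DNS earlier in this thread, not on Azure's documented validation logic.

```python
def ownership_txt_names(dns_name: str, record_type: str) -> list[str]:
    """Sketch of the two TXT names that accompany one DNS record.

    Hypothetical helper; assumes no TXT prefix/suffix and no wildcard replacement.
    """
    first, sep, rest = dns_name.partition(".")
    legacy = dns_name  # old format: same name as the record itself
    typed = f"{record_type.lower()}-{first}{sep}{rest}"  # new format: type-prefixed
    return [legacy, typed]


def has_misplaced_wildcard(name: str) -> bool:
    """True if '*' appears anywhere other than as the entire leftmost label."""
    labels = name.split(".")
    # A leftmost label that is exactly "*" is an ordinary wildcard; skip it.
    rest = labels[1:] if labels[0] == "*" else labels
    return any("*" in label for label in rest)


for name in ownership_txt_names("*.example.example.com", "A"):
    status = "rejected" if has_misplaced_wildcard(name) else "accepted"
    print(f"{name}: {status}")
```

Under the assumed rule, the old-format name *.example.example.com passes while the new-format name a-*.example.example.com does not, which is consistent with the 400 response shown in the issue description.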

@Raffo: how about adding this to the IMPORTANT NOTE section of the 0.12.0 release? The Azure DNS provider would require the --txt-wildcard-replacement flag starting from 0.12.0.

alebedev87 avatar Aug 02 '22 12:08 alebedev87

Yes, I can do that. Today we merged the PR to get the final images tagged; I will update the release with the new info.

Raffo avatar Aug 02 '22 14:08 Raffo

@Raffo: thanks for the 0.12.2 release!

Just wanted to say that I think the note about --txt-wildcard-replacement would be better added to 0.12.0 instead, as this limitation appears starting from 0.12.0. Anybody who starts from 0.12.0 may face the problem described in this issue with Azure DNS.

alebedev87 avatar Aug 02 '22 20:08 alebedev87

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Oct 31 '22 20:10 k8s-triage-robot

The --txt-wildcard-replacement flag should be given a sane default. I had a similar issue on PowerDNS.

@OvervCW can fix the issue they're experiencing by setting --txt-wildcard-replacement; however, a sane default and documentation for it should be added so that other users do not run into similar issues.

CraigGardener avatar Nov 07 '22 19:11 CraigGardener

/lifecycle rotten

k8s-triage-robot avatar Dec 07 '22 20:12 k8s-triage-robot

/close not-planned

k8s-triage-robot avatar Jan 06 '23 21:01 k8s-triage-robot

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Jan 06 '23 21:01 k8s-ci-robot