New TXT record breaks downward compatibility by retroactively limiting record length

Open mgruener opened this issue 3 years ago • 1 comments

What happened:

The change to the TXT registry introduced in #2157 breaks downward compatibility by retroactively limiting the maximum length of managed records. This means that even with #2811 implemented, there will be cases where an upgrade from a pre-0.12.0 version to 0.12.0 is not possible.

As addressed in #2816, the maximum length of a single label in a record name is 63 characters. This also holds true for the registry TXT record. To avoid CNAME conflicts, it is already necessary to use a prefix or suffix for the TXT records, which limits the first label of a managed record to 62 characters in the best case (and even less if the suffix/prefix is longer than one character).

Adding the record type to the TXT record reduces the maximum length of the managed record's label to 63 - [prefix/suffix] - [record type]. This breaks downward compatibility for every setup that already has records longer than 63 - [prefix/suffix] - [record type], and the user has no option to change this behavior. If such records exist, creating the TXT record that includes the record type will fail.

Even without the whole backwards compatibility issue: #2157 adds a limiting factor that will pretty much seem random to end users, as the implementation results in a situation where for example A records can be longer than CNAME records.
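The arithmetic above can be sketched in a few lines of Python. This is only an illustration, not external-dns code: the function name `max_managed_label_len` is hypothetical, and the assumed layout of the type-aware TXT label (`<type>-` + label + suffix) is taken from the reproduction example in this issue.

```python
MAX_LABEL_LEN = 63  # RFC 1035 limit for a single DNS label


def max_managed_label_len(affix: str, record_type: str = "") -> int:
    """Longest first label of a managed record whose registry TXT label still fits.

    Assumption: the type-aware TXT label is "<type>-" + label + affix,
    as in the reproduction example of this issue.
    """
    type_part = f"{record_type.lower()}-" if record_type else ""
    return MAX_LABEL_LEN - len(affix) - len(type_part)


# Budget with a one-character suffix "-": legacy TXT name, then A and CNAME.
budgets = {rt or "legacy": max_managed_label_len("-", rt) for rt in ("", "A", "CNAME")}
print(budgets)
```

Under these assumptions the budget depends on the record type, which is why an A record can end up being allowed a longer name than a CNAME record.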

What you expected to happen:

external-dns should provide an option to disable the creation of the record type TXT record or (if this is the only remaining TXT record type in the future) an option to disable adding the record type to the registry TXT record.

How to reproduce it (as minimally and precisely as possible):

Use external-dns < 0.12.0 with a single-character txt-suffix (for example "-"), add a CNAME record whose first label has 62 characters (for example "thisisarecordwithareallyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyylongname.some.domain"), then upgrade to external-dns 0.12.0 and trigger a reconcile.

This should result in external-dns trying to create the TXT records "thisisarecordwithareallyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyylongname-.some.domain" and "cname-thisisarecordwithareallyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyylongname-.some.domain", with the latter failing as it violates RFC 1035.
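Before upgrading, it may help to check which existing records would run into this. The following is a minimal sketch, assuming the new TXT name is built as in the example above (`<type>-` + first label + suffix). The helper names `new_txt_name` and `records_that_will_fail` are hypothetical, and a synthetic 62-character label stands in for the long name.

```python
from typing import Dict, List

MAX_LABEL_LEN = 63  # RFC 1035 limit for a single DNS label


def new_txt_name(record: str, record_type: str, suffix: str) -> str:
    """Build the type-aware registry TXT name (assumed layout, see lead-in)."""
    label, _, zone = record.partition(".")
    return f"{record_type.lower()}-{label}{suffix}.{zone}"


def records_that_will_fail(records: Dict[str, str], suffix: str) -> List[str]:
    """Return the record names whose new TXT name contains an over-long label."""
    failing = []
    for name, record_type in records.items():
        txt = new_txt_name(name, record_type, suffix)
        if any(len(label) > MAX_LABEL_LEN for label in txt.split(".")):
            failing.append(name)
    return failing


existing = {
    "a" * 62 + ".some.domain": "CNAME",  # fits today, too long with "cname-"
    "short.some.domain": "CNAME",
}
print(records_that_will_fail(existing, suffix="-"))
```

Only the first entry is reported: its legacy TXT label is exactly 63 characters, while the type-aware one would be 69.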

Anything else we need to know?:

Environment:

  • External-DNS version: v0.12.0
  • DNS provider: AWS
  • Others: TXT registry

mgruener avatar Jun 22 '22 12:06 mgruener

We set txt-prefix to be a subdomain of the record being created (e.g. myprefix.) so that all the metadata is contained in a separate label of the DNS name. Whilst this does still limit the overall length of a domain name, it ensures users can still have labels of up to 63 characters (which is the limit we are more likely to hit, compared with the overall length of 254).

Since the new record type is appended after txt-prefix, the metadata is no longer solely contained in a separate DNS label and now encroaches on the user's limit for their DNS name.

I understand %{record_type} can be used to place the record type in the desired location; however, I don't see a migration path for that.
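To illustrate the difference, here is a simplified model of the two placements. It is an assumption-laden sketch rather than the actual external-dns implementation: `registry_txt_name` and `longest_label` are hypothetical helpers, the record type is assumed to be inserted as `<type>-` directly after a plain prefix, and `%{record_type}` is assumed to be replaced in place when it appears in the prefix.

```python
PLACEHOLDER = "%{record_type}"


def registry_txt_name(record: str, prefix: str, record_type: str) -> str:
    """Model of the registry TXT name for a given txt-prefix (assumed behavior)."""
    rt = record_type.lower()
    if PLACEHOLDER in prefix:
        # The template decides where the record type goes.
        return prefix.replace(PLACEHOLDER, rt) + record
    # Without the template the type lands right after the prefix,
    # i.e. inside the user's own first label when the prefix ends with ".".
    return f"{prefix}{rt}-{record}"


def longest_label(name: str) -> int:
    """Length of the longest label in a DNS name."""
    return max(len(label) for label in name.split("."))


record = "a" * 63 + ".example.com"  # user label at the 63-character limit
plain = registry_txt_name(record, "myprefix.", "CNAME")
templated = registry_txt_name(record, "%{record_type}-myprefix.", "CNAME")
print(longest_label(plain), longest_label(templated))
```

In this model the templated prefix keeps all metadata in its own label, so the user's label stays within the limit. It does not cover renaming TXT records that already exist, which is the missing migration path mentioned above.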

Evesy avatar Aug 04 '22 09:08 Evesy

We were just hit by this too, as the safeties we have in place to make sure our DNS names are not too long are no longer correct due to this new maximum length limitation.

What makes it worse for us is that any error in updating records causes no records to be updated, so it takes just one bad DNS name to break DNS updates for everyone.

Pondidum avatar Aug 29 '22 09:08 Pondidum

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Nov 27 '22 10:11 k8s-triage-robot

We are also experiencing this. Is there any workaround?

cep21 avatar Dec 09 '22 21:12 cep21

/remove-lifecycle stale

cep21 avatar Dec 09 '22 21:12 cep21

/lifecycle stale

k8s-triage-robot avatar Mar 09 '23 22:03 k8s-triage-robot

/lifecycle rotten

k8s-triage-robot avatar Apr 08 '23 22:04 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-triage-robot avatar May 08 '23 22:05 k8s-triage-robot

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar May 08 '23 22:05 k8s-ci-robot

/reopen

jullianow avatar Aug 30 '23 16:08 jullianow

@jullianow: You can't reopen an issue/PR unless you authored it or you are a collaborator.

k8s-ci-robot avatar Aug 30 '23 16:08 k8s-ci-robot

/remove-lifecycle rotten

rodolphobarbosa avatar Aug 31 '23 19:08 rodolphobarbosa

/reopen

rodolphobarbosa avatar Aug 31 '23 19:08 rodolphobarbosa

@rodolphobarbosa: You can't reopen an issue/PR unless you authored it or you are a collaborator.

k8s-ci-robot avatar Aug 31 '23 19:08 k8s-ci-robot

Are there any updates? I've observed that even with external-dns 0.14.0 one bad DNS name still causes external-dns to crash loop.

yurrriq avatar Feb 29 '24 20:02 yurrriq

I still have this problem too, even with the latest version.

jullianow avatar Feb 29 '24 21:02 jullianow

I have the same problem here.

Is it possible to change the prefix to use . instead of -?

For an AWS alias the TXT record could be cname.the-original-domain-name.something.com, and for Azure it could be a.the-original-domain-name.something.com, so that the TXT record has the same domain label as the original record. If the original domain was created successfully, the TXT record would also be created without issue.

Otherwise, in the current state: if the original domain label is 63 characters long, creating the new TXT record fails once the cname- prefix is added. The next time the record changes, external-dns may not be able to recognize the A record and may fail to update it.

wangshu3000 avatar Mar 13 '24 01:03 wangshu3000