
acme-dns: CNAME and DNS Zones cause dns-01 challenge to fail due to bad propagation check

Open a-gerhard opened this issue 2 years ago • 4 comments

Welcome

  • [X] Yes, I'm using a binary release within 2 latest releases.
  • [X] Yes, I've searched similar issues on GitHub and didn't find any.
  • [X] Yes, I've included all information below (version, config, etc).

What did you expect to see?

_acme-challenge.domain.to.verify is a CNAME pointing to acme-dnsuser-account.acmedns.other.dns.provider. acmedns.other.dns.provider has an NS record set to internal.acmedns.other.dns.provider, and at the other-dns-provider name server the A record for internal.acmedns.other.dns.provider is known and points to our acme-dns instance. Using dig to resolve the _acme-challenge TXT records is successful in this setup, both from inside the traefik container (default nameserver) and from my MacBook. We use this setup with acme.sh (in conjunction with the same acme-dns server), and it works perfectly.
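With the delegation described above, the closest enclosing zone of the CNAME target should be acmedns.other.dns.provider rather than other.dns.provider. A rough shell sketch of how one might locate that zone cut by hand follows. The helper names parent_zones and closest_zone_ns are hypothetical and are not part of lego, and this is only a heuristic, not a description of how lego finds zones; it assumes dig is installed.

```shell
#!/bin/sh
# Print the given name and each of its parent domains, most specific first.
parent_zones() {
    name="${1%.}"                       # drop a trailing dot, if any
    while [ -n "$name" ]; do
        printf '%s.\n' "$name"
        case "$name" in
            *.*) name="${name#*.}" ;;   # strip the leftmost label
            *)   name="" ;;             # last label reached
        esac
    done
}

# Walk up from the FQDN and stop at the first name that has NS records.
# That name is treated as the closest zone cut (heuristic only).
# $1 = FQDN, $2 = optional resolver to query (hypothetical parameter).
closest_zone_ns() {
    fqdn="$1"
    resolver="${2:-}"
    for zone in $(parent_zones "$fqdn"); do
        ns=$(dig +short ${resolver:+@"$resolver"} NS "$zone")
        if [ -n "$ns" ]; then
            printf 'zone: %s\n%s\n' "$zone" "$ns"
            return 0
        fi
    done
    return 1
}

# Example (needs network access):
# closest_zone_ns acme-dnsuser-account.acmedns.other.dns.provider
```

If the delegation is set up as described, the zone reported would be expected to be acmedns.other.dns.provider, with the acme-dns instance as its nameserver.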

I would expect lego to work as well.

What did you see instead?

lego doesn't seem to realize that the DNS entry has been propagated. Judging from the error message "time limit exceeded: last error: NS ns3.inwx.eu. did not return the expected TXT record", it seems to be querying the authoritative name server of our second domain (other.dns.provider), rather than the authoritative name server (i.e., our acme-dns server) of the subdomain acmedns.other.dns.provider. Using dig to query the TXT record of acme-dnsuser-account.acmedns.other.dns.provider at this nameserver does indeed fail.
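The dig comparison mentioned above can be sketched as a small shell helper that asks one specific server for the TXT record and checks for the expected value. The function name txt_visible_at is hypothetical, and the server names in the example are the ones quoted in this report; it assumes dig, tr, and grep are available.

```shell
#!/bin/sh
# Succeed (exit 0) if the TXT record for $1 contains the value $2
# when the server $3 is queried directly.
txt_visible_at() {
    fqdn="$1"
    expected="$2"
    server="$3"
    # dig +short prints TXT values wrapped in double quotes; strip them
    # and look for an exact, whole-line match.
    dig +short @"$server" TXT "$fqdn" | tr -d '"' | grep -qxF -- "$expected"
}

# Example (needs network access):
# txt_visible_at acme-dnsuser-account.acmedns.other.dns.provider tokenvalue ns2.inwx.de
# txt_visible_at acme-dnsuser-account.acmedns.other.dns.provider tokenvalue internal.acmedns.other.dns.provider
```

In the setup described here, the first call would be expected to fail and the second to succeed, which matches the behaviour reported in the error message.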

How do you use lego?

Through Traefik

Reproduction steps

  1. Have _acme-challenge domain be a CNAME pointing to a domain registered and served at another provider
  2. Have the CNAME actually point to a zone on that other domain provider that is hosted at a third provider (self-hosted)
  3. The second domain provider does not take into account the fact that the subzone has a different nameserver and tries to serve the TXT record directly but fails

Version of lego

traefik version 2.8.4

Logs

time="2022-09-05T16:21:00Z" level=debug msg="legolog: [INFO] [domain.to.verify] acme: Obtaining bundled SAN certificate"
time="2022-09-05T16:21:00Z" level=debug msg="legolog: [INFO] [domain.to.verify] AuthURL: https://acme-v02.api.letsencrypt.org/acme/authz-v3/150062301727"
time="2022-09-05T16:21:00Z" level=debug msg="legolog: [INFO] [domain.to.verify] acme: Could not find solver for: tls-alpn-01"
time="2022-09-05T16:21:00Z" level=debug msg="legolog: [INFO] [domain.to.verify] acme: Could not find solver for: http-01"
time="2022-09-05T16:21:00Z" level=debug msg="legolog: [INFO] [domain.to.verify] acme: use dns-01 solver"
time="2022-09-05T16:21:00Z" level=debug msg="legolog: [INFO] [domain.to.verify] acme: Preparing to solve DNS-01"
time="2022-09-05T16:21:01Z" level=debug msg="legolog: [INFO] [domain.to.verify] acme: Trying to solve DNS-01"
time="2022-09-05T16:21:01Z" level=debug msg="legolog: [INFO] [domain.to.verify] acme: Checking DNS record propagation using [127.0.0.11:53]"
time="2022-09-05T16:21:03Z" level=debug msg="legolog: [INFO] Wait for propagation [timeout: 1m0s, interval: 2s]"
time="2022-09-05T16:21:03Z" level=debug msg="Delaying 1000000000 rather than validating DNS propagation now." providerName=acmedns.acme
time="2022-09-05T16:21:05Z" level=debug msg="legolog: [INFO] [domain.to.verify] acme: Waiting for DNS record propagation."
time="2022-09-05T16:21:07Z" level=debug msg="Delaying 1000000000 rather than validating DNS propagation now." providerName=acmedns.acme
time="2022-09-05T16:21:08Z" level=debug msg="legolog: [INFO] [domain.to.verify] acme: Waiting for DNS record propagation."
time="2022-09-05T16:21:10Z" level=debug msg="Delaying 1000000000 rather than validating DNS propagation now." providerName=acmedns.acme
time="2022-09-05T16:21:11Z" level=debug msg="legolog: [INFO] [domain.to.verify] acme: Waiting for DNS record propagation."
time="2022-09-05T16:21:13Z" level=debug msg="Delaying 1000000000 rather than validating DNS propagation now." providerName=acmedns.acme
time="2022-09-05T16:21:15Z" level=debug msg="legolog: [INFO] [domain.to.verify] acme: Waiting for DNS record propagation."
time="2022-09-05T16:22:02Z" level=debug msg="legolog: [INFO] [domain.to.verify] acme: Waiting for DNS record propagation."
time="2022-09-05T16:22:04Z" level=debug msg="legolog: [INFO] [domain.to.verify] acme: Cleaning DNS-01 challenge"
time="2022-09-05T16:22:05Z" level=debug msg="legolog: [INFO] Deactivating auth: https://acme-v02.api.letsencrypt.org/acme/authz-v3/150062301727"
time="2022-09-05T16:22:05Z" level=error msg="Unable to obtain ACME certificate for domains \"domain.to.verify\": unable to generate a certificate for the domains [domain.to.verify]: error: one or more domains had a problem:\n[domain.to.verify] time limit exceeded: last error: NS ns2.inwx.de. did not return the expected TXT record [fqdn: acme-dnsuser-account.acmedns.other.dns.provider, value: tokenvalue]: \n" rule="Host(`domain.to.verify`)" ACME CA="https://acme-v02.api.letsencrypt.org/directory" providerName=acmedns.acme routerName=someservice@docker

Go environment (if applicable)

No response

a-gerhard avatar Sep 05 '22 17:09 a-gerhard

I do not think this is a duplicate of #1680. While this one does involve a CNAME as well, in my understanding this is a DNS zone issue; the CNAME seems to be handled correctly.

a-gerhard avatar Sep 05 '22 17:09 a-gerhard

Hello,

Have you enabled the experimental CNAME support (LEGO_EXPERIMENTAL_CNAME_SUPPORT)?

ldez avatar Sep 05 '22 17:09 ldez

Yes, I did it via env in the docker container.

$ docker exec -it traefik sh
/ # echo $LEGO_EXPERIMENTAL_CNAME_SUPPORT
true

a-gerhard avatar Sep 05 '22 18:09 a-gerhard

@a-gerhard if I create a PR, are you able to test it?

ldez avatar Sep 17 '22 16:09 ldez

@ldez I can try

a-gerhard avatar Sep 30 '22 13:09 a-gerhard

The build of lego is very simple: you need git, Go, and make.

I will ping you when the PR is ready.

related to #1119

ldez avatar Sep 30 '22 14:09 ldez

I have the same problem and a similar setup to the one described above (a CNAME that points to a subdomain of the acme-dns server). Setting LEGO_EXPERIMENTAL_CNAME_SUPPORT in the traefik container does not help. The error is: time limit exceeded: last error: NS ns1.astrastudio.de. did not return the expected TXT record

saschaludwig avatar Oct 26 '22 02:10 saschaludwig

I don't know why, but I had to add two things before it worked.

  • Set the environment variable LEGO_EXPERIMENTAL_CNAME_SUPPORT=true (I don't know if this is relevant; what made mine work is the argument below)
  • Add the argument --dns.disable-cp
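For the lego CLI, the two settings above might be combined roughly as follows. This is a dry-run sketch that only prints the command it would run; the helper name lego_cmd, the e-mail address, the acme-dns URL, and the storage path are placeholders invented for illustration, and the acme-dns provider variables shown are assumptions about a typical setup rather than details taken from this thread.

```shell
#!/bin/sh
# Environment read by lego. The values below are placeholders.
export LEGO_EXPERIMENTAL_CNAME_SUPPORT=true
export ACME_DNS_API_BASE="https://acmedns.example.invalid"    # hypothetical acme-dns URL
export ACME_DNS_STORAGE_PATH="/etc/lego/acme-dns.json"        # hypothetical path

# Print (not execute) a lego invocation that skips the local
# propagation check. $1 = account e-mail, $2 = domain.
lego_cmd() {
    printf '%s\n' "lego --email $1 --dns acme-dns --dns.disable-cp --domains $2 run"
}

lego_cmd admin@example.invalid domain.to.verify
```

Note that --dns.disable-cp skips lego's own wait for the TXT record to appear on the authoritative name servers, so the ACME server may be asked to validate before the record is visible everywhere.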

Zen3515 avatar Nov 11 '22 15:11 Zen3515

I'm having the same issue since updating from 4.8.0 to 4.9.0, I can test a possible fix if needed.

rien avatar Nov 14 '22 13:11 rien

> I'm having the same issue since updating from 4.8.0 to 4.9.0, I can test a possible fix if needed.

Hi, I found this temporary and unofficial workaround: https://github.com/go-acme/lego/issues/1754#issuecomment-1322326705

I hope this option will be implemented. I use this environment variable with traefik 2.9.4 too, because it now uses lego 4.9.

This gives me fully working certificate renewal with the latest versions of lego and traefik, without downgrading.

dginhoux avatar Nov 22 '22 07:11 dginhoux

Sorry for the late feedback, I created PR #1847, can someone test it?

ldez avatar Feb 22 '23 14:02 ldez

I think I made a wrong assumption about this issue: I thought it was just a CNAME issue.

FYI, the server used to handle the propagation check is not related to the domain or the NS for this domain; it is set by a global option, --resolvers. Also, acme.sh doesn't check the propagation before asking Let's Encrypt to validate.

So I will revert my fix.

ldez avatar Mar 03 '23 08:03 ldez

This fix does seem to have solved the DNS propagation error on my server though.

rien avatar Mar 03 '23 09:03 rien

I think there is confusion here: this issue is about a specific "DNS provider" called acme-dns (Joohoi’s ACME-DNS) and the local propagation check done by lego itself.

The fix was not related to the propagation check and it didn't have any effect on the propagation check.

ldez avatar Mar 03 '23 11:03 ldez

Ah yes, this seems to be the wrong issue then. My apologies.

rien avatar Mar 03 '23 12:03 rien