letsencrypt-inwx
                                
dns-01 challenge hangs
Hi,
Most of the time the dns-01 challenge hangs for more than a minute, and then I cancel it. When I randomly switch the DNS server in the /etc/letsencrypt-inwx.json config (8.8.8.8, 9.9.9.9, 127.0.0.53), I sometimes get it working.
To understand the problem, which DNS server is letsencrypt using? Shouldn't the hook use the same server?
Ok, maybe I wasn't patient enough. Another try worked after 13 minutes. Is this expected?
Hi, by default letsencrypt-inwx waits until the created TXT record is publicly visible before it exits. To do this, it queries the configured DNS server and checks the result. As far as I know, the Let's Encrypt servers query the authoritative name servers directly, but since doing the same would add unnecessary complexity to this tool, we just use public recursive name servers. Usually this check takes between a few seconds and 2 minutes, but it can take longer if the public name server has cached the result, which is the case if it was queried a moment ago.
You can disable this behavior with the no_dns_check option.
Thanks for the info. If I understand it correctly: wouldn't it be better to have 192.174.68.104 (ns.inwx.de) as the default DNS server instead of 8.8.8.8? Assuming most users don't change their SOA. I've just tried it with this server and it went much faster for me, at 3m11s. Not many statistics yet, but I will create some more certs over the next few days. Still wondering why other servers are so slow for me. (I'm only counting those tries that succeed the first time, to avoid caching.)
Doesn't no_dns_check prevent letsencrypt from trying too early?
wouldn't it be better to have 192.174.68.104 (ns.inwx.de) as the default DNS server instead of 8.8.8.8?
I guess you're right and this would be a better default setting.
Doesn't no_dns_check prevent letsencrypt from trying too early?
Yes, that's what the described procedure tries to accomplish. The alternative would be to enable no_dns_check and specify a wait_interval manually. But since this is very error-prone, the DNS check is the better solution.
Another change that could improve the challenge speed would be to change the time between DNS checks. Currently it starts with a 5-second wait interval and doubles it on every try. (That's not the same thing as the wait_interval configuration option, which specifies an additional interval to wait after everything, including the DNS check, is done.) I guess a static interval of 20 seconds would fit better here.
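To make the difference between the two schedules concrete, here is a rough sketch in Python (not the tool's actual code). It assumes the first check runs immediately and that the interval is multiplied after every failed check; the function name check_offsets and the 600-second timeout are only illustrative.

```python
def check_offsets(initial, multiplier, timeout):
    """Return the times (seconds since start) at which a DNS check would run.

    Assumption: the first check happens at t=0, and after each failed check
    we wait `interval` seconds, then scale the interval by `multiplier`.
    """
    offsets = []
    elapsed = 0.0
    interval = float(initial)
    while elapsed < timeout:
        offsets.append(elapsed)
        elapsed += interval
        interval *= multiplier
    return offsets


# Doubling schedule starting at 5 seconds, hypothetical 600 s timeout.
doubling = check_offsets(initial=5, multiplier=2.0, timeout=600)
# Static 20-second schedule with the same timeout.
static = check_offsets(initial=20, multiplier=1.0, timeout=600)

print(doubling)     # the gaps grow quickly, so late checks are far apart
print(len(static))  # many more, evenly spaced checks
```

With doubling, a record that becomes visible shortly after a late check is only noticed at the next one, which by then can be minutes away. A static interval caps that delay at 20 seconds.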
Ah, I see. I've made this behavior adjustable by adding some options:
no_dns_check: false
dns_check_timeout: 600
dns_check_interval: 20
dns_check_interval_multiplier: 1.0
wait_interval: 5
dns_server: 192.174.68.104
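For anyone wondering how these options interact, this is roughly how I picture the loop, as a Python sketch rather than the real Rust code. The lookup callable is a stand-in for the actual DNS query and is purely hypothetical; the parameter names simply mirror the options above. It only counts the time spent sleeping, so slow queries would stretch the real timeout.

```python
import time


def wait_for_record(lookup, expected, timeout=600.0, interval=20.0,
                    multiplier=1.0, sleep=time.sleep):
    """Poll `lookup()` until it returns a collection containing `expected`.

    `lookup` is a hypothetical callable returning the TXT values currently
    visible at the configured DNS server. Returns True once the record is
    seen, or False when `timeout` seconds of waiting have elapsed.
    """
    waited = 0.0
    while waited < timeout:
        if expected in lookup():
            return True
        sleep(interval)          # dns_check_interval
        waited += interval
        interval *= multiplier   # dns_check_interval_multiplier
    return False


# Usage with a fake lookup that only "sees" the record on the third try.
answers = iter([[], [], ["challenge-token"]])
sleeps = []
found = wait_for_record(lambda: next(answers), "challenge-token",
                        sleep=sleeps.append)
print(found, sleeps)
```

In this picture, wait_interval would be one extra sleep after the function returns, and no_dns_check would skip the loop entirely.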
It makes no difference for me so far. I've just started another challenge and it has already taken 10 minutes. What's interesting is that I can almost instantly see the newly created TXT record via:
dig txt _acme-challenge.DOMAIN @192.174.68.104
But the letsencrypt-inwx binary doesn't see it for 10 minutes, then it fails with a timeout. Strangely, the challenge was successful. I will add some debugging info to the loop to see whether the binary repeatedly doesn't see the record or whether its DNS query is hanging/blocking.
Ok, it is getting really strange. I'm repeatedly running:
dig txt _acme-challenge.DOMAIN @192.174.68.104
And I'm getting no answer. I'm also repeatedly running:
dig txt _acme-challenge.DOMAIN @8.8.8.8
And I'm getting an answer for two tries in a row, then no answer, then an answer again.
Maybe I have a UDP-related issue.
Stats update:
It's not blocking. DNS queries are indeed performed every 20 seconds. But the queries of the binary are never successful until running into timeout.
I've also changed the code to use TCP queries. It makes no difference. I've also seen that successful TCP queries with dig were followed by non-successful ones (No answer, NXDOMAIN).
For some reason the binary almost never successfully queries the DNS record for me, regardless of the configured DNS server. On the other hand, dig is mostly successful.
Ok, it is getting really strange. I'm repeatedly running:
dig txt _acme-challenge.DOMAIN @192.174.68.104
And I'm getting no answer. I'm also repeatedly running:
dig txt _acme-challenge.DOMAIN @8.8.8.8
And I'm getting an answer for two tries in a row, then no answer, then an answer again.
Maybe I have a UDP-related issue.
I think this may be a result of 192.174.68.104 being an anycast ip according to an inwx help document.
DNS queries are indeed performed every 20 seconds. But the queries of the binary are never successful until running into timeout.
Are you sure about that? If the request actually hit a timeout, letsencrypt-inwx would return a non-zero return code and the challenge would fail.
I think this may be a result of 192.174.68.104 being an anycast ip according to an inwx help document.
Ahh interesting, that would explain the inconsistencies. So, it seems that the different dns servers aren't synced yet and that I get different results when my requests get routed to different end servers.
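One idea for coping with answers that flip between present and missing would be to require several positive answers in a row before treating the record as visible. This is only a sketch of that idea in Python, not something the tool does as far as I know; seen_consistently and the threshold of 3 are made up for illustration.

```python
def seen_consistently(results, required=3):
    """Return the index of the check that completes `required` consecutive
    positive answers, or None if that never happens.

    `results` is a sequence of booleans, one per DNS check (True = record
    was returned). A single miss resets the streak.
    """
    streak = 0
    for index, hit in enumerate(results):
        streak = streak + 1 if hit else 0
        if streak >= required:
            return index
    return None


# Pattern similar to what I saw with dig: two hits, a miss, then hits again.
observed = [True, True, False, True, True, True]
print(seen_consistently(observed))
```

This would not guarantee that the Let's Encrypt servers reach an already synced node, but it would at least avoid declaring success on a single lucky answer.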
Are you sure about that? If the request actually hit a timeout, letsencrypt-inwx would return a non-zero return code and the challenge would fail.
Yes, tested again. Certbot notices that the hook returned error code 1, but it doesn't stop the challenge from continuing. The next step is "Waiting for verification...", and since 5 minutes (I've reduced the timeout) have already passed, it succeeds. The return code seems to be ignored in the end. I'm using certbot 0.31.0 and have set the DNS server to ns2.inwx.de in the hope that it is not an anycast address (but that doesn't prevent letsencrypt from being routed to a different end server):
$ certbot certonly -n --agree-tos \
--rsa-key-size 4096 \
--email EMAIL \
--preferred-challenges=dns-01 \
--manual \
--manual-auth-hook /usr/lib/letsencrypt-inwx/certbot-inwx-auth \
--manual-cleanup-hook /usr/lib/letsencrypt-inwx/certbot-inwx-cleanup \
--manual-public-ip-logging-ok \
-d DOMAIN
Saving debug log to /var/log/letsencrypt/letsencrypt.log
Plugins selected: Authenticator manual, Installer None
Obtaining a new certificate
Performing the following challenges:
dns-01 challenge for DOMAIN
Hook command "/usr/lib/letsencrypt-inwx/certbot-inwx-auth" returned error code 1
Error output from certbot-inwx-auth:
[2019-07-07T16:42:15Z INFO  letsencrypt_inwx::cli] Creating TXT record...
[2019-07-07T16:42:15Z INFO  letsencrypt_inwx::cli] Using account ACCOUNT
[2019-07-07T16:42:17Z INFO  letsencrypt_inwx::cli] => done!
[2019-07-07T16:42:17Z INFO  letsencrypt_inwx::cli] Waiting for the dns record to be publicly visible...
[2019-07-07T16:42:17Z INFO  letsencrypt_inwx::cli] Querying dns record at dns server 176.97.158.104...
[2019-07-07T16:42:37Z INFO  letsencrypt_inwx::cli] Querying dns record at dns server 176.97.158.104...
[2019-07-07T16:42:57Z INFO  letsencrypt_inwx::cli] Querying dns record at dns server 176.97.158.104...
[2019-07-07T16:43:17Z INFO  letsencrypt_inwx::cli] Querying dns record at dns server 176.97.158.104...
[2019-07-07T16:43:37Z INFO  letsencrypt_inwx::cli] Querying dns record at dns server 176.97.158.104...
[2019-07-07T16:43:57Z INFO  letsencrypt_inwx::cli] Querying dns record at dns server 176.97.158.104...
[2019-07-07T16:44:17Z INFO  letsencrypt_inwx::cli] Querying dns record at dns server 176.97.158.104...
[2019-07-07T16:44:37Z INFO  letsencrypt_inwx::cli] Querying dns record at dns server 176.97.158.104...
[2019-07-07T16:44:57Z INFO  letsencrypt_inwx::cli] Querying dns record at dns server 176.97.158.104...
[2019-07-07T16:45:17Z INFO  letsencrypt_inwx::cli] Querying dns record at dns server 176.97.158.104...
[2019-07-07T16:45:37Z INFO  letsencrypt_inwx::cli] Querying dns record at dns server 176.97.158.104...
[2019-07-07T16:45:57Z INFO  letsencrypt_inwx::cli] Querying dns record at dns server 176.97.158.104...
[2019-07-07T16:46:17Z INFO  letsencrypt_inwx::cli] Querying dns record at dns server 176.97.158.104...
[2019-07-07T16:46:37Z INFO  letsencrypt_inwx::cli] Querying dns record at dns server 176.97.158.104...
[2019-07-07T16:46:57Z INFO  letsencrypt_inwx::cli] Querying dns record at dns server 176.97.158.104...
[2019-07-07T16:47:17Z ERROR letsencrypt_inwx::cli] => timeout!
Waiting for verification...
Cleaning up challenges
Error output from certbot-inwx-cleanup:
[2019-07-07T16:47:21Z INFO  letsencrypt_inwx::cli] Deleting TXT record...
[2019-07-07T16:47:21Z INFO  letsencrypt_inwx::cli] Using account ACCOUNT
[2019-07-07T16:47:22Z INFO  letsencrypt_inwx::cli] => done!
IMPORTANT NOTES:
 - Congratulations! Your certificate and chain have been saved at:
   /etc/letsencrypt/live/DOMAIN/fullchain.pem
   Your key file has been saved at:
   /etc/letsencrypt/live/DOMAIN/privkey.pem
   Your cert will expire on 2019-10-05. To obtain a new or tweaked
   version of this certificate in the future, simply run certbot
   again. To non-interactively renew *all* of your certificates, run
   "certbot renew"
 - If you like Certbot, please consider supporting our work by:
   Donating to ISRG / Let's Encrypt:   https://letsencrypt.org/donate
   Donating to EFF:                    https://eff.org/donate-le
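To double-check from the log that the queries really go out every 20 seconds (and are not blocking), the timestamps can be compared with a few lines of Python. This is just a quick sketch; the regular expression assumes the log format shown above, and the function names are made up.

```python
import re
from datetime import datetime

# Matches lines like: [2019-07-07T16:42:17Z INFO  letsencrypt_inwx::cli] message
LINE = re.compile(r"^\[(\S+)\s+(\w+)\s+[^\]]+\]\s+(.*)$")


def parse_line(line):
    """Return (timestamp, level, message), or None for non-log lines."""
    match = LINE.match(line.strip())
    if not match:
        return None
    stamp = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%SZ")
    return stamp, match.group(2), match.group(3)


def query_gaps(lines):
    """Return the gaps in seconds between consecutive DNS query log lines."""
    stamps = []
    for line in lines:
        parsed = parse_line(line)
        if parsed and parsed[2].startswith("Querying dns record"):
            stamps.append(parsed[0])
    return [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]


# A few lines copied from the output above.
sample = [
    "[2019-07-07T16:42:17Z INFO  letsencrypt_inwx::cli] Waiting for the dns record to be publicly visible...",
    "[2019-07-07T16:42:17Z INFO  letsencrypt_inwx::cli] Querying dns record at dns server 176.97.158.104...",
    "[2019-07-07T16:42:37Z INFO  letsencrypt_inwx::cli] Querying dns record at dns server 176.97.158.104...",
    "[2019-07-07T16:42:57Z INFO  letsencrypt_inwx::cli] Querying dns record at dns server 176.97.158.104...",
    "[2019-07-07T16:47:17Z ERROR letsencrypt_inwx::cli] => timeout!",
    "Waiting for verification...",
]
print(query_gaps(sample))
```

Evenly spaced gaps mean the individual queries return quickly and simply don't contain the record, rather than hanging.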
Of course... I picked the only other anycast address left :D
I have chosen 46.165.212.97 now and suddenly it works. How can it be that trust-dns isn't working with these two anycast addresses? I've checked if the query gives a ClientError, but it is successful. Maybe the response is differently shaped and the code walking through the response data overlooks it.
Strangely again, dig is no longer working for me with this new address; it gives NXDOMAIN. Maybe there is still a not-yet-synced load balancing cluster behind this address that routes dig requests differently... Or I just have bad luck today.