Keeps incorrect DNS record info on failure, can't renew certs without restart.
With regard to https://github.com/caddy-dns/lego-deprecated/issues/10 and the latest Caddy built with xcaddy:
I had incorrectly set a CNAME for a domain where the target domain and the canonical domain require different API keys.
I corrected the issue by setting the appropriate A record instead.
Hours later (long after the TTL had expired), I visited the domain and still got an SSL error. It seems the incorrect information was cached in Caddy's memory and never reset.
How to reproduce
- Set a wildcard CNAME to a domain that cannot be controlled with the API Key

      CNAME *.target-domain.example canonical-domain.example

- Create the appropriate Caddyfile or JSON config

      *.s.example.com {
          tls {
              issuer acme {
                  dns <provider> <details>
              }
          }
          # ...
      }

- Start caddy and get the cert error about can't set record on canonical-domain.example
- Change the record to `A`, `ANAME`, or `ALIAS` and wait for the TTL to expire

      A *.example.com example.net

- Observe that the cert error remains even though it should get fresh record info
- Restart and observe that the cert is immediately issued as expected
You mean the DNS record information was cached?
That's implemented by Go, which decides whether to use the system resolver or its own resolver. I guess if a process restart "fixes" it, it was using its own resolver which does have its own cache. I imagine it honors TTL though, and if not, perhaps a bug should be filed with Go.
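One way to probe this from outside Caddy is a small standalone Go program that resolves the challenge name with a fresh resolver each time. This is only a sketch under assumptions: `challengeName` and `lookupCNAME` are hypothetical helpers written for illustration, the domain is the placeholder from the reproduction steps, and the challenge name is formed the way the ACME DNS-01 challenge describes (wildcard label dropped, `_acme-challenge.` prepended). It does not reproduce how Caddy, acmez, or lego perform their lookups.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"strings"
	"time"
)

// challengeName returns the DNS-01 validation name for a certificate
// subject: a leading wildcard label is dropped and "_acme-challenge."
// is prepended. (Hypothetical helper for this sketch.)
func challengeName(subject string) string {
	return "_acme-challenge." + strings.TrimPrefix(subject, "*.")
}

// lookupCNAME resolves name with a newly created resolver.
// preferGo=true asks for Go's built-in resolver; preferGo=false leaves
// the choice between the built-in and the system resolver to Go.
func lookupCNAME(name string, preferGo bool) (string, error) {
	r := &net.Resolver{PreferGo: preferGo}
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	return r.LookupCNAME(ctx, name)
}

func main() {
	// Placeholder subject taken from the reproduction steps.
	name := challengeName("*.target-domain.example")
	for _, preferGo := range []bool{true, false} {
		cname, err := lookupCNAME(name, preferGo)
		fmt.Printf("PreferGo=%v name=%s cname=%q err=%v\n", preferGo, name, cname, err)
	}
}
```

If a program like this sees the corrected record while the running Caddy process keeps failing, that would suggest the stale state lives inside the process rather than in DNS. The `net` package documentation also describes a `GODEBUG=netdns=go` or `GODEBUG=netdns=cgo` setting for forcing one resolver or the other.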
I don't know at what level the caching happened; I just know that the certificate process doesn't get the "fresh start" I would have expected, several hours after the TTL would have expired.
What would be a good way to tell if it's in caddy vs acmez vs lego vs Go's DNS?
I could run another test.
Probably strace will tell you the system calls being used -- depending on the platform you'd be looking for different things.
If there's absolutely no indication of a DNS lookup through strace then it's something in Go.
At that point I'd probably reach for a goroutine stack dump, or maybe debug from the outside by fiddling with external inputs like DNS records, network state (online/offline), etc.: whatever surfaces the process might touch that can be wiggled to gain information.