route53
Unable to pass DNS challenge with Caddy 2.8+
The wildcard DNS challenge stopped working after updating to Caddy 2.8.
The minimum reproducible setup:
Caddy config:
{
storage consul {
prefix "caddytls"
}
admin :2019
debug
email [email protected]
}
*.example.com {
log {
format json
}
tls {
dns route53
}
}
Dockerfile:
FROM --platform=linux/amd64 caddy:2-builder-alpine@sha256:cdf3364f8cb02338b857728fdc0a9b8875b343996db347300bf2361db3da9094 AS builder
RUN xcaddy build \
--with github.com/pteich/caddy-tlsconsul \
--with github.com/caddy-dns/route53
FROM --platform=linux/amd64 caddy:2-alpine@sha256:a48e22edad925dc216fd27aa4f04ec49ebdad9b64c9e5a3f1826d0595ef2993c
COPY --from=builder /usr/bin/caddy /usr/bin/caddy
Logs:
{"level":"info","ts":1717682068.1877885,"logger":"tls.obtain","msg":"lock acquired","identifier":"*.example.com"}
{"level":"info","ts":1717682068.1907144,"logger":"tls.obtain","msg":"obtaining certificate","identifier":"*.example.com"}
{"level":"debug","ts":1717682068.1908574,"logger":"events","msg":"event","name":"cert_obtaining","id":"60de8b42-ab04-4b13-9920-03713277aa4a","origin":"tls","data":{"identifier":"*.example.com"}}
{"level":"debug","ts":1717682068.1911874,"logger":"tls.obtain","msg":"trying issuer 1/1","issuer":"acme-v02.api.letsencrypt.org-directory"}
{"level":"debug","ts":1717682068.191264,"logger":"caddy.storage.consul","msg":"loading data from Consul for acme/acme-v02.api.letsencrypt.org-directory/users/[email protected]/caddy.json"}
{"level":"debug","ts":1717682068.1937697,"logger":"caddy.storage.consul","msg":"loading data from Consul for acme/acme-v02.api.letsencrypt.org-directory/users/[email protected]/caddy.key"}
{"level":"info","ts":1717682068.1980238,"logger":"tls.issuance.acme","msg":"waiting on internal rate limiter","identifiers":["*.example.com"],"ca":"https://acme-v02.api.letsencrypt.org/directory","account":"[email protected]"}
{"level":"info","ts":1717682068.198052,"logger":"tls.issuance.acme","msg":"done waiting on internal rate limiter","identifiers":["*.example.com"],"ca":"https://acme-v02.api.letsencrypt.org/directory","account":"[email protected]"}
{"level":"info","ts":1717682068.1981454,"logger":"tls.issuance.acme","msg":"using ACME account","account_id":"https://acme-v02.api.letsencrypt.org/acme/acct/1763210887","account_contact":["mailto:[email protected]"]}
{"level":"debug","ts":1717682068.400449,"logger":"tls.issuance.acme.acme_client","msg":"http request","method":"GET","url":"https://acme-v02.api.letsencrypt.org/directory","headers":{"User-Agent":["Caddy/2.8.4 CertMagic acmez (linux; amd64)"]},"response_headers":{"Cache-Control":["public, max-age=0, no-cache"],"Content-Length":["746"],"Content-Type":["application/json"],"Date":["Thu, 06 Jun 2024 13:54:28 GMT"],"Server":["nginx"],"Strict-Transport-Security":["max-age=604800"],"X-Frame-Options":["DENY"]},"status_code":200}
{"level":"debug","ts":1717682068.400676,"logger":"tls.issuance.acme.acme_client","msg":"creating order","account":"https://acme-v02.api.letsencrypt.org/acme/acct/1763210887","identifiers":["*.example.com"]}
{"level":"debug","ts":1717682068.4561968,"logger":"tls.issuance.acme.acme_client","msg":"http request","method":"HEAD","url":"https://acme-v02.api.letsencrypt.org/acme/new-nonce","headers":{"User-Agent":["Caddy/2.8.4 CertMagic acmez (linux; amd64)"]},"response_headers":{"Cache-Control":["public, max-age=0, no-cache"],"Date":["Thu, 06 Jun 2024 13:54:28 GMT"],"Link":["<https://acme-v02.api.letsencrypt.org/directory>;rel=\"index\""],"Replay-Nonce":["su1caOmbBxQwQu9hLgYH8tMvuXSY0yd8jUjEqWyqAihX7TMZGos"],"Server":["nginx"],"Strict-Transport-Security":["max-age=604800"],"X-Frame-Options":["DENY"]},"status_code":200}
{"level":"debug","ts":1717682068.5403905,"logger":"tls.issuance.acme.acme_client","msg":"http request","method":"POST","url":"https://acme-v02.api.letsencrypt.org/acme/new-order","headers":{"Content-Type":["application/jose+json"],"User-Agent":["Caddy/2.8.4 CertMagic acmez (linux; amd64)"]},"response_headers":{"Boulder-Requester":["1763210887"],"Cache-Control":["public, max-age=0, no-cache"],"Content-Length":["345"],"Content-Type":["application/json"],"Date":["Thu, 06 Jun 2024 13:54:28 GMT"],"Link":["<https://acme-v02.api.letsencrypt.org/directory>;rel=\"index\""],"Location":["https://acme-v02.api.letsencrypt.org/acme/order/1763210887/275980118617"],"Replay-Nonce":["su1caOmb2AuTy7-eFJ7SHv1wOCyVgybSNdoJKeGjNcwOLeTGn7k"],"Server":["nginx"],"Strict-Transport-Security":["max-age=604800"],"X-Frame-Options":["DENY"]},"status_code":201}
{"level":"debug","ts":1717682068.600306,"logger":"tls.issuance.acme.acme_client","msg":"http request","method":"POST","url":"https://acme-v02.api.letsencrypt.org/acme/authz-v3/360439255817","headers":{"Content-Type":["application/jose+json"],"User-Agent":["Caddy/2.8.4 CertMagic acmez (linux; amd64)"]},"response_headers":{"Boulder-Requester":["1763210887"],"Cache-Control":["public, max-age=0, no-cache"],"Content-Length":["391"],"Content-Type":["application/json"],"Date":["Thu, 06 Jun 2024 13:54:28 GMT"],"Link":["<https://acme-v02.api.letsencrypt.org/directory>;rel=\"index\""],"Replay-Nonce":["su1caOmbKx5cQpgNcP62Uc4bXmQr1rpUrDLGB9LmzmTeSj7AokU"],"Server":["nginx"],"Strict-Transport-Security":["max-age=604800"],"X-Frame-Options":["DENY"]},"status_code":200}
{"level":"info","ts":1717682068.6005263,"logger":"tls.issuance.acme.acme_client","msg":"trying to solve challenge","identifier":"*.example.com","challenge_type":"dns-01","ca":"https://acme-v02.api.letsencrypt.org/directory"}
{"level":"error","ts":1717682068.6307743,"logger":"tls.issuance.acme.acme_client","msg":"cleaning up solver","identifier":"*.example.com","challenge_type":"dns-01","error":"no memory of presenting a DNS record for \"_acme-challenge.example.com\" (usually OK if presenting also failed)"}
{"level":"debug","ts":1717682068.6949975,"logger":"tls.issuance.acme.acme_client","msg":"http request","method":"POST","url":"https://acme-v02.api.letsencrypt.org/acme/authz-v3/360439255817","headers":{"Content-Type":["application/jose+json"],"User-Agent":["Caddy/2.8.4 CertMagic acmez (linux; amd64)"]},"response_headers":{"Boulder-Requester":["1763210887"],"Cache-Control":["public, max-age=0, no-cache"],"Content-Length":["395"],"Content-Type":["application/json"],"Date":["Thu, 06 Jun 2024 13:54:28 GMT"],"Link":["<https://acme-v02.api.letsencrypt.org/directory>;rel=\"index\""],"Replay-Nonce":["su1caOmbzRAm8TrBKvAcq-Lm4Xi-o3g5q22uZzpGo6jRk7hundE"],"Server":["nginx"],"Strict-Transport-Security":["max-age=604800"],"X-Frame-Options":["DENY"]},"status_code":200}
{"level":"error","ts":1717682068.696349,"logger":"tls.obtain","msg":"could not get certificate from issuer","identifier":"*.example.com","issuer":"acme-v02.api.letsencrypt.org-directory","error":"[*.example.com] solving challenges: presenting for challenge: adding temporary record for zone \"example.com.\": not found, ResolveEndpointV2 (order=https://acme-v02.api.letsencrypt.org/acme/order/1763210887/275980118617) (ca=https://acme-v02.api.letsencrypt.org/directory)"}
{"level":"debug","ts":1717682068.6964366,"logger":"events","msg":"event","name":"cert_failed","id":"8bf8efb3-0aa5-4e63-8478-33cf4bb9906a","origin":"tls","data":{"error":{},"identifier":"*.example.com","issuers":["acme-v02.api.letsencrypt.org-directory"],"renewal":false}}
{"level":"error","ts":1717682068.6964548,"logger":"tls.obtain","msg":"will retry","error":"[*.example.com] Obtain: [*.example.com] solving challenges: presenting for challenge: adding temporary record for zone \"example.com.\": not found, ResolveEndpointV2 (order=https://acme-v02.api.letsencrypt.org/acme/order/1763210887/275980118617) (ca=https://acme-v02.api.letsencrypt.org/directory)","attempt":1,"retrying_in":60,"elapsed":0.508639999,"max_duration":2592000}
Everything works fine with Caddy 2.7.6.
Any suggestions are appreciated.
Same issue here. I tried re-issuing my AWS keys, but AWS is reporting that they are "not used". I think for some reason it is not presenting the auth.
I am wondering if we just need to bump the Caddy version, since there were so many breaking changes:
https://github.com/caddy-dns/route53/blob/8e49e7546771bf6846e1531dcaff4925af5ddcde/go.mod#L6
It looks like it is related to this issue: https://github.com/libdns/route53/issues/235#issue-2212746183
Which is related to this issue: https://github.com/aws/aws-sdk-go-v2/issues/2370#issuecomment-1953308268
Ran into the same issue with a single individual domain, not a wildcard. The fix from the issue that ryantiger685 linked worked for me. It looks like the PRs in that repository need to be merged to fix this officially.
Edit: Just tested wildcard and that's working with this fix as well.
Just ran into this as well after upgrading Caddy to v2.8.4.
Could you test this with the latest version and wait_for_propagation enabled?
{
"module": "acme",
"challenges": {
"dns": {
"provider": {
"name": "route53",
"wait_for_propagation": true
}
}
}
}
FWIW, I'm using a Dockerfile to build https://github.com/lucaslorentz/caddy-docker-proxy with this plugin, and simply rebuilding the container with the latest release of this plugin and Caddy 2.8.4 was enough to solve the DNS challenge problem described in this thread, although I am not using a wildcard domain. I did not need to use the wait_for_propagation parameter.
> Could you test this with the latest version and wait_for_propagation enabled?
Yes, this works! Just tested with a new domain. Feels good removing all the hacks :)
This may be unrelated but just to note, I did get a new error from Route 53: Invalid Configuration: Missing Region
I just added us-east-1 as the region value and the error went away and everything works! Just thought I'd mention that this parameter may be required now.
~~Ah sorry, I spoke too soon.~~ The normal domain worked but the wildcard domain did not.
{
"level": "error",
"ts": 1719515037.2461495,
"logger": "tls.obtain",
"msg": "will retry",
"error": "[*.stage.foo.bar.com] Obtain: [*.stage.foo.bar.com] solving challenges: presenting for challenge: adding temporary record for zone \"foo.bar.com.\": exceeded max wait time for ResourceRecordSetsChanged waiter (order=https://acme-staging-v02.api.letsencrypt.org/acme/order/152473533/17457386443) (ca=https://acme-staging-v02.api.letsencrypt.org/directory)",
"attempt": 4,
"retrying_in": 300,
"elapsed": 546.902648806,
"max_duration": 2592000
}
Edit:
I manually deleted the TXT record from Route 53, restarted Caddy, and the wildcard domain works! Not sure what happened here the first time but might just have been something on my end.
I saw that these two are the first errors which led me to do the extra troubleshooting:
{
"level": "error",
"ts": 1719514555.4299963,
"logger": "tls.issuance.acme.acme_client",
"msg": "cleaning up solver",
"identifier": "stage.foo.bar.com",
"challenge_type": "dns-01",
"error": "deleting temporary record for name \"foo.bar.com.\" in zone {\"\" \"TXT\" \"_acme-challenge.stage\" \"wEz6Z5Ta1vy5Z9ebcVcfyZTmptaYdfc-QtYRA_wV6Bs\" \"0s\" '\\x00' '\\x00'}: exceeded max wait time for ResourceRecordSetsChanged waiter"
}
{
"level": "error",
"ts": 1719514643.3972101,
"logger": "tls.issuance.acme.acme_client",
"msg": "cleaning up solver",
"identifier": "*.stage.foo.bar.com",
"challenge_type": "dns-01",
"error": "deleting temporary record for name \"foo.bar.com.\" in zone {\"\" \"TXT\" \"_acme-challenge.stage\" \"JvKk2qrEWpbsgvZ06rU1GKc28NKvKAxP_gwc-j1IVGA\" \"0s\" '\\x00' '\\x00'}: operation error Route 53: ChangeResourceRecordSets, https response error StatusCode: 400, RequestID: d4277a4b-bef0-423b-bfef-8e68495ea501, InvalidInput: Invalid XML ; javax.xml.stream.XMLStreamException: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 248; cvc-complex-type.2.4.b: The content of element 'ResourceRecords' is not complete. One of '{\"https://route53.amazonaws.com/doc/2013-04-01/\":ResourceRecord}' is expected."
}
> I just added us-east-1 as the region value and the error went away and everything works! Just thought I'd mention that this parameter may be required now.
fwiw, the plugin can take the value from the AWS_REGION environment variable.
@kdevan The "exceeded max wait time for ResourceRecordSetsChanged waiter" error just means that the default wait time of 1 minute wasn't enough for the records to propagate. You could try increasing it with max_wait_dur.
@aymanbagabas Hi! Just to clarify, we should be setting wait_for_propagation to true when working with wildcard certificates right? Thanks :)
We still get the "exceeded max wait time for ResourceRecordSetsChanged waiter" error, even though we have set wait_for_propagation to "true" and max_wait_dur to 120. Is anyone else still having this issue?
The only way I could get it working with a wildcard certificate was this:
*.mydomain.tld {
tls {
dns route53 {
region "ca-central-1"
wait_for_propagation true
}
}
}
Importantly, setting max_wait_dur to anything other than the default value was not working. And I did need to specify the region... for some reason.
For anyone else having trouble: I finally made this work by removing "wait_for_propagation true" from the Caddyfile, and it worked right away.
tls {
dns route53 {
access_key_id "id"
secret_access_key "password"
region "us-east-1"
}
}
> Importantly, setting max_wait_dur to anything other than the default value was not working.
There was a bug with max_wait_dur always using nanoseconds. With v1.5.1, the value for max_wait_dur is always in seconds.
> And I did need to specify the region... for some reason.
If region is not specified, it will try to load the region from $AWS_REGION as described in https://aws.github.io/aws-sdk-go-v2/docs/configuring-sdk/#specifying-the-aws-region
EDIT: I've updated the readme to indicate that defining AWS_REGION and aws credentials are required
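The fallback order described above can be sketched in Go as follows. This is a simplified illustration only: `resolveRegion` is a hypothetical helper, the error text is made up, and the real AWS SDK consults more sources than the single environment variable shown here.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// resolveRegion (hypothetical helper) prefers an explicitly configured
// region and falls back to the AWS_REGION environment variable.
// getenv is injected so the lookup can be replaced in tests.
func resolveRegion(configured string, getenv func(string) string) (string, error) {
	if configured != "" {
		return configured, nil
	}
	if region := getenv("AWS_REGION"); region != "" {
		return region, nil
	}
	return "", errors.New("missing region")
}

func main() {
	// An empty first argument models a config with no region option set.
	region, err := resolveRegion("", os.Getenv)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("region:", region)
}
```

Running it with AWS_REGION unset takes the error branch, which corresponds to the "Missing Region" failure reported earlier in the thread.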
Amazing, this was unexpected! This new AWS region requirement brought down my whole set of reverse proxies, including my cloud, when the certs needed to be updated. As soon as I saw that region error I came here.
One question: which region? Does it even matter? Do I use the one I see in the AWS console? https://us-east-1.console.aws.amazon.com/ AFAIK Route 53 is not tied to a region, so why is a region needed at all? I set mine to us-east-1. I'm glad this was just my personal network, so being down overnight was not a big issue. I'm not sure how one could find out about a "breaking" change like this beforehand, but it sure would be nice.
The config below is working for me now for wildcards. My IAM credentials are set as environment variables. As others mentioned, sometimes old _acme records don't get cleaned out, so I remove them via the AWS console. If I feel like I need a clean slate (recreate all the certs), I delete all the Caddy settings/certs and restart. At least on Arch, they can be found at /var/lib/caddy
tls <redcat>@gmail.com {
dns route53 {
max_retries 10
region "us-east-1"
wait_for_propagation true
}
resolvers 8.8.8.8 1.1.1.1
}
We have released a beta version that is fully compatible with Caddy 2.10 and the new libdns. It includes improved defaults. Give it a try and feel free to file a new issue. We've also added a note about AWS regions to the README; the region is optional now.
P.S. In some complex cases, multiple retries may be needed to obtain a certificate. Allow approximately 5-7 minutes.