ACME Challenge Failure Against NS1 with image tag v2.11.0

Open JerboaGobi opened this issue 1 year ago • 1 comments

Welcome!

  • [X] Yes, I've searched similar issues on GitHub and didn't find any.
  • [X] Yes, I've searched similar issues on the Traefik community forum and didn't find any.

What did you do?

A few days ago I updated to the latest release, v2.11.0. Yesterday, after revoking the certificate due to key compromise, I cleared the acme.json file to force Traefik to create a new private key and issue new certificates.

What did you see instead?

The logs then showed that PUT requests against NS1 for the _acme-challenge TXT records failed with HTTP 400 codes. I rolled back to image v2.10.7; no other configuration changes were made. The PUT requests succeed on v2.10.7 and certificates are issued as expected. I also tested on v3.0, and the issue is present there as well.

What version of Traefik are you using?

Version:      2.11.0
Codename:     cheddar
Go version:   go1.22.0
Built:        2024-02-12T15:26:45Z
OS/Arch:      linux/amd64

What is your environment & configuration?

traefik:
    image: traefik:v2.11.0
    command:
    - --global.checknewversion=false
    - --global.sendanonymoususage=false
    - --log=true
    - --log.level=debug

    - --accesslog=true
    - --accesslog.filepath=/etc/traefik/logs/access.log
    - --accesslog.filters.statuscodes=100-199,200-203,205-299,300-399,400-499,500-599
    - --accesslog.filters.retryattempts
    - --accesslog.filters.minduration=10ms

    - --entrypoints.http.address=:80
    - --entrypoints.https.address=:443

    - --entrypoints.http.http.redirections.entryPoint.to=https

    - --entryPoints.http.transport.lifeCycle.requestAcceptGraceTimeout=30
    - --entryPoints.https.transport.lifeCycle.requestAcceptGraceTimeout=30

    - --providers.docker.endpoint=tcp://172.129.30.6:2375
    - --providers.docker.exposedbydefault=false
    - --providers.docker.watch=true
    - --providers.docker.constraints=Label(`traefik-internal.instance.enable`,`true`)
    - --providers.file.directory=/etc/traefik/rules
    - --providers.file.watch=true
    - --api=true

    - --certificatesresolvers.letsencrypt.acme.email=${CF_ACME_EMAIL}
    - --certificatesresolvers.letsencrypt.acme.storage=/etc/traefik/acme/acme.json
    - --certificatesresolvers.letsencrypt.acme.dnschallenge=true
    - --certificatesresolvers.letsencrypt.acme.dnschallenge.provider=ns1
    - --certificatesresolvers.letsencrypt.acme.dnschallenge.delaybeforecheck=60
    - --certificatesresolvers.letsencrypt.acme.dnschallenge.resolvers=172.64.36.1:53,172.64.36.2:53
    labels:
    - traefik-internal.instance.enable=true
    - traefik.enable=true

    - traefik.http.routers.traefik-internal.entrypoints=https
    - traefik.http.routers.traefik-internal.rule=Host(`${HOST_NAME}`)
    - traefik.http.routers.traefik-internal.tls=true
    - traefik.http.routers.traefik-internal.service=api@internal

    - traefik.http.routers.traefik-internal.tls.certresolver=letsencrypt
    - traefik.http.routers.traefik-internal.tls.domains[0].main=internal.redacted.com
    - traefik.http.routers.traefik-internal.tls.domains[0].sans=internal.redacted.com, *.internal.redacted.com


    - traefik.tls.stores.default.defaultgeneratedcert.resolver=letsencrypt
    - traefik.tls.stores.default.defaultgeneratedcert.domain.main=internal.redacted.com
    - traefik.tls.stores.default.defaultgeneratedcert.domain.sans=internal.redacted.com, *.internal.redacted.com

    - traefik.http.services.traefik-internal.loadbalancer.server.port=1337

If applicable, please paste the log output in DEBUG level

time="2024-02-16T03:44:38Z" level=error msg="Unable to obtain ACME certificate for domain \"*.internal.redacted.com,internal.redacted.com,internal.redacted.com\"" error="unable to generate a certificate for the domains [*.internal.redacted.com internal.redacted.com internal.redacted.com]: error: one or more domains had a problem:\n[*.internal.redacted.com] [*.internal.redacted.com] acme: error presenting token: ns1: failed to create record [zone: \"internal.redacted.com\", fqdn: \"_acme-challenge.internal.redacted.com.\"]: PUT https://api.nsone.net/v1/zones/internal.redacted.com/_acme-challenge.internal.redacted.com/TXT: 400 Input validation failed (Value None for field '<obj>.tags' is not of type object)\n[internal.redacted.com] [internal.redacted.com] acme: error presenting token: ns1: failed to create record [zone: \"internal.redacted.com\", fqdn: \"_acme-challenge.internal.redacted.com.\"]: PUT https://api.nsone.net/v1/zones/internal.redacted.com/_acme-challenge.internal.redacted.com/TXT: 400 Input validation failed (Value None for field '<obj>.tags' is not of type object)\n" ACME CA="https://acme-staging-v02.api.letsencrypt.org/directory" tlsStoreName=default providerName=letsencrypt.acme

Feb 16 '24 15:02 JerboaGobi

Hello,

It's related to a breaking change introduced by NS1 (https://github.com/ns1/ns1-go/pull/220). It was introduced in a bugfix release of their API client, which is not semver compliant, and without any documentation of the change.

I will fix the problem inside lego and then update lego inside Traefik.
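
For readers wondering how a client-side change can produce `Value None for field '<obj>.tags' is not of type object`, here is a minimal Go sketch of one plausible mechanism. It assumes the request body carried `"tags": null` because a nil map was serialized without `omitempty`; that is an inference from the error text, not a confirmed description of the linked PR. The `recordStrict` and `recordLenient` types are hypothetical and are not the actual ns1-go structs.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// recordStrict is a hypothetical payload type (NOT the real ns1-go struct).
// Without omitempty, a nil Tags map is serialized as JSON null.
type recordStrict struct {
	Domain string            `json:"domain"`
	Type   string            `json:"type"`
	Tags   map[string]string `json:"tags"`
}

// recordLenient is also hypothetical. It differs only in omitempty,
// which drops a nil or empty Tags map from the output entirely.
type recordLenient struct {
	Domain string            `json:"domain"`
	Type   string            `json:"type"`
	Tags   map[string]string `json:"tags,omitempty"`
}

// encode marshals v to JSON and returns it as a string.
func encode(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	const fqdn = "_acme-challenge.internal.redacted.com"

	// Nil map, no omitempty: "tags" is null, which a validator
	// expecting an object would reject.
	fmt.Println(encode(recordStrict{Domain: fqdn, Type: "TXT"}))

	// Initialized (empty) map: "tags" is an empty object.
	fmt.Println(encode(recordStrict{Domain: fqdn, Type: "TXT", Tags: map[string]string{}}))

	// omitempty: the "tags" key is not sent at all.
	fmt.Println(encode(recordLenient{Domain: fqdn, Type: "TXT"}))
}
```

Under that assumption, a caller can avoid the null either by initializing the map before sending or by having the field omitted. Which of these the actual fix uses is not stated in this thread.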

Feb 16 '24 17:02 ldez

I found this issue because I also had ACME problems with 2.11.0: it does not start renewing expired certificates. There are no logs on either Traefik or the ACME server (StepCA). I thought I'd share it here. I reverted to 2.10.7, restarted the container, and renewal started immediately.

Feb 29 '24 08:02 sooslaca

Closed by #10508.

Mar 11 '24 08:03 traefiker

We are still having this issue with version 2.11.0. Anyone else?

Mar 22 '24 13:03 stickeraugust

The fix was merged after v2.11.0; it will be available in v2.11.1.

Mar 22 '24 14:03 ldez