[Bug]: renew not working with reverse-proxy and mtls=false

Open juju4 opened this issue 11 months ago • 18 comments

Steps to Reproduce

I set up a certificate server (step 0.28.2 on Ubuntu 24.04.1) behind an nginx reverse proxy. Issuing certificates works fine through either the step service or nginx. Renewal, however, fails when the ca-url points at nginx (port 443), although it works when accessing the step service directly (port 8443). I used "mtls=false"; "--mtls false", as described in the docs, did not work with step-cli for me. There does not seem to be a debug/verbose option to find out where the invalid character comes from; it is probably the start of an HTML page.

From cert-renewer systemd unit

ExecStart=/usr/bin/step ca renew --ca-url=https://certs.internal --root=/usr/share/ca-certificates/stepca-internal-roots.pem --mtls false --force ${CERT_LOCATION} ${KEY_LOCATION} (code=exited, status=1/FAILURE)
# results in
Jan 08 21:50:43 myhost.internal step[3935]: too many positional arguments were provided in 'step ca renew <crt-file> <key-file>'

Manual testing

root@myhost:~# /usr/bin/step ca renew --ca-url=https://certs.internal --root=/usr/share/ca-certificates/stepca-internal-roots.pem --mtls false /etc/ssl/certs/myhost.crt /etc/ssl/private/myhost.key
too many positional arguments were provided in 'step ca renew <crt-file> <key-file>'
root@myhost:~# /usr/bin/step ca renew --ca-url=https://certs.internal --root=/usr/share/ca-certificates/stepca-internal-roots.pem /etc/ssl/certs/myhost.crt /etc/ssl/private/myhost.key
error renewing certificate: failed decoding CA error response: invalid character '<' looking for beginning of value
root@myhost:~# /usr/bin/step ca renew --mtls false --ca-url=https://certs.internal --root=/usr/share/ca-certificates/stepca-internal-roots.pem /etc/ssl/certs/myhost.crt /etc/ssl/private/myhost.key
too many positional arguments were provided in 'step ca renew <crt-file> <key-file>'
root@myhost:~# /usr/bin/step ca renew --ca-url=https://certs.internal --root=/usr/share/ca-certificates/stepca-internal-roots.pem /etc/ssl/certs/myhost.crt /etc/ssl/private/myhost.key
error renewing certificate: failed decoding CA error response: invalid character '<' looking for beginning of value

Thanks

Your Environment

  • OS - Ubuntu 24.04.1
  • step CLI Version - 0.28.2

Expected Behavior

Renewal to work

Actual Behavior

Renewal fails

Additional Context

No response

Contributing

Vote on this issue by adding a 👍 reaction. To contribute a fix for this issue, leave a comment (and link to your pull request, if you've opened one already).

juju4 avatar Jan 12 '25 22:01 juju4

Hey @juju4,

Can you try --mtls=false in your manual testing (and cert-renewer systemd unit)? I believe the --mtls false gets interpreted differently than you expect.
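The difference between `--mtls false` and `--mtls=false` can be illustrated with Go's standard `flag` package, whose boolean flags never consume the following argument. This is only an analogy under stated assumptions: the stack traces in this thread show that step is built on urfave/cli, and the `parse` helper, the flag set name, and the default value of `true` below are hypothetical, chosen for illustration.

```go
package main

import (
	"flag"
	"fmt"
)

// parse mimics a command with one boolean flag followed by positional
// arguments. The default of true is an assumption made for this sketch.
func parse(args []string) (mtls bool, positional []string, err error) {
	fs := flag.NewFlagSet("renew", flag.ContinueOnError)
	fs.BoolVar(&mtls, "mtls", true, "use mTLS (hypothetical default)")
	if err = fs.Parse(args); err != nil {
		return false, nil, err
	}
	// Everything after the first non-flag argument is positional.
	return mtls, fs.Args(), nil
}

func main() {
	for _, args := range [][]string{
		{"--mtls", "false", "crt", "key"}, // "false" becomes a positional argument
		{"--mtls=false", "crt", "key"},    // "false" is bound to the flag
	} {
		mtls, pos, err := parse(args)
		fmt.Println(args, "-> mtls:", mtls, "positional:", pos, "err:", err)
	}
}
```

With the space-separated form the flag stays `true` and three positional arguments remain, which is consistent with the "too many positional arguments" message reported above. With `=` the flag is `false` and only the two file paths remain.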

As for the < character: it's possible that your proxy is serving an error page when trying to upstream to the CA server. I suppose because those invocations don't have --mtls=false, they actually are terminated by the mTLS endpoint, and a TLS error is returned. So I think if you provide --mtls=false, you won't get that error message. Alternatively, you could try inspecting the HTML in a browser.
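To show where a message like "invalid character '<' looking for beginning of value" comes from, here is a minimal sketch using Go's `encoding/json`. It does not reproduce step's actual decoding code; the `caError` type and the sample bodies are hypothetical stand-ins, shaped after the JSON error responses quoted elsewhere in this thread.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// caError is a hypothetical stand-in for a JSON error body such as
// {"status":400,"message":"..."}.
type caError struct {
	Status  int    `json:"status"`
	Message string `json:"message"`
}

// decode tries to parse a response body as a JSON error object.
func decode(body string) (caError, error) {
	var e caError
	err := json.Unmarshal([]byte(body), &e)
	return e, err
}

func main() {
	// A JSON body decodes without error.
	e, err := decode(`{"status":400,"message":"missing client certificate"}`)
	fmt.Println("JSON body:", e.Status, err)

	// An HTML error page, as a proxy might serve, fails at the first byte.
	_, err = decode("<html><body>400 Bad Request</body></html>")
	fmt.Println("HTML body:", err)
}
```

The second call fails because the decoder expects a JSON value and finds `<` instead, which suggests that the body was produced by something other than the CA's JSON API.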

hslatman avatar Jan 13 '25 11:01 hslatman

One location where --mtls false was being mentioned was updated in this PR: https://github.com/smallstep/docs/pull/374.

hslatman avatar Jan 14 '25 21:01 hslatman

I did use mtls=false too; I forgot to include it in the list. It returns the same error: "error renewing certificate: failed decoding CA error response: invalid character '<' looking for beginning of value".

I would like to debug with curl, since /renew needs a POST, but I don't know the required syntax. In a browser, /renew returns a blank page with an HTTP 405 status code, while for the attempts above the server logs a 400 status code.

# curl -X POST https://certs.internal/renew
{"status":400,"message":"The request could not be completed: missing client certificate."}
# curl -X POST https://certs.internal/renew -d @/etc/ssl/certs/myhost.crt 
{"status":400,"message":"The request could not be completed: missing client certificate."}

juju4 avatar Jan 19 '25 21:01 juju4

Can you try it with GODEBUG=http2debug=2 step ca renew ...? This will output the HTTP communication, and should let you inspect the response.

It's possible to use curl, but you would need to obtain a token first, and that'll involve a few more steps.

hslatman avatar Jan 20 '25 23:01 hslatman

# journalctl -u cert-renewer@nginx -l --since yesterday
[...]
Jan 24 13:32:18 MYHOST step[50147]: certificate does not need renewal
Jan 24 13:32:18 MYHOST systemd[1]: cert-renewer@nginx.service: Skipped due to 'exec-condition'.
Jan 24 13:32:18 MYHOST systemd[1]: Condition check resulted in cert-renewer@nginx.service - Certificate renewer for nginx being skipped.
Jan 24 13:49:10 MYHOST systemd[1]: Starting cert-renewer@nginx.service - Certificate renewer for nginx...
Jan 24 13:49:10 MYHOST step[58981]: failed decoding CA error response: invalid character '<' looking for beginning of value
Jan 24 13:49:10 MYHOST step[58981]: error renewing certificate
Jan 24 13:49:10 MYHOST step[58981]: github.com/smallstep/cli/command/ca.(*renewer).Renew
Jan 24 13:49:10 MYHOST step[58981]:         github.com/smallstep/cli/command/ca/renew.go:474
Jan 24 13:49:10 MYHOST step[58981]: github.com/smallstep/cli/command/ca.renewCertificateAction
Jan 24 13:49:10 MYHOST step[58981]:         github.com/smallstep/cli/command/ca/renew.go:331
Jan 24 13:49:10 MYHOST step[58981]: github.com/smallstep/cli/command/ca.renewCertificateCommand.ActionFunc.func1
Jan 24 13:49:10 MYHOST step[58981]:         github.com/smallstep/[email protected]/command/command.go:38
Jan 24 13:49:10 MYHOST step[58981]: github.com/urfave/cli.HandleAction
Jan 24 13:49:10 MYHOST step[58981]:         github.com/urfave/[email protected]/app.go:522
Jan 24 13:49:10 MYHOST step[58981]: github.com/urfave/cli.Command.Run
Jan 24 13:49:10 MYHOST step[58981]:         github.com/urfave/[email protected]/command.go:175
Jan 24 13:49:10 MYHOST step[58981]: github.com/urfave/cli.(*App).RunAsSubcommand
Jan 24 13:49:10 MYHOST step[58981]:         github.com/urfave/[email protected]/app.go:405
Jan 24 13:49:10 MYHOST step[58981]: github.com/urfave/cli.Command.startApp
Jan 24 13:49:10 MYHOST step[58981]:         github.com/urfave/[email protected]/command.go:380
Jan 24 13:49:10 MYHOST step[58981]: github.com/urfave/cli.Command.Run
Jan 24 13:49:10 MYHOST step[58981]:         github.com/urfave/[email protected]/command.go:103
Jan 24 13:49:10 MYHOST step[58981]: github.com/urfave/cli.(*App).Run
Jan 24 13:49:10 MYHOST step[58981]:         github.com/urfave/[email protected]/app.go:277
Jan 24 13:49:10 MYHOST step[58981]: main.main
Jan 24 13:49:10 MYHOST step[58981]:         ./main.go:73
Jan 24 13:49:10 MYHOST step[58981]: runtime.main
Jan 24 13:49:10 MYHOST step[58981]:         runtime/proc.go:272
Jan 24 13:49:10 MYHOST step[58981]: runtime.goexit
Jan 24 13:49:10 MYHOST step[58981]:         runtime/asm_amd64.s:1700
Jan 24 13:49:10 MYHOST systemd[1]: cert-renewer@nginx.service: Main process exited, code=exited, status=1/FAILURE
Jan 24 13:49:10 MYHOST systemd[1]: cert-renewer@nginx.service: Failed with result 'exit-code'.
Jan 24 13:49:10 MYHOST systemd[1]: Failed to start cert-renewer@nginx.service - Certificate renewer for nginx.
Jan 24 14:03:05 MYHOST systemd[1]: Starting cert-renewer@nginx.service - Certificate renewer for nginx...
Jan 24 14:03:06 MYHOST step[3035]: failed decoding CA error response: invalid character '<' looking for beginning of value
Jan 24 14:03:06 MYHOST step[3035]: error renewing certificate
Jan 24 14:03:06 MYHOST step[3035]: github.com/smallstep/cli/command/ca.(*renewer).Renew
[...]
# cat /etc/systemd/system/cert-renewer\@nginx.service.d/override.conf 
[Service]
; `Environment=` overrides are applied per environment variable. This line does not
; affect any other variables set in the service template.
Environment=CERT_LOCATION=/etc/ssl/certs/MYHOST.crt \
            KEY_LOCATION=/etc/ssl/private/MYHOST.key \
            STEPDEBUG=1 \
            GODEBUG=http2debug=2

WorkingDirectory=/etc/ssl

; Restart service after the certificate is successfully renewed.
ExecStartPost=/usr/bin/systemctl restart nginx.service

Even with the extra trace, I don't see where the error comes from.

juju4 avatar Jan 26 '25 19:01 juju4

Can you try that with the manual invocation instead?

hslatman avatar Jan 26 '25 22:01 hslatman

same

# export CERT_LOCATION=/etc/ssl/certs/MYHOST.internal.crt KEY_LOCATION=/etc/ssl/private/MYHOST.internal.key STEPDEBUG=1 GODEBUG=http2debug=2
# /usr/bin/step ca renew --ca-url=https://certs.internal --root=/usr/share/ca-certificates/stepca-internal-roots.pem --mtls=false --force  ${CERT_LOCATION} ${KEY_LOCATION}
failed decoding CA error response: invalid character '<' looking for beginning of value
error renewing certificate
github.com/smallstep/cli/command/ca.(*renewer).Renew
	github.com/smallstep/cli/command/ca/renew.go:474
github.com/smallstep/cli/command/ca.renewCertificateAction
	github.com/smallstep/cli/command/ca/renew.go:331
github.com/smallstep/cli/command/ca.renewCertificateCommand.ActionFunc.func1
	github.com/smallstep/[email protected]/command/command.go:38
github.com/urfave/cli.HandleAction
	github.com/urfave/[email protected]/app.go:522
github.com/urfave/cli.Command.Run
	github.com/urfave/[email protected]/command.go:175
github.com/urfave/cli.(*App).RunAsSubcommand
	github.com/urfave/[email protected]/app.go:405
github.com/urfave/cli.Command.startApp
	github.com/urfave/[email protected]/command.go:380
github.com/urfave/cli.Command.Run
	github.com/urfave/[email protected]/command.go:103
github.com/urfave/cli.(*App).Run
	github.com/urfave/[email protected]/app.go:277
main.main
	./main.go:73
runtime.main
	runtime/proc.go:272
runtime.goexit
	runtime/asm_amd64.s:1700

juju4 avatar Jan 26 '25 22:01 juju4

Interesting. Seeing the same behavior, specifically, no HTTP debug output using GODEBUG. Haven't found out why yet, though. Maybe it doesn't try HTTP2 at all.

I think for the test you could maybe just try a POST request to /renew using curl.
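As an alternative to curl, a POST probe that prints the status, negotiated protocol, content type, and the first bytes of the body could show directly whether HTML is coming back and whether HTTP/2 is in use. The sketch below is not part of step; `probe`, `probeResult`, and `fakeProxy` are hypothetical names, and the local test server only simulates a proxy answering with an HTML error page. Against a real deployment the client would also need to trust the CA root and use the real URL.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// probeResult holds the parts of a response that matter for debugging.
type probeResult struct {
	Status      int
	Proto       string
	ContentType string
	BodyStart   string
}

// probe sends a small POST and captures at most 200 bytes of the body.
func probe(client *http.Client, url string) (probeResult, error) {
	resp, err := client.Post(url, "application/json", strings.NewReader("{}"))
	if err != nil {
		return probeResult{}, err
	}
	defer resp.Body.Close()
	head, err := io.ReadAll(io.LimitReader(resp.Body, 200))
	if err != nil {
		return probeResult{}, err
	}
	return probeResult{
		Status:      resp.StatusCode,
		Proto:       resp.Proto,
		ContentType: resp.Header.Get("Content-Type"),
		BodyStart:   string(head),
	}, nil
}

// fakeProxy simulates a reverse proxy that replies with an HTML error page.
func fakeProxy() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "text/html")
		w.WriteHeader(http.StatusBadRequest)
		io.WriteString(w, "<html><body>400 Bad Request</body></html>")
	}))
}

func main() {
	srv := fakeProxy()
	defer srv.Close()
	res, err := probe(srv.Client(), srv.URL+"/renew")
	fmt.Printf("%+v err=%v\n", res, err)
}
```

A `text/html` content type or a body starting with `<` would indicate that the reply was generated by the proxy rather than by the CA's JSON API.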

hslatman avatar Jan 26 '25 23:01 hslatman

# curl -X POST https://certs.internal/renew
{"status":400,"message":"The request could not be completed: missing client certificate."}
# curl -X POST https://certs.internal/renew -d @/etc/ssl/certs/myhost.crt 
{"status":400,"message":"The request could not be completed: missing client certificate."}

How would that differ from these?

juju4 avatar Jan 26 '25 23:01 juju4

Try it with curl -H "Authorization: Bearer invalid" -X POST https://certs.internal/renew

hslatman avatar Jan 26 '25 23:01 hslatman

# curl -H "Authorization: Bearer invalid" -X POST https://certs.internal/renew
{"status":401,"message":"error validating renew token"}

juju4 avatar Jan 27 '25 02:01 juju4

Any other ideas? It works fine with direct access :(

juju4 avatar Feb 09 '25 20:02 juju4

Does your nginx instance trust the CA root certificate?

hslatman avatar Feb 10 '25 11:02 hslatman

The root CA has been added to ca-certificates on both the server and the clients. Or do you mean something else?

juju4 avatar Feb 16 '25 19:02 juju4

I tried to add some extra verbosity in func Renew before errors.Wrap (https://github.com/smallstep/cli/blob/master/command/ca/renew.go#L483), but resp is nil, so there is nothing beyond what is already returned in err. On the nginx side of the CA server, I only get "POST /renew HTTP/1.1" 400. The request does not seem to reach the step-ca server, as there are no matching logs on its side. I tried reducing nginx to the most basic config while keeping HTTPS, but nothing changed.

juju4 avatar Mar 16 '25 21:03 juju4

> I tried to add some extra verbosity in func Renew before errors.Wrap (https://github.com/smallstep/cli/blob/master/command/ca/renew.go#L483), but resp is nil, so there is nothing beyond what is already returned in err. On the nginx side of the CA server, I only get "POST /renew HTTP/1.1" 400. The request does not seem to reach the step-ca server, as there are no matching logs on its side. I tried reducing nginx to the most basic config while keeping HTTPS, but nothing changed.

If the CA does not log anything from Nginx, then I suspect it's a trust issue between Nginx and step-ca. Have you tried the proxy_ssl_trusted_certificate directive pointing to the root of your CA? We have a few notes on running step-ca behind a proxy here: https://smallstep.com/docs/step-ca/certificate-authority-server-production/#proxying-step-ca-traffic.

hslatman avatar Mar 17 '25 10:03 hslatman

I have read this page many times and did so again. My nginx conf is https://github.com/juju4/ansible-role-smallstep-ca/blob/master/templates/nginx.conf.j2#L31. I tried proxy_ssl_trusted_certificate in the past and again now, but it does not seem to change anything. The nginx cert is from step/ca and is always obtained directly (otherwise there is a chicken-and-egg problem). I have not tried the stream module. I will, but it is not really a setup I would want, as it gives no logs.

juju4 avatar Mar 30 '25 21:03 juju4

I'm getting the same thing as reported in this comment using Smallstep CLI/0.28.7 (linux/amd64)

The TL;DR was that the certificate I was trying to renew was issued by Let's Encrypt. Once I moved that out of the way and issued a new cert from my CA server, I was able to renew properly. I hope these details help others who might find themselves in a similar situation, and may contribute to an improved error message or documentation.

/usr/bin/step ca renew --force ${CERT_LOCATION} ${KEY_LOCATION} yields the "The request could not be completed: missing client certificate." error.

/usr/bin/step ca renew --force --mtls=false ${CERT_LOCATION} ${KEY_LOCATION} yields "error validating renew token"

curl -H "Authorization: Bearer invalid" -X POST https://mypkiserver/renew yields the same error: {"status":401,"message":"error validating renew token"}

Here's the stacktrace:

github.com/smallstep/cli/command/ca.(*renewer).Renew
        github.com/smallstep/cli/command/ca/renew.go:484
github.com/smallstep/cli/command/ca.renewCertificateAction
        github.com/smallstep/cli/command/ca/renew.go:341
github.com/smallstep/cli/command/ca.renewCertificateCommand.ActionFunc.func1
        github.com/smallstep/[email protected]/command/command.go:38
github.com/urfave/cli.HandleAction
        github.com/urfave/[email protected]/app.go:522
github.com/urfave/cli.Command.Run
        github.com/urfave/[email protected]/command.go:175
github.com/urfave/cli.(*App).RunAsSubcommand
        github.com/urfave/[email protected]/app.go:405
github.com/urfave/cli.Command.startApp
        github.com/urfave/[email protected]/command.go:380
github.com/urfave/cli.Command.Run
        github.com/urfave/[email protected]/command.go:103
github.com/urfave/cli.(*App).Run
        github.com/urfave/[email protected]/app.go:277
github.com/smallstep/cli/internal/cmd.run
        github.com/smallstep/cli/internal/cmd/root.go:62
github.com/smallstep/cli/internal/cmd.Run
        github.com/smallstep/cli/internal/cmd/root.go:47
main.main
        github.com/smallstep/cli/cmd/step/main.go:28
runtime.main
        runtime/proc.go:283
runtime.goexit  
        runtime/asm_amd64.s:1700

The root CA is installed on the client who is trying to renew their cert. I'm able to confirm that this is working properly by running curl and confirming that it doesn't throw any certificate errors.

I believe this is an issue on the client side because all my other clients seem to be able to continue to renew their certs (either that, or I'm going to get an absolute avalanche of email about failed certificates soon... 😬)

What clued me in to the error was the bizarre "missing client certificate" error. This is a TLS certificate, so I would expect it to use the ACME cert renewal process, which should only require that the client can respond to a challenge sent by the PKI server. (I later realized that's only for issuing the cert, not for renewing it.) This error message would make sense if I were renewing an SSH user cert and didn't have the required x509 certificate, but that's not what I'm trying to do here.

This curiosity led me to take a look at the existing cert with openssl x509 -text -noout -in ... and I found that a certificate from Let's Encrypt (LE) had been copied over there. So I was trying to renew the cert issued by LE, and I thought, gee, that might be the problem? So I moved that out of the way, manually ran step ca certificate $(hostname -f) ${CERT_LOCATION} ${KEY_LOCATION} --issuer=acme -f --not-after=8760h --kty=EC --curve=P-384 and then I found my initial renewal command worked properly.
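The issuer check described above can also be done programmatically before attempting a renewal. The following is a minimal sketch using Go's `crypto/x509`; `issuerCN` and `demoCert` are hypothetical helpers, and the throwaway self-signed certificate exists only so the example runs without files on disk. In practice the PEM bytes would be read from the certificate path.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"errors"
	"fmt"
	"math/big"
	"time"
)

// issuerCN returns the issuer common name of the first PEM certificate.
func issuerCN(certPEM []byte) (string, error) {
	block, _ := pem.Decode(certPEM)
	if block == nil || block.Type != "CERTIFICATE" {
		return "", errors.New("no PEM certificate found")
	}
	cert, err := x509.ParseCertificate(block.Bytes)
	if err != nil {
		return "", err
	}
	return cert.Issuer.CommonName, nil
}

// demoCert builds a throwaway self-signed certificate for the demo only.
func demoCert(cn string) []byte {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		panic(err)
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: cn},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(time.Hour),
	}
	// Self-signed: the template is its own parent, so issuer == subject.
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		panic(err)
	}
	return pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der})
}

func main() {
	cn, err := issuerCN(demoCert("Example Internal CA"))
	fmt.Println("issuer:", cn, "err:", err)
}
```

Comparing the returned name against the expected CA would flag a certificate from a different issuer, such as the Let's Encrypt one in this case, before the renewal request is sent.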

The root cause of this problem for me was a script that periodically copied a cert, which another machine was pulling from LE, over to this internal server. I'm now moving to certs issued by my private CA for internal services, and I just forgot to disable the old script. So that's how this happened to me, and why I'm confident that it won't happen again.

hax0rbana-adam avatar Oct 11 '25 18:10 hax0rbana-adam