ingress-nginx
ingress-nginx copied to clipboard
ACME challenge fails with `nginx.ingress.kubernetes.io/permanent-redirect`
What happened:
Error: Error accepting authorization: acme: authorization error for our-domain.ch: 403 urn:ietf:params:acme:error:unauthorized: During secondary validation: <ip>: Invalid response from http://our-domain.ch/.well-known/acme-challenge/NoqmY7zPrO4y7EQB9r6gmlgQUMXFJg9GNtum9FgYn8s: 403
What you expected to happen:
I would expect the ingress to configure nginx in such a way that it doesn't redirect on accessing http://our-domain.ch/.well-known/acme-challenge/NoqmY7zPrO4y7EQB9r6gmlgQUMXFJg9GNtum9FgYn8s
, but this case is handled specially, so that the acme challenge is successful...
This http://our-domain.ch/.well-known/acme-challenge/NoqmY7zPrO4y7EQB9r6gmlgQUMXFJg9GNtum9FgYn8s
redirects to https://foo.com
after the failed acme challenge, but likely also during the acme challenge.
As we are on a managed kubernetes cluster, I can not take a look at the created nginx config, or the cluster-issuer
configuration, but I strongly suggest this problem lays in the wrong handling of the permanent-redirect
by the ingress nginx, as other certificates in our cluster are created without problem.
NGINX Ingress controller version (exec into the pod and run nginx-ingress-controller --version.):
helm.sh/chart=ingress-nginx-4.9.1
Kubernetes version (use kubectl version
):
WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short. Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.3", GitCommit:"25b4e43193bcda6c7328a6d147b1fb73a33f1598", GitTreeState:"clean", BuildDate:"2023-06-14T09:53:42Z", GoVersion:"go1.20.5", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v5.0.1
Server Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.15+rke2r1", GitCommit:"da6089da4974a0a180c226c9353e1921fa3c248a", GitTreeState:"clean", BuildDate:"2023-10-18T16:31:24Z", GoVersion:"go1.20.10 X:boringcrypto", Compiler:"gc", Platform:"linux/amd64"}
WARNING: version difference between client (1.27) and server (1.25) exceeds the supported minor version skew of +/-1
Environment:
-
Cloud provider or hardware configuration: not sure if I can give that
-
OS (e.g. from /etc/os-release): my local system
-
Kernel (e.g.
uname -a
): 22.04.4 LTS (Jammy Jellyfish) -
Install tools: managed, don't know
-
Basic cluster related info:
-
kubectl version
: See above -
kubectl get nodes -o wide
: Can not give that because of IPs.
-
-
How was the ingress-nginx-controller installed: Probably helm, don't know.
-
Current State of the controller: Not sure, if I'm allowed to share this info.
-
Current state of ingress object, if applicable: Not sure, if I'm allowed to share this info.
-
Others:
kubectl --namespace=rest-ingress describe challenge
Name: ingress-cert-ppzsw-363926471-2456233727
Namespace: rest-ingress
Labels: <none>
Annotations: <none>
API Version: acme.cert-manager.io/v1
Kind: Challenge
Metadata:
Creation Timestamp: 2024-04-25T16:29:12Z
Finalizers:
finalizer.acme.cert-manager.io
Generation: 1
Owner References:
API Version: acme.cert-manager.io/v1
Block Owner Deletion: true
Controller: true
Kind: Order
Name: ingress-cert-ppzsw-363926471
UID: 537efb1f-9b4d-4de8-be46-6e278a6f0a82
Resource Version: 345155008
UID: ed3f5cf8-e2f8-49d7-a902-a3b2f9d38a64
Spec:
Authorization URL: https://acme-v02.api.letsencrypt.org/acme/authz-v3/342936176147
Dns Name: our-domain.ch
Issuer Ref:
Group: cert-manager.io
Kind: ClusterIssuer
Name: letsencrypt-prod
Key: QHVv0l6NXBnzHCNTqmkaI9OyYJ3gIQyEXMmyjGE8U9M.Y4nXKJHjF4S08EPIIlBf0KSxnwzNB9ws_Oa1FD-00nI
Solver:
http01:
Ingress:
Pod Template:
Metadata:
Spec:
Tolerations:
Operator: Exists
Token: QHVv0l6NXBnzHCNTqmkaI9OyYJ3gIQyEXMmyjGE8U9M
Type: HTTP-01
URL: https://acme-v02.api.letsencrypt.org/acme/chall-v3/342936176147/oc5jsA
Wildcard: false
Status:
Presented: false
Processing: false
Reason: Error accepting authorization: acme: authorization error for ourdomain.ch: 403 urn:ietf:params:acme:error:unauthorized: During secondary validation: 85.131.147.68: Invalid response from http://our-domain.ch /.well-known/acme-challenge/QHVv0l6NXBnzHCNTqmkaI9OyYJ3gIQyEXMmyjGE8U9M: 403
State: invalid
Events: <none>
How to reproduce this issue:
Apply a minimal ingress like this:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: rest-ingress
namespace: our-rest-ingress
annotations:
cert-manager.io/cluster-issuer: letsencrypt-prod
acme.cert-manager.io/http01-edit-in-place: "false" # Error with or without
cert-manager.io/issue-temporary-certificate: "true" # Error with or without
nginx.ingress.kubernetes.io/permanent-redirect: https://foo.com
spec:
ingressClassName: nginx
tls:
- hosts:
- our-domain.ch
secretName: ingress-cert
rules:
- host: our-domain.ch
Is there something wrong with the configuration?
This issue is currently awaiting triage.
If Ingress contributors determines this is a relevant issue, they will accept it by applying the triage/accepted
label and provide further guidance.
The triage/accepted
label can be added by org members by writing /triage accepted
in a comment.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
- cert-manager works for tons of users already so this requires more detail
- the template of a new bug report asks questions that collect info for readers to look at data to base comments on
- look at the template and answer the questions in md format
/remove-kind bug /kind support /triage needs-information
cert-manager works for tons of users already so this requires more detail
The question is still if the redirect is expected behavior and as others said in the linked issue, they expect different behavior, as well as I do, so I just think it's a bug.
the template of a new bug report asks questions that collect info for readers to look at data to base comments on
Yeah, sry, force of habit to empty the template on issue creation. I added all info I "could", meaning some details especially regarding setup I con not provide, because we are managed, also not sure if I can just expose or IPs.
I think the issue is general and not a specific version bug, but again could be layer 8 here.
look at the template and answer the questions in md format
Done that.
@longwuyuan Can you or some of your team just try the minimal configuration?
I think it could be fastly discovered that the given behaviour exist in newer kubernetes/ingress configuration...
Thanks for the info. Your thought process is a bit clearer now, but in my opinion, its not super helpful for all readers. Just my opinion and others will opine differently. Basically the info that helps readers is ;
- k get all -n ingress-nginx
- k -n ingress-nginx describe po $ingress-nginx-controller
- k -n ingress-nginx logs $ingress-nginx-controller-podname
- k -n $appnamespace get svc,ing
- k -n $appnamespace describe ing $ingname
- Your complete curl command as used for real with a -v
- Other such info
Redacted info also works as there is some insight on the state of the resources and the log of the events. But since its redirection issue, it helps if you cognitively replace FQDNs as appropriate with real behaviour, so context is retained without the real-hostname used.
I am assuming that you want to use permanent-redirect
annotation, but you want it to NOT do a blanket-redirection of all the requests, that match that specific ingress rule. You want that annotation to do selective intelligent redirection by NOT-redirecting acme traffic, because you want the cert, but you want it to redirect everything else. Is this a fair view of what you expect. And you suspect this is a bug ?
@longwuyuan Thanks for you feedback
You want that annotation to do selective intelligent redirection by NOT-redirecting acme traffic, because you want the cert, but you want it to redirect everything else. Is this a fair view of what you expect. And you suspect this is a bug ?
Exactly, this was already requested in https://github.com/kubernetes/ingress-nginx/issues/6853:
Most likely, this could be solved by ensuring that a redirect is not added if the location starts with ".well-known/acme-challenge".
A workaround was suggested by @danton721
I have done it a bit differently if it is still valid for discussion:
nginx.ingress.kubernetes.io/server-snippet: |
if ($request_uri !~* ^/.well-known) {
return 301 $scheme://my.site.com$request_uri;
}
AFAIK, without this, for HTTPS, there can not be created a valid certificate, so browsers will not redirect traffic when accessing https://our-domain.ch
, but complain about the missing certificate.
I would consider this a "bug" if the user has to use the .well-known
workaround, at least it should be documented for nginx.ingress.kubernetes.io/permanent-redirect
, that is doesn't work for HTTPS.
ok. that info is helpful.
Can you explain this paradox that first you create a ingress with a specific hostname. And then instead of routing traffic to a backend pod behind that FQDN, you want to do seemingly contradictory things simultaneously on that rule being matched. Why generate a certificate for a FQDN when you are going to permanently-redirect from there with a 30X response.
you want to do seemingly contradictory things simultaneously on that rule being matched.
That is a valid question (I will try a solution by DNS later on, but I think this will not work):
- We deploy tenant websites like
<tenant>.our-domain.ch
in a custom namespace for each tenant, within such a tenant namespaces there is an ingress routing requests to<tenant>.our-domain.ch
to the backend service of that tenant. For each tenant ingress, we create a certificate using the cert manager. - For the "main" domain
our-domain.ch
we want to redirect the traffic to our company homepage (the target of the permanent redirect)https://foo.com
, as currently accessingour-domain.ch
without a tenant gives404
. This works when accessinghttp://our-domain.ch
, but when accessinghttps://our-domain.ch
the browser is complaining about missing certificate, as we only have auto certificates for each subdomain, not a single wildcard certificate for all subdomains.
Does that clarify our case?
As long as there is not valid certificate for the "main" domain our-domain.ch
, we can not redirect traffic from it, when accessing https://our-domain.ch
.
But if you ask yourself why we want this, shouldn't you ask yourself why the nginx.ingress.kubernetes.io/permanent-redirect
annotation exists in the first place?
It seems like you question the very existence of the annotation you maintain...
Thanks for the information. Its very helpful in understanding your use case.
-
The target of asking you questions is to ascertain actionable items in short and long-term
-
I think there is some fairness to expect that permanent-redirect annotation should redirect selectively so that it can integrate with cert-manager
-
But annotations in the eingress-nginx controller are implementations around ingress and hence HTTP & HTTPS protocols
-
So annotations in the controller can not practically integrate with every single HTTP/HTTPS related software out there. It is practically impossible to implement and maintain such integration
-
Also the general approach is dedicated focus on different parts of the ecosystem so its less efficient that one software does everything
Please wait for comments from otners. Maybe someone has a practical solution to selectively do redirection
I think there is some fairness to expect that permanent-redirect annotation should redirect selectively so that it can integrate with cert-manager
Thanks for acknowledgement.
can not practically integrate with every single HTTP/HTTPS related software out there.
Fair point, I wouldn't expect this to have a Prio...
@longwuyuan Can we maybe hint that nginx.ingress.kubernetes.io/permanent-redirect
doesn't work out of the box with HTTPS in the docs?
The summary is ;
-
upstream nginx offers returning code 301 and that is the function of this annotation
-
some people can potentially use nginx
if
and do selective redirection but that translates to a snippet in the controller -
this controller does not integrate with another popular & critical tool called external-dnsfor another feature i.e. internal+external LB together in one install. The controller fails to return the internal-LB FQDN to external-dns. this is another example of the difficulty in external tool integrations
-
selective redirection while responding with a 301 is corner-case use not widely demanded by multitude of users and there is acute shortage of developer-time here so the likely solution in future is the gateway-api
-
if the cert-manager team wants to put a PR to NOT route the ACME url for this annotation, the PR will get reviewed but some user has to take up the initiative to talk to cert-manager team and follow it up etc
-
Your suggestion for docs improvement is fair if the content is informative about the use-case and does not imply that selective redirection is a standard that this annotation is deviating from. Please submit PR
-
Such integrations with other tools are desirable & most welcome as PRs but the effort to develop it to maturity and then testing/maintenance requires massive resources that are just not available in the project
/remove-triage needs-information
@renepupil Disabling http01-edit-in-place
(default) should work, because cert-manager will generate a new ingress with a more specific path instead of editing your existing ingress. This more specific path takes precedence and does not contain the redirect annotation.
You said
This http://our-domain.ch/.well-known/acme-challenge/NoqmY7zPrO4y7EQB9r6gmlgQUMXFJg9GNtum9FgYn8s redirects to https://foo.com after the failed acme challenge, but likely also during the acme challenge.
This is not a fair assumption because cert-manager may have removed the ingress by the time you tested this after the failed acme challenge. If cert-manager has already removed the acme challenge ingress due to failures, only the redirect ingress remains so of course the acme challenge url would redirect to foo.com.
It's interesting that the error mentions 403
as the response code, while entirely possible that your corporate website returns Forbidden for .well-known
I would rather expect 404
instead.
Have you confirmed that the acme challenge actually hits your corporate website?
Have you confirmed that the acme challenge actually hits your corporate website?
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: redirect-to-another
namespace: our-redirects
annotations:
cert-manager.io/cluster-issuer: letsencrypt-prod
acme.cert-manager.io/http01-ingress-class: nginx
acme.cert-manager.io/http01-edit-in-place: "false"
nginx.ingress.kubernetes.io/permanent-redirect: https://another-domain.ch
nginx.ingress.kubernetes.io/from-to-www-redirect: "false"
spec:
ingressClassName: nginx
tls:
- hosts:
- our-domain.ch
secretName: ingress-cert
rules:
- host: our-domain.ch
So, we found out that our WAF was blocking the acme challenge cause Let's encrypt uses google services that we now allow. The ingress has an event, that the certificate could get created successfully:
Normal CreateCertificate 21s cert-manager-ingress-shim Successfully created Certificate "ingress-cert"
But when calling https://our-domain.ch
in Chrome, I still get NET::ERR_CERT_COMMON_NAME_INVALID
. (Redirect works from non https page http://our-domain.ch
.
Subject: *.our-domain.ch Issuer: R3 Expires on: Jul 8, 2024 Current date: May 8, 2024
Maybe it's the wildcard in the certificate, but I would have expected the certificate creation to create a certificate that works without subdomain...
@rouke-broersma Any idea what the problem could be?