cert-manager
Allowing skipping HTTP01 and DNS01 self-check on a per-solver basis
Is your feature request related to a problem? Please describe. This somewhat intersects with #863: on NAT'd networks cert-manager can't successfully complete the ACME self-check, because some local k8s providers refuse to set up hairpin NAT.
Describe the solution you'd like An environment variable or command-line flag to skip the local self-check and leave the responsibility to the user.
Describe alternatives you've considered Nothing
Environment details (if applicable):
- Kubernetes version (Any):
- Cloud-provider/provisioner (Cheap local providers):
- Install method (Helm):
/kind feature
This has been discussed before and we've avoided allowing it as we need some way to ensure that the challenge has propagated.
For DNS01, options like --dns01-recursive-nameservers and --dns01-recursive-nameservers-only help users in DNS-restricted environments that use DNS01.
I wonder if we can provide some other means to allow you to complete the self check without disabling it altogether? i.e. by overriding the server that we query for challenges?
/priority awaiting-more-evidence /help
Yeah, a custom server for the queries definitely makes sense. You did a perfect job with the DNS01 flags, but I think this could be more confusing for HTTP01. For me, it would be fine to add a flag that looks like --http01-external-address=10.0.0.10:80 or something like that, where the user can set an alternate service that proxies the request to the cluster's public IP.
But in that scenario the user could point it at a locally created service with endpoints configured to answer the challenges, which is de facto the same as disabling the check altogether.
For me, though, this behaviour is perfectly fine.
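To make the idea of an overridden self-check address concrete, here is a minimal Python sketch. It is not how cert-manager implements its check (cert-manager is written in Go), and the names challenge_target, self_check and override_addr are made up for illustration. The point it shows: the TCP connection goes to the override address, while the Host header and path stay those of the real domain.

```python
import http.client


def challenge_target(domain, token, override_addr=None):
    """Work out where to connect and what to ask for.

    override_addr is a hypothetical "host:port" string (IPv4 or hostname),
    in the spirit of the flag suggested above. Returns a tuple of
    (connect_host, connect_port, path, host_header).
    """
    path = "/.well-known/acme-challenge/" + token
    if override_addr:
        host, _, port = override_addr.rpartition(":")
        return host, int(port), path, domain
    return domain, 80, path, domain


def self_check(domain, token, expected_key, override_addr=None, timeout=5.0):
    """Return True if the challenge endpoint serves expected_key."""
    host, port, path, host_header = challenge_target(domain, token, override_addr)
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        # Passing an explicit Host header keeps the request routable by the
        # ingress even though we connected to a different address.
        conn.request("GET", path, headers={"Host": host_header})
        resp = conn.getresponse()
        body = resp.read().decode("utf-8", "replace").strip()
        return resp.status == 200 and body == expected_key
    except (OSError, http.client.HTTPException):
        return False
    finally:
        conn.close()
```

As noted above, nothing stops the override from pointing at a service that always answers correctly, so a check like this only proves that the chosen address serves the challenge, not that the public path works.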
I would like to disable the self-check, too. We have a k8s cluster with different inbound gateways and NAT, and we can't hairpin the external DNS name to the correct internal IP in every scenario for every domain name. As a result, internal checks against the external IP time out, while requests from certbot's servers can read the challenge without problems.
We are also having this issue. The HTTP01 self-check can be bypassed with hairpin NAT or an external split-horizon DNS, but in some cases this can be a real pain (such as when bootstrapping a system; see the Rancher 2.0 HA install, where you need an external LB).
I have the same issue: I cannot use cert-manager because of the self-check. My router does not support hairpinning.
/remove-lifecycle stale
This would be a great feature. We are experiencing self-check failures due to our DNS policy of only allowing internal DNS servers for internal lookups. The self-check is the only thing preventing the challenge from completing.
/remove-lifecycle rotten
+1
+1
+1
Given we now better handle backing off when an Order fails, I think we could consider adding this as an option on the ACME solver.
Logically, it seems it'd make sense to make this an option that applies to both DNS01 and HTTP01 solvers.
If someone wants to give implementing this a go, please drop a comment here first so we can firm up the design details 😄
/cc @JoshVanL
/area acme /area api
In our case, the node running the Nginx Ingress controller is somehow able to reach the HTTP01 endpoint.
So we used podAffinity to schedule cert-manager on the same node, and that solved the issue.
Any progress on this? We are running kubernetes in a cloud provider that does not support hairpinning. Without this feature we couldn't deploy cert-manager successfully.
The root problem is in Kubernetes networking when you use a LoadBalancer provided by the hosting provider. I use DigitalOcean. Kubernetes does not route in-cluster traffic through the LB's public interface, so the PROXY protocol header (or SSL, if you set it up outside Kubernetes) is never added. I use PROXY protocol: once I enable it and update Nginx to handle it, everything works, but cert-manager fails because it tries to connect to the public domain name and that connection fails. It works from my computer, because I am outside the cluster and the LB adds the needed headers, but not from within the cluster.
cert-manager is not to blame for this, but if we could add some switches that tell the validator to send the PROXY protocol header, instead of disabling validation for that domain, it would help some of us a lot.
For curl if I do (from inside the cluster):
curl -I https://myhost.domain.com
it fails.
If I do (from inside the cluster):
curl -I https://myhost.domain.com --haproxy-protocol
it works.
Check this: https://github.com/jetstack/cert-manager/issues/863
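For readers unfamiliar with what curl's --haproxy-protocol flag does on the wire, here is a rough Python sketch, assuming PROXY protocol version 1 (the human-readable text form). The function names are made up for illustration, and this is not something cert-manager is known to offer. The client sends one extra text line before the HTTP request, describing the source and destination of the connection.

```python
import ipaddress
import socket


def proxy_v1_header(src_ip, src_port, dst_ip, dst_port):
    """Build a PROXY protocol v1 line such as
    b"PROXY TCP4 192.168.0.1 192.168.0.11 56324 443\r\n"."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    if src.version != dst.version:
        raise ValueError("source and destination must use the same IP version")
    family = "TCP4" if src.version == 4 else "TCP6"
    line = f"PROXY {family} {src} {dst} {src_port} {dst_port}\r\n"
    return line.encode("ascii")


def get_with_proxy_header(connect_host, connect_port, host_header, path, timeout=5.0):
    """Plain-HTTP GET preceded by a PROXY v1 line; returns the raw response bytes."""
    with socket.create_connection((connect_host, connect_port), timeout=timeout) as sock:
        src_ip, src_port = sock.getsockname()[:2]
        dst_ip, dst_port = sock.getpeername()[:2]
        # The PROXY line must be the very first bytes on the connection.
        sock.sendall(proxy_v1_header(src_ip, src_port, dst_ip, dst_port))
        request = f"GET {path} HTTP/1.1\r\nHost: {host_header}\r\nConnection: close\r\n\r\n"
        sock.sendall(request.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)
```

A backend that expects PROXY protocol will typically reject connections that lack this line, which would be consistent with the plain curl call above failing from inside the cluster.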
I was informed by the DigitalOcean team that there is a fix for this behavior. They added an additional annotation for the nginx-ingress controller service that forces Kubernetes to use the domain name of the public IP instead of the IP itself. That tricks Kubernetes into thinking the address is not "ours", so it routes the traffic out through the LB.
https://github.com/digitalocean/digitalocean-cloud-controller-manager/blob/master/docs/controllers/services/examples/README.md#accessing-pods-over-a-managed-load-balancer-from-inside-the-cluster This is it: (I just added this one)
kind: Service
apiVersion: v1
metadata:
  name: nginx-ingress-controller
  annotations:
    service.beta.kubernetes.io/do-loadbalancer-hostname: "hello.example.com"
Hello, I want to bump this issue. My home cluster is behind NAT, and hairpinning is not possible with my current router.
From outside, the ingress ports are fully available and working, but from inside they are not.
I get this error: Waiting for http-01 challenge propagation: failed to perform self check GET request 'http://domain/.well-known/acme-challenge/ACME': Get http://domain/.well-known/acme-challenge/ACME: dial tcp 109.173.40.107:80: connect: connection timed out
The link is available via the internal address (for example, if I test from my PC).
Is there any way to specify an address for the self-check, or to just disable the self-check?
It would also be nice if it could be disabled on a per-certificate (request) basis.
I have an issue with MetalLB + externalTrafficPolicy: Local, where the cert-manager validator cannot access the solver since it's running on a different node than the "proxy" forwarding the requests to the solver.
Any thoughts on this?
I have the same issue as @MatthiasLohr. I recently introduced MetalLB to our cluster and I wasn't expecting certificate requests to stop working.
Does anyone know any workarounds for this?
Note: I'd prefer to keep the self-check; it feels like a good thing to have. Maybe we could specify an IP address or Kubernetes Service that should be used instead? This would work for me, for example:
curl -H "Host: master.my-site.com.stage.example.com" nginx-external.ingress.svc.cluster.local/.well-known/acme-challenge/UQEly9jJVXURz9ggFx_6Ckrc4OKT0uBBMUr-3oDsvDA
But that assumes that all my certificate requests for that solver go through the same Ingress controller, of course.
EDIT: I assume it's overly complex (and something we don't want to do here) to look at the IP address, see if it matches a loadBalancerIP in the cluster, and if it does, use the clusterIP instead?
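The lookup described in the EDIT above can be sketched in a few lines of Python. This is purely illustrative: ServiceInfo and internal_target are made-up names, and a real implementation would have to read Services from the Kubernetes API instead of a hard-coded list.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class ServiceInfo:
    """Hypothetical, trimmed-down view of a Kubernetes Service."""
    name: str
    cluster_ip: str
    load_balancer_ips: Tuple[str, ...]


def internal_target(resolved_ip: str, services: Iterable[ServiceInfo]) -> str:
    """If resolved_ip is a load balancer IP of a known Service, return that
    Service's clusterIP; otherwise return resolved_ip unchanged."""
    for svc in services:
        if resolved_ip in svc.load_balancer_ips:
            return svc.cluster_ip
    return resolved_ip


services = [
    ServiceInfo("nginx-external", "10.96.12.34", ("203.0.113.10",)),
]
# The challenge domain resolves to the load balancer IP, so the self-check
# would be sent to the clusterIP instead.
target = internal_target("203.0.113.10", services)
```

The IP addresses here are documentation placeholders. The trade-off is the same as with any override: the check no longer exercises the external path that the ACME server will actually use.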
Anton has volunteered to put a design document together for this feature! A big thank you - it'll be great to get input on this document once it's ready from those that require this feature! 😄
/assign @anton-johansson
What does that mean? Any ETA, when this feature will be available?
@MatthiasLohr I'm currently working on a design document where we can decide the best solution. It'll be up shortly.
Thank you! Would be really nice to have this feature as soon as possible, currently the last thing required for a production setup... Trying a lot of workarounds but nothing is really reliable.
I'll do my best to get this included in the v0.15.0 release.
Awesome, thanks! If I can help somehow, please let me know.
@MatthiasLohr @WhitePhoera I am in exactly the same situation. To work around it, I created an internal DNS zone with an entry matching the cert and pointed it to the IP address managed by MetalLB. It's by no means a long-term solution, but at least the certificate validated.
I cannot find this feature in the helm chart options of v0.15.0? @anton-johansson did you implement the feature with the latest release?