Docs improvement: please add this hint to the clusterissuer/cert-manager setup guides

Open theyo-tester opened this issue 1 year ago • 0 comments

Is your feature request related to a problem?

I was just not able to issue and sign a certificate, even though everything seemed to work as expected. I spent more than a day hunting for the cause and almost gave up. I was already filling in a new bug report because I could not find a relevant hint in the logs. While writing down everything I knew about the issue, I found a log entry that, after a search, led me to the cause and the solution.

At some point in the past I specified an additional domain under 'Network->Edit Global Configuration->Additional Domains', not knowing what the consequences would be.

At first glance this setting seems innocent, but it messes up certificate issuance by cert-manager big time!

The ?-mark gives some hint about this, but here is the clear explanation: Additional Domains land in /etc/resolv.conf as search domains. IF you have a wildcard entry for that domain, the resolver appends the search domain to hostnames such as acme-v02.api.letsencrypt.org (from "https://acme-v02.api.letsencrypt.org/directory"), the wildcard matches the resulting name, and the lookup resolves to the localhost (which is the websecure entrypoint of traefik). As a result, the certificate validation against letsencrypt fails. I found the relevant error in the cert-manager-controller logs and it looks like this:

...setup.go:265] "failed to register an ACME account" err="Get \"https://acme-v02.api.letsencrypt.org/directory\": 
tls: failed to verify certificate: x509:  certificate is valid for <someRandomNumber>.traefik.default, 
not acme-v02.api.letsencrypt.org" logger="cert-manager.clusterissuers" 
resource_name="cert" resource_namespace="" resource_kind="ClusterIssuer" resource_version="v1" related_resource_name="cert-acme-clusterissuer-account-key" related_resource_namespace="ix-cert-manager" related_resource_kind="Secret"
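To illustrate why the request ends up at traefik, here is a minimal Python sketch of how a stub resolver builds its list of candidate names from the search domains. It is a simplification, not the actual resolver code. The function name `candidate_names` and the domain `domain.tld` are made up for illustration, and `ndots=5` is an assumption based on the value Kubernetes commonly writes into a pod's /etc/resolv.conf.

```python
def candidate_names(hostname, search_domains, ndots=5):
    """Return the names a stub resolver would try, in order (simplified)."""
    # A trailing dot marks the name as absolute: no search domain is appended.
    if hostname.endswith("."):
        return [hostname.rstrip(".")]

    with_search = [f"{hostname}.{d.strip('.')}" for d in search_domains]

    # With at least `ndots` dots the name is tried as-is first;
    # otherwise the search domains are tried before the plain name.
    if hostname.count(".") >= ndots:
        return [hostname] + with_search
    return with_search + [hostname]


print(candidate_names("acme-v02.api.letsencrypt.org", ["domain.tld"]))
# ['acme-v02.api.letsencrypt.org.domain.tld', 'acme-v02.api.letsencrypt.org']
```

Under these assumptions the first candidate is acme-v02.api.letsencrypt.org.domain.tld. If a *.domain.tld wildcard record exists, that lookup succeeds and the plain name is never tried, which would match the certificate mismatch in the log above.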

The simple fix was:

  • remove the Additional Domains entry, which removes the "search" line from /etc/resolv.conf
  • remove and reinstall cert-manager. Maybe removing it is not needed in the first place, but the resolv.conf from the host has to be propagated to the pod somehow
  • re-apply/update the configs in clusterissuer and/or the apps to trigger a new certificate issuance.
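To verify that the "search" line is really gone after these steps, a small check like the following could be run on the host and, if Python is available there, inside the cert-manager pod. This is only a sketch: the helper name `search_domains` is made up, and the parsing is simplified (it only looks at `search` lines and skips comment lines).

```python
def search_domains(resolv_conf_text):
    """Return the search domains listed in the text of a resolv.conf file."""
    domains = []
    for line in resolv_conf_text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(("#", ";")):
            continue
        fields = line.split()
        if fields[0] == "search":
            # If several search lines exist, the last one takes effect.
            domains = fields[1:]
    return domains


if __name__ == "__main__":
    with open("/etc/resolv.conf") as f:
        found = search_domains(f.read())
    print("search domains:", found or "none")
```

An empty result means no search domain is configured in the file that was checked.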

Describe the solution you'd like

Please specify in bold letters in the relevant guides https://truecharts.org/charts/premium/clusterissuer/how-to/ https://truecharts.org/charts/system/cert-manager/

That 'Network->Edit Global Configuration->Additional Domains' should remain empty! Or it should at least not point to the external public FQDN or to a domain name with a wildcard in place. In other words, /etc/resolv.conf should contain no search entry for a public domain.tld that also has a *.domain.tld record defined, e.g. in Cloudflare.

Otherwise the SSL certificate issuance will not work and you will not know why until you dig deep into the relevant config.
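A guide could also suggest a quick way to test whether a given domain has a wildcard record before using it as a search domain. The sketch below is one possible heuristic, not an official tool: it looks up a random label under the domain and treats any answer as a sign of a wildcard. The helper name `has_wildcard` is made up, and the `resolve` parameter exists only so the lookup can be swapped out.

```python
import socket
import uuid


def has_wildcard(domain, resolve=socket.gethostbyname):
    """Guess whether *.domain resolves by looking up a random label under it."""
    # A random label should not exist as a real record; the trailing dot
    # makes the name absolute so no search domain is appended to it.
    probe = f"{uuid.uuid4().hex}.{domain.strip('.')}."
    try:
        resolve(probe)
    except OSError:  # socket.gaierror is a subclass of OSError
        return False
    return True
```

Calling `has_wildcard("domain.tld")` with your own domain returns True if the random name resolves. A False result can also mean the DNS server was unreachable, so it is a hint rather than proof.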

Describe alternatives you've considered

The alternative would be frustrated users 😅 if they happen to have wrongly specified additional search domains.

Additional context

No response

I've read and agree with the following

  • [X] I've checked all open and closed issues and my request is not there.
  • [X] I've checked all open and closed pull requests and my request is not there.

theyo-tester, May 15 '24 13:05