
no Let's Encrypt certificate

Open pwannenmacher opened this issue 5 years ago • 12 comments

I don't get any Let's Encrypt certificate following your instructions.

The Setting:

  • Newly set up Hetzner CX21 Cloud Server (Ubuntu 18.04; I tried Debian 10, too)
  • DNS configuration:

A
Type  Domain Name        Address         TTL
A     example.org        'IPv4-Address'  3599
A     www.example.org    'IPv4-Address'  3599
A     chat.example.org   'IPv4-Address'  3599
A     cloud.example.org  'IPv4-Address'  3599
A     video.example.org  'IPv4-Address'  3599

AAAA
Type  Domain Name        Address         TTL
AAAA  example.org        'IPv6-Address'  3599
AAAA  www.example.org    'IPv6-Address'  3599
AAAA  chat.example.org   'IPv6-Address'  3599
AAAA  cloud.example.org  'IPv6-Address'  3599
AAAA  video.example.org  'IPv6-Address'  3599

NS
Type  Domain Name  NS           TTL
NS    example.org  <first_ns>   3599
NS    example.org  <second_ns>  3599

CAA
Type  Domain Name  Value            TTL   Tag
CAA   example.org  letsencrypt.org  3599  issue
CAA   example.org  letsencrypt.org  3599  issuewild

Even after hours there is only the 'TRAEFIK DEFAULT CERT'...

pwannenmacher, Apr 16 '20 09:04

I'm seeing the same symptoms on a CX11 with Debian 10.

After installing the setup/router helm chart, a kubectl get pods results in:

NAME                         READY   STATUS              RESTARTS   AGE
svclb-traefik-9d4xg          2/2     Running             0          4s
landingpage-86fb86f6-qtcws   1/1     Running             0          4s
traefik-6bc795bfcd-g6dgz     1/1     Running             0          4s

Looking at the logs via kubectl logs -f traefik-6bc795bfcd-g6dgz reveals:

time="2020-04-16T09:53:13Z" level=error msg="Unable to obtain ACME certificate for domains \"www.redacted.org\": cannot get ACME client get directory at 'https://acme-staging-v02.api.letsencrypt.org/directory': Get \"https://acme-staging-v02.api.letsencrypt.org/directory\": dial tcp: i/o timeout" rule="Host(`www.redacted.org`) && Path(`/`)" routerName=default-ingressroute-landingpage-b6c1df3ebe77e8940f06@kubernetescrd providerName=default.acme

This is shown using the staging server, but the same happens when using production: true.

The interesting bit is the log detail…

cannot get ACME client get directory at 'https://acme-staging-v02.api.letsencrypt.org/directory'

…which says that https://acme-staging-v02.api.letsencrypt.org/directory can't be accessed from within the pod/container (it is in fact accessible from the outside host).

The process times out with:

dial tcp: i/o timeout

I have deactivated IPv6 on the host (since @jamct lists "not tested with IPv6" as a known issue), but nothing changed.
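To narrow down a "dial tcp: i/o timeout" like the one above, it can help to test the connection separately over IPv4 and IPv6 from the environment that fails. The following is only a minimal sketch, assuming Python 3 is available there (for example in a debug container or on the host); the helper name can_connect is made up for illustration and is not part of Traefik or k3s.

```python
import socket


def can_connect(host, port, family, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds over one address family."""
    try:
        infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
    except socket.gaierror:
        # No address of this family, or name resolution itself failed.
        return False
    for fam, socktype, proto, _canonname, sockaddr in infos:
        try:
            with socket.socket(fam, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
                return True
        except OSError:
            # Timeout, connection refused, network unreachable: try the next address.
            continue
    return False
```

Calling can_connect("acme-staging-v02.api.letsencrypt.org", 443, socket.AF_INET) and then the same with socket.AF_INET6 shows which address family, if any, is timing out. This only checks that a TCP connection can be opened; it does not perform a TLS handshake or an ACME request.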

sekdiy, Apr 16 '20 10:04

Meanwhile I managed to 'solve' my issue by completely bulldozing and recreating the VM. Traefik now manages to reach LE servers and certificates can be issued successfully.

I suspect docker and its networking setup, as this was the only difference between the two VMs (i.e. I used to use docker to orchestrate container based services before, now docker is only running in the background while k3s is the only orchestrator).

sekdiy, Apr 17 '20 12:04

@pwannenmacher can you acquire logs the way I described above (and post them here)?

sekdiy, Apr 17 '20 12:04

Using the same commands as you did...

root@team-cloud:~# kubectl get pods

NAME                           READY   STATUS    RESTARTS   AGE
svclb-traefik-hwd9k            2/2     Running   0          30h
landingpage-5956bf99c6-9fqrv   1/1     Running   0          30h
traefik-7f444457b7-tgz8r       1/1     Running   0          30h

root@team-cloud:~# kubectl logs -f traefik-7f444457b7-tgz8r

time="2020-04-16T05:58:52Z" level=info msg="Configuration loaded from flags."
time="2020-04-16T05:59:02Z" level=error msg="Unable to obtain ACME certificate for domains "www.example.org": unable to generate a certificate for the domains [www.example.org]: acme: Error -> One or more domains had a problem:\n[www.example.org] acme: error: 400 :: urn:ietf:params:acme:error:connection :: Connection refused, url: \n" providerName=default.acme routerName=default-ingressroute-landingpage-2bbe0661726cab002909@kubernetescrd rule="Host(www.example.org) && Path(/)"
[...]

  • example.org isn't the real domain name

pwannenmacher, Apr 17 '20 12:04

I get the same error as @pwannenmacher. I am running a freshly installed Ubuntu 18.04 LTS.

I literally only executed the install.sh and the helm chart on that server.

time="2020-04-18T15:45:28Z" level=error msg="Unable to obtain ACME certificate for domains \"chat.fouskas.de\": unable to generate a certificate for the domains [chat.fouskas.de]: acme: Error -> One or more domains had a problem:\n[chat.fouskas.de] acme: error: 400 :: urn:ietf:params:acme:error:connection :: Connection refused, url: \n" providerName=default.acme rule="Host(`chat.fouskas.de`)" routerName=default-ingressroute-chat-team-chat-a1385f94d978037914fe@kubernetescrd

time="2020-04-18T15:45:31Z" level=error msg="Unable to obtain ACME certificate for domains \"ct-router.fouskas.de\": unable to generate a certificate for the domains [ct-router.fouskas.de]: acme: Error -> One or more domains had a problem:\n[ct-router.fouskas.de] acme: error: 400 :: urn:ietf:params:acme:error:connection :: Error getting validation data, url: \n" providerName=default.acme routerName=default-ingressroute-landingpage-0f0e02fe121f379b39d6@kubernetescrd rule="Host(`ct-router.fouskas.de`) && Path(`/`)"

time="2020-04-18T15:45:33Z" level=error msg="Unable to obtain ACME certificate for domains \"nextcloud.fouskas.de\": unable to generate a certificate for the domains [nextcloud.fouskas.de]: acme: Error -> One or more domains had a problem:\n[nextcloud.fouskas.de] acme: error: 400 :: urn:ietf:params:acme:error:connection :: Connection refused, url: \n" providerName=default.acme routerName=default-ingressroute-nextcloud-team-nextcloud-de3f7337f9ebe99caa84@kubernetescrd rule="Host(`nextcloud.fouskas.de`)"
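Log lines like these are easier to compare once the domain and the ACME error type are pulled out. Below is a rough sketch of such a filter; the regular expression is written against the lines quoted in this thread only, and the function names are invented for illustration. Lines without an ACME error URN (such as the "dial tcp: i/o timeout" case earlier in the thread) are simply skipped.

```python
import re
from typing import Optional

# Written against the "Unable to obtain ACME certificate" lines quoted in this thread.
# The quotes around the domain may or may not be backslash-escaped.
ACME_ERROR = re.compile(
    r'Unable to obtain ACME certificate for domains \\?"(?P<domain>[^"\\]+)\\?"'
    r'.*?urn:ietf:params:acme:error:(?P<error>\w+) :: (?P<detail>[^,]+),'
)


def parse_acme_error(line: str) -> Optional[dict]:
    """Return domain, error type and detail for one log line, or None if it does not match."""
    match = ACME_ERROR.search(line)
    return match.groupdict() if match else None


def summarize(lines):
    """Map each failing domain to its (error type, detail) pair."""
    parsed = (parse_acme_error(line) for line in lines)
    return {p["domain"]: (p["error"], p["detail"]) for p in parsed if p}
```

One way to feed it is to save the output of kubectl logs to a file and pass the file's lines to summarize.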

I am new to the whole Kubernetes thing and need a little guidance on which logs to consult.

What I do not understand is that I can connect to my services on port 80 or 443, yet those ports are not listed by sudo ss -tulpen:

Netid           State             Recv-Q            Send-Q                          Local Address:Port                          Peer Address:Port
udp             UNCONN            0                 0                                     0.0.0.0:8472                               0.0.0.0:*                ino:17302 sk:2 <->
tcp             LISTEN            0                 128                                 127.0.0.1:10248                              0.0.0.0:*                users:(("k3s-server",pid=569,fd=222)) ino:17758 sk:3 <->
tcp             LISTEN            0                 128                                 127.0.0.1:10249                              0.0.0.0:*                users:(("k3s-server",pid=569,fd=252)) ino:17234 sk:4 <->
tcp             LISTEN            0                 128                                 127.0.0.1:6444                               0.0.0.0:*                users:(("k3s-server",pid=569,fd=14)) ino:1001 sk:5 <->
tcp             LISTEN            0                 128                                 127.0.0.1:10256                              0.0.0.0:*                users:(("k3s-server",pid=569,fd=250)) ino:18936 sk:6 <->
tcp             LISTEN            0                 128                                 127.0.0.1:10010                              0.0.0.0:*                users:(("containerd",pid=685,fd=10)) ino:18651 sk:9 <->
tcp             LISTEN            0                 128                                         *:10250                                    *:*                users:(("k3s-server",pid=569,fd=224)) ino:18683 sk:a v6only:0 <->
tcp             LISTEN            0                 128                                         *:10251                                    *:*                users:(("k3s-server",pid=569,fd=175)) ino:16996 sk:b v6only:0 <->
tcp             LISTEN            0                 128                                         *:6443                                     *:*                users:(("k3s-server",pid=569,fd=5)) ino:995 sk:c v6only:0 <->
tcp             LISTEN            0                 128                                         *:10252                                    *:*                users:(("k3s-server",pid=569,fd=177)) ino:18486 sk:d v6only:0 <->
tcp             LISTEN            0                 128                                         *:31540                                    *:*                users:(("k3s-server",pid=569,fd=223)) ino:17968 sk:e v6only:0 <->
tcp             LISTEN            0                 128                                         *:31990                                    *:*                users:(("k3s-server",pid=569,fd=243)) ino:17969 sk:f v6only:0 <->

Rossojo, Apr 18 '20 16:04

@Rossojo Did you configure IPv6? We have not tested this setup with IPv6 yet.

jamct, Apr 20 '20 08:04

I have indeed set up IPv6 🤔 I will try without it in the next couple of days.

Rossojo, Apr 20 '20 09:04

I have the same error and also made the mistake of setting up IPv6.

I have now deactivated IPv6 on the system and removed the AAAA records. The errors are gone, but I still get a certificate warning. Does it simply take some time until I get the certificates? If so, how long?

trevor87, Apr 21 '20 20:04

OK, I set up a fresh server and did not enable IPv6. Now everything seems to work as expected. It seems to me that this setup demonstrably does not create certificates when IPv6 is enabled.

In addition, I had to wait for the old AAAA DNS records to be invalidated. Until then, certificate creation resulted in timeout errors.
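Whether stale AAAA records are still being served can be checked before retrying. This is a small sketch under the assumption that the local resolver's answer is a useful hint; Let's Encrypt's validation servers do their own lookups and may still see a different result. The function name and the injectable resolver parameter are illustrative only.

```python
import socket


def ipv6_addresses(hostname, resolver=socket.getaddrinfo):
    """Return the sorted IPv6 addresses the local resolver reports for hostname."""
    try:
        infos = resolver(hostname, None, socket.AF_INET6, socket.SOCK_STREAM)
    except socket.gaierror:
        # No AAAA answer, or the lookup failed altogether.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in infos})
```

An empty list for every hostname in the setup (for example www, chat and cloud) suggests that the AAAA records are no longer visible from this machine.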

Rossojo, Apr 22 '20 16:04

Thanks for your feedback. I'm working on IPv6 support at the moment!

jamct, Apr 22 '20 16:04

Similar problem here. I got the certificate for www.example.org to work, but Nextcloud under cloud.example.org was not showing up. I scrapped Ubuntu, installed Debian, and disabled IPv6 for eth0 by adding the following line to /etc/sysctl.conf:

net.ipv6.conf.eth0.disable_ipv6 = 1

and then running sysctl -p (or restarting).

After 2 days of tinkering, Nextcloud shows up. YEAH! The experience of an "easy" all-in-one solution is degraded.

fatango, Apr 25 '20 09:04

not tested = not working :-(

I did:

sudo sysctl -w net.ipv6.conf.all.disable_ipv6=1
sudo sysctl -w net.ipv6.conf.default.disable_ipv6=1

and got a certificate after reconnecting, because I had been connected via IPv6 (of course, it's 2020). So this is a big issue :-)
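To confirm that a sysctl setting like the ones above actually took effect, the corresponding files under /proc/sys can be read back. A minimal sketch, assuming a Linux host; the function name and the root parameter (there only to make the check testable) are made up for illustration.

```python
from pathlib import Path


def ipv6_disabled(interface="all", root="/proc/sys/net/ipv6/conf"):
    """Return True/False for the interface's disable_ipv6 flag, or None if it cannot be read."""
    path = Path(root) / interface / "disable_ipv6"
    try:
        return path.read_text().strip() == "1"
    except FileNotFoundError:
        # Unknown interface name, or no IPv6 entries exposed on this system.
        return None
```

For example, ipv6_disabled("all"), ipv6_disabled("default") and ipv6_disabled("eth0") correspond to the three sysctl keys mentioned in this thread.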

beyerservice, Apr 26 '20 15:04