Failed to list *v1beta1.Ingress error
Hi, today I ran kube-lego for the first time and everything worked fine, but after a couple of hours I got the following logs from the kube-lego pod (see the errors at the end):
time="2017-09-06T10:09:55Z" level=info msg="kube-lego 0.1.5-a9592932 starting" context=kubelego
time="2017-09-06T10:09:55Z" level=info msg="connecting to kubernetes api: https://10.31.240.1:443" context=kubelego
time="2017-09-06T10:09:55Z" level=info msg="successfully connected to kubernetes api v1.7.4" context=kubelego
time="2017-09-06T10:09:55Z" level=debug msg="start watching ingress objects" context=kubelego
time="2017-09-06T10:09:55Z" level=info msg="server listening on http://:8080/" context=acme
time="2017-09-06T10:09:55Z" level=debug msg="CREATE ingress/default/ingress" context=kubelego
time="2017-09-06T10:09:55Z" level=debug msg="worker: begin processing true" context=kubelego
time="2017-09-06T10:09:55Z" level=debug msg=reset context=provider provider=gce
time="2017-09-06T10:09:55Z" level=debug msg="UPDATE ingress/default/ingress" context=kubelego
time="2017-09-06T10:09:55Z" level=debug msg=finalize context=provider provider=gce
time="2017-09-06T10:09:55Z" level=debug msg="setting up svc endpoint" context=provider namespace=default pod_ip=10.28.2.7 provider=gce
time="2017-09-06T10:09:55Z" level=debug msg=reset context=provider provider=nginx
time="2017-09-06T10:09:55Z" level=debug msg=finalize context=provider provider=nginx
time="2017-09-06T10:09:55Z" level=info msg="disable provider no TLS hosts found" context=provider provider=nginx
time="2017-09-06T10:09:55Z" level=info msg="process certificate requests for ingresses" context=kubelego
time="2017-09-06T10:09:55Z" level=info msg="Attempting to create new secret" context=secret name=domain-secret-tls namespace=default
time="2017-09-06T10:09:55Z" level=info msg="no cert associated with ingress" context="ingress_tls" name=ingress namespace=default
time="2017-09-06T10:09:55Z" level=info msg="requesting certificate for <DOMAIN_NAME>" context="ingress_tls" name=ingress namespace=default
time="2017-09-06T10:09:55Z" level=info msg="Attempting to create new secret" context=secret name=lets-encrypt namespace=default
time="2017-09-06T10:09:57Z" level=info msg="if you don't accept the TOS (https://letsencrypt.org/documents/LE-SA-v1.1.1-August-1-2016.pdf) please exit the program now" context=acme
time="2017-09-06T10:09:57Z" level=info msg="created an ACME account (registration url: https://acme-v01.api.letsencrypt.org/acme/reg/20914123)" context=acme
time="2017-09-06T10:09:57Z" level=info msg="Attempting to create new secret" context=secret name=lets-encrypt namespace=default
time="2017-09-06T10:09:57Z" level=info msg="Secret successfully stored" context=secret name=lets-encrypt namespace=default
time="2017-09-06T10:17:59Z" level=debug msg="testing reachability of http://<DOMAIN_NAME>/.well-known/acme-challenge/_selftest" context=acme domain=<DOMAIN_NAME>
time="2017-09-06T10:18:01Z" level=debug msg="responding to challenge request" basePath="/.well-known/acme-challenge" context=acme host=<DOMAIN_NAME> token=McmuogMkzmOXu2yQxhDV4ai2QG6XszY5OR86z5SR6x8
time="2017-09-06T10:18:03Z" level=debug msg="got authorization: &{URI:https://acme-v01.api.letsencrypt.org/acme/challenge/s2j-HD4i1_6cSxGjCQvlJrmYb-acK9tOEZmAWumypCI/1924026229 Status:valid Identifier:{Type: Value:} Challenges:[] Combinations:[]}" context=acme domain=<DOMAIN_NAME>
time="2017-09-06T10:18:03Z" level=info msg="authorization successful" context=acme domain=<DOMAIN_NAME>
time="2017-09-06T10:18:04Z" level=info msg="successfully got certificate: domains=[<DOMAIN_NAME>] url=https://acme-v01.api.letsencrypt.org/acme/cert/04cc0fdc1bcc1aceab78d19f102f12cec7fc" context=acme
time="2017-09-06T10:18:04Z" level=debug msg="certificate pem data:\n-----BEGIN CERTIFICATE-----\n[XXX]\n-----END CERTIFICATE-----\n-----BEGIN CERTIFICATE-----\n[XXX]\n-----END CERTIFICATE-----\n" context=acme
time="2017-09-06T10:18:04Z" level=info msg="Attempting to create new secret" context=secret name=domain-secret-tls namespace=default
time="2017-09-06T10:18:04Z" level=info msg="Secret successfully stored" context=secret name=domain-secret-tls namespace=default
time="2017-09-06T10:18:04Z" level=debug msg="worker: done processing true" context=kubelego
time="2017-09-06T10:18:53Z" level=debug msg="UPDATE ingress/default/ingress" context=kubelego
time="2017-09-06T10:18:53Z" level=debug msg="worker: begin processing true" context=kubelego
time="2017-09-06T10:18:53Z" level=debug msg=reset context=provider provider=gce
time="2017-09-06T10:18:53Z" level=debug msg=finalize context=provider provider=gce
time="2017-09-06T10:18:53Z" level=debug msg="setting up svc endpoint" context=provider namespace=default pod_ip=10.28.2.7 provider=gce
time="2017-09-06T10:18:53Z" level=debug msg=reset context=provider provider=nginx
time="2017-09-06T10:18:53Z" level=debug msg=finalize context=provider provider=nginx
time="2017-09-06T10:18:53Z" level=info msg="disable provider no TLS hosts found" context=provider provider=nginx
time="2017-09-06T10:18:53Z" level=info msg="process certificate requests for ingresses" context=kubelego
time="2017-09-06T10:18:53Z" level=info msg="cert expires in 90.0 days, no renewal needed" context="ingress_tls" expire_time=2017-12-05 09:18:00 +0000 UTC name=ingress namespace=default
time="2017-09-06T10:18:53Z" level=info msg="no cert request needed" context="ingress_tls" name=ingress namespace=default
time="2017-09-06T10:18:53Z" level=debug msg="worker: done processing true" context=kubelego
time="2017-09-06T12:12:43Z" level=debug msg="token not found" basePath="/.well-known/acme-challenge" context=acme host=<DOMAIN_NAME> token="*"
time="2017-09-06T12:13:20Z" level=debug msg="token not found" basePath="/.well-known/acme-challenge" context=acme host=<DOMAIN_NAME> token="*"
time="2017-09-06T12:13:24Z" level=debug msg="token not found" basePath="/.well-known/acme-challenge" context=acme host=<DOMAIN_NAME> token=acme-challenge
E0906 12:54:11.437384 1 reflector.go:304] github.com/jetstack/kube-lego/pkg/kubelego/watch.go:112: Failed to watch *v1beta1.Ingress: Get https://10.31.240.1:443/apis/extensions/v1beta1/watch/ingresses?resourceVersion=1780&timeoutSeconds=509: dial tcp 10.31.240.1:443: getsockopt: connection refused
E0906 12:54:12.440617 1 reflector.go:201] github.com/jetstack/kube-lego/pkg/kubelego/watch.go:112: Failed to list *v1beta1.Ingress: Get https://10.31.240.1:443/apis/extensions/v1beta1/ingresses?resourceVersion=0: dial tcp 10.31.240.1:443: getsockopt: connection refused
E0906 12:54:13.442426 1 reflector.go:201] github.com/jetstack/kube-lego/pkg/kubelego/watch.go:112: Failed to list *v1beta1.Ingress: Get https://10.31.240.1:443/apis/extensions/v1beta1/ingresses?resourceVersion=0: dial tcp 10.31.240.1:443: getsockopt: connection refused
E0906 12:54:14.444683 1 reflector.go:201] github.com/jetstack/kube-lego/pkg/kubelego/watch.go:112: Failed to list *v1beta1.Ingress: Get https://10.31.240.1:443/apis/extensions/v1beta1/ingresses?resourceVersion=0: dial tcp 10.31.240.1:443: getsockopt: connection refused
E0906 12:54:45.445745 1 reflector.go:201] github.com/jetstack/kube-lego/pkg/kubelego/watch.go:112: Failed to list *v1beta1.Ingress: Get https://10.31.240.1:443/apis/extensions/v1beta1/ingresses?resourceVersion=0: dial tcp 10.31.240.1:443: i/o timeout
E0906 12:55:16.446610 1 reflector.go:201] github.com/jetstack/kube-lego/pkg/kubelego/watch.go:112: Failed to list *v1beta1.Ingress: Get https://10.31.240.1:443/apis/extensions/v1beta1/ingresses?resourceVersion=0: dial tcp 10.31.240.1:443: i/o timeout
E0906 12:55:47.447495 1 reflector.go:201] github.com/jetstack/kube-lego/pkg/kubelego/watch.go:112: Failed to list *v1beta1.Ingress: Get https://10.31.240.1:443/apis/extensions/v1beta1/ingresses?resourceVersion=0: dial tcp 10.31.240.1:443: i/o timeout
E0906 12:55:48.449934 1 reflector.go:201] github.com/jetstack/kube-lego/pkg/kubelego/watch.go:112: Failed to list *v1beta1.Ingress: Get https://10.31.240.1:443/apis/extensions/v1beta1/ingresses?resourceVersion=0: dial tcp 10.31.240.1:443: getsockopt: connection refused
E0906 12:55:49.451823 1 reflector.go:201] github.com/jetstack/kube-lego/pkg/kubelego/watch.go:112: Failed to list *v1beta1.Ingress: Get https://10.31.240.1:443/apis/extensions/v1beta1/ingresses?resourceVersion=0: dial tcp 10.31.240.1:443: getsockopt: connection refused
E0906 12:55:50.453445 1 reflector.go:201] github.com/jetstack/kube-lego/pkg/kubelego/watch.go:112: Failed to list *v1beta1.Ingress: Get https://10.31.240.1:443/apis/extensions/v1beta1/ingresses?resourceVersion=0: dial tcp 10.31.240.1:443: getsockopt: connection refused
and so on...
So it looks like kube-lego lost its connection to the Kubernetes API. However, the API service address was still correct:
kubectl get services -o wide
NAME                               CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE   SELECTOR
kubernetes                         10.31.240.1     <none>        443/TCP          6h    <none>
nginx-service                      10.31.245.135   <nodes>       80:30972/TCP     6h    app=nginx
php-fpm-service                    10.31.246.102   <nodes>       9000:30478/TCP   6h    app=php-fpm
tls-certificates-renewal-service   10.31.249.43    <nodes>       8080:31905/TCP   6h    <none>
Then I restarted the kube-lego deployment (i.e. kubectl delete & kubectl apply) and everything went back to normal.
Before I saw this error, I noticed that the Kubernetes cluster autoscaled up and down and was unavailable for a minute or so (I saw the spinner in front of the cluster name in the Google Cloud admin UI). However, I did not notice any downtime. Maybe the apiserver's TLS certificates were updated at some point (cluster update) and kube-lego was trying to connect to the Kubernetes API using outdated certificates?
This kind of "silent" error is a problem because domain certificates won't be renewed if it happens in the middle of the certificate validity period (and 90 days is a long enough period for that). Would it be difficult to make /healthz also check the Kubernetes API connection, @munnerz? That way Kubernetes would restart kube-lego automatically when this kind of error occurs.
We just ran into this issue as well