openshift-acme
certs not updating. leader-election blocked by lock?
What happened:
- Certificates that were previously working / updating are no longer updating.
- Two instances of openshift-acme are running.
- One instance is reporting this (see the lock-inspection sketch after the logs below)...
I0103 00:24:46.571147 1 leaderelection.go:352] lock is held by openshift-acme-7f65979ff9-hgsz4_8f58d3f6-9cf7-4745-af7b-476b0505caa9 and has not yet expired
I0103 00:24:46.571381 1 leaderelection.go:247] failed to acquire lease fg/acme-controller-locks
- The other instance is reporting this...
I0103 00:24:16.217493 1 reflector.go:432] k8s.io/[email protected]/tools/cache/reflector.go:108: Watch close - *v1.Route total 0 items received
I0103 00:25:04.539294 1 reflector.go:432] k8s.io/[email protected]/tools/cache/reflector.go:108: Watch close - *v1.LimitRange total 0 items received
I0103 00:25:15.362207 1 reflector.go:432] k8s.io/[email protected]/tools/cache/reflector.go:108: Watch close - *v1.ReplicaSet total 0 items received
I0103 00:26:15.924614 1 reflector.go:432] k8s.io/[email protected]/tools/cache/reflector.go:108: Watch close - *v1.Service total 0 items received
I0103 00:27:04.606876 1 reflector.go:432] k8s.io/[email protected]/tools/cache/reflector.go:108: Watch close - *v1.ConfigMap total 2054 items received
I0103 00:27:30.959775 1 reflector.go:432] k8s.io/[email protected]/tools/cache/reflector.go:108: Watch close - *v1.LimitRange total 0 items received
I0103 00:27:55.497750 1 reflector.go:432] k8s.io/[email protected]/tools/cache/reflector.go:108: Watch close - *v1.Secret total 9 items received
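One way to see who actually holds the lock mentioned above (a sketch; it assumes the lock object is a ConfigMap or Lease named acme-controller-locks in the fg namespace, as the log line suggests, and <openshift-acme-namespace> is a placeholder for wherever the controller is deployed):

# Inspect the leader-election lock the controller says it cannot acquire.
# Depending on the client-go version, the lock may be a ConfigMap or a Lease.
oc get configmap acme-controller-locks -n fg -o yaml
oc get lease acme-controller-locks -n fg -o yaml

# For a ConfigMap lock, the holder identity and renew time are recorded in the
# control-plane.alpha.kubernetes.io/leader annotation; for a Lease they are in
# spec.holderIdentity and spec.renewTime. Compare the holder with the pods
# that actually exist:
oc get pods -n <openshift-acme-namespace>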
What you expected to happen: Clean logs and certificates up to date.
How to reproduce it (as minimally and precisely as possible): Not sure.
Anything else we need to know?:
Environment:
- OpenShift/Kubernetes version (use oc/kubectl version): OKD 4.7.0
- Others:
@tnozicka
seeing this when loading the cert...
[I] jkassis@Jeremys-MBP ~ [124]> ws "wss://pubsub.shinetribe.media/connPut?ConnUUID=b3f0b2d8-f5f8-452c-83fc-c476ecb7a3df" 01.02 16:36
x509: certificate has expired or is not yet valid: current time 2022-01-02T16:36:11-08:00 is after 2022-01-02T01:42:28Z
[I] jkassis@Jeremys-MBP ~ [1]> 01.02 16:36
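One way to double-check the validity window of the certificate actually being served (a sketch; the host is taken from the wss URL above and port 443 is assumed):

# Print the notBefore/notAfter dates of the certificate presented by the route.
echo | openssl s_client -connect pubsub.shinetribe.media:443 \
    -servername pubsub.shinetribe.media 2>/dev/null \
  | openssl x509 -noout -dates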
Brought the pods down and the "leader election blocked" logs reappeared; proceeding as if this is normal. Looking at the Certificate status, it appears that the cert is up for re-issue on 2022-02-01, which seems odd given that the fetched cert has already expired.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  creationTimestamp: '2021-10-04T02:24:53Z'
  generation: 3
  managedFields:
    - apiVersion: cert-manager.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        'f:spec':
          .: {}
          'f:commonName': {}
          'f:dnsNames': {}
          'f:issuerRef':
            .: {}
            'f:kind': {}
            'f:name': {}
          'f:secretName': {}
      manager: Mozilla
      operation: Update
      time: '2021-10-04T02:38:51Z'
    - apiVersion: cert-manager.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        'f:spec':
          'f:privateKey': {}
        'f:status':
          .: {}
          'f:conditions': {}
          'f:notAfter': {}
          'f:notBefore': {}
          'f:renewalTime': {}
          'f:revision': {}
      manager: controller
      operation: Update
      time: '2021-12-03T01:42:28Z'
  name: pubsub-shinetribe-media
  namespace: fg
  resourceVersion: '307455716'
  selfLink: /apis/cert-manager.io/v1/namespaces/fg/certificates/pubsub-shinetribe-media
  uid: a528dc92-636c-40c8-862e-38dfa6986cc7
spec:
  commonName: pubsub.shinetribe.media
  dnsNames:
    - pubsub.shinetribe.media
  issuerRef:
    kind: Issuer
    name: le-wildcard-issuer
  secretName: cert-pubsub-shinetribe-media
status:
  conditions:
    - lastTransitionTime: '2021-10-04T02:42:30Z'
      message: Certificate is up to date and has not expired
      observedGeneration: 3
      reason: Ready
      status: 'True'
      type: Ready
  notAfter: '2022-03-03T00:44:07Z'
  notBefore: '2021-12-03T00:44:08Z'
  renewalTime: '2022-02-01T00:44:07Z'
  revision: 3
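For comparison, the certificate actually stored in the target Secret can be dumped like this (a sketch; it assumes the Secret holds a standard tls.crt key, which is what cert-manager writes):

# Decode the cert from the Secret referenced by spec.secretName and print its
# validity window; compare against status.notAfter above.
oc get secret cert-pubsub-shinetribe-media -n fg -o jsonpath='{.data.tls\.crt}' \
  | base64 -d \
  | openssl x509 -noout -dates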
Seems like the algorithm that determines the renewal time is broken?!? Here's what my browser shows for that cert... roughly 1 day off.
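For reference, here is how the renewalTime in the status relates to the notBefore/notAfter above if renewal is scheduled once 2/3 of the certificate lifetime has elapsed (a sketch using GNU date; the 1/3-of-duration renew-before default is an assumption about the issuer configuration):

# duration = notAfter - notBefore ~= 90 days, so 1/3 of the duration ~= 30 days
# renewalTime = notAfter - 30 days:
date -u -d '2022-03-03 00:44:07 UTC - 30 days' +%Y-%m-%dT%H:%M:%SZ
# -> 2022-02-01T00:44:07Z, which matches status.renewalTime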
I believe the problem has been there all along. I'm forced to delete the Pods once in a while to ensure the renewal process gets triggered.
Encountering this issue as well. I have tried force-deleting the pods and scaling the running pods down to 0 and back up, but the lock is still held by some ghost.
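A hedged sketch of clearing a stale lock by hand, under the same ConfigMap/Lease assumption as earlier (the deployment name and <openshift-acme-namespace> are placeholders for your install):

# Scale the controller down so nothing is renewing the lock...
oc scale deployment/openshift-acme --replicas=0 -n <openshift-acme-namespace>

# ...delete the stale lock object (whichever resource type your version uses)...
oc delete configmap acme-controller-locks -n fg --ignore-not-found
oc delete lease acme-controller-locks -n fg --ignore-not-found

# ...then scale back up so a fresh instance can acquire the lock cleanly.
oc scale deployment/openshift-acme --replicas=2 -n <openshift-acme-namespace>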