helm-charts
rotation of `ca.crt`
What happened?
I deployed redpanda onto my Kubernetes cluster on "2023-10-01T17:18:42Z", and the following certificates and secrets were created.
Note that I used a SelfSigned Issuer when deploying.
$ kubectl get cert
NAME                                 READY   SECRET                               AGE
redpanda-default-cert                True    redpanda-default-cert                83d
redpanda-default-root-certificate    True    redpanda-default-root-certificate    83d
redpanda-external-cert               True    redpanda-external-cert               83d
redpanda-external-root-certificate   True    redpanda-external-root-certificate   83d
$ kubectl get secret
NAME                                 TYPE                DATA   AGE
redpanda-default-cert                kubernetes.io/tls   3      83d
redpanda-default-root-certificate    kubernetes.io/tls   3      83d
redpanda-external-cert               kubernetes.io/tls   3      83d
redpanda-external-root-certificate   kubernetes.io/tls   3      83d
Since I need to connect to redpanda with TLS, my clients use the contents of the redpanda-default-cert secret, which contains:
- ca.crt
- tls.crt
- tls.key
However, while tls.crt expires in 2028 (5 years), ca.crt expires on December 30th, 2023 (3 months). Meanwhile, the description of the redpanda-default-cert Certificate is as follows:
kind: Certificate
metadata:
  annotations:
    meta.helm.sh/release-name: redpanda
    meta.helm.sh/release-namespace: cybotrade-redpanda
  creationTimestamp: "2023-10-01T17:18:36Z"
  generation: 1
  labels:
    app.kubernetes.io/component: redpanda
    app.kubernetes.io/instance: redpanda
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: redpanda
    helm.sh/chart: redpanda-5.6.17
  name: redpanda-default-cert
  namespace: cybotrade-redpanda
  resourceVersion: "16951563"
  uid: 88ba3450-a3b4-4818-a236-da48e6ac4fb0
spec:
  dnsNames:
  - redpanda-cluster.redpanda.cybotrade-redpanda.svc.cluster.local
  - redpanda-cluster.redpanda.cybotrade-redpanda.svc
  - redpanda-cluster.redpanda.cybotrade-redpanda
  - '*.redpanda-cluster.redpanda.cybotrade-redpanda.svc.cluster.local'
  - '*.redpanda-cluster.redpanda.cybotrade-redpanda.svc'
  - '*.redpanda-cluster.redpanda.cybotrade-redpanda'
  - redpanda.cybotrade-redpanda.svc.cluster.local
  - redpanda.cybotrade-redpanda.svc
  - redpanda.cybotrade-redpanda
  - '*.redpanda.cybotrade-redpanda.svc.cluster.local'
  - '*.redpanda.cybotrade-redpanda.svc'
  - '*.redpanda.cybotrade-redpanda'
  duration: 43800h0m0s
  issuerRef:
    group: cert-manager.io
    kind: Issuer
    name: redpanda-default-root-issuer
  privateKey:
    algorithm: ECDSA
    size: 256
  secretName: redpanda-default-cert
status:
  conditions:
  - lastTransitionTime: "2023-10-01T17:18:42Z"
    message: Certificate is up to date and has not expired
    observedGeneration: 1
    reason: Ready
    status: "True"
    type: Ready
  notAfter: "2028-09-29T17:18:42Z"
  notBefore: "2023-10-01T17:18:42Z"
  renewalTime: "2027-01-30T09:18:42Z"
  revision: 1
This means the certificate will only be renewed in 2027, but by then the ca.crt will have long since expired.
My question is: how do I handle this? Do I need to restart my clients every 3 months to use the renewed ca.crt?
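The renewalTime in the status above is consistent with cert-manager's default of renewing once two thirds of a certificate's lifetime has elapsed (the default renewBefore is one third of the duration). A minimal sketch of that arithmetic follows, assuming GNU date is available; the helper name renewal_time is hypothetical and not part of any tool. It only describes the schedule of the leaf certificate and says nothing about when the ca.crt stored alongside it expires.

```shell
#!/usr/bin/env bash
# Sketch: estimate cert-manager's default renewal time for a Certificate.
# Assumes GNU date (for -d parsing of ISO 8601 timestamps and @epoch values).

# renewal_time NOT_BEFORE NOT_AFTER
# Prints the instant at which 2/3 of the validity period has elapsed.
renewal_time() {
  local not_before="$1" not_after="$2"
  local start end renew
  start=$(date -u -d "$not_before" +%s)   # validity start, Unix seconds
  end=$(date -u -d "$not_after" +%s)      # validity end, Unix seconds
  renew=$(( start + (end - start) * 2 / 3 ))
  date -u -d "@${renew}" +%Y-%m-%dT%H:%M:%SZ
}

# notBefore / notAfter taken from the Certificate status in this issue.
renewal_time "2023-10-01T17:18:42Z" "2028-09-29T17:18:42Z"
```

With the notBefore and notAfter values from this issue, the result should line up with the renewalTime that cert-manager reports.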
What did you expect to happen?
Since redpanda is using cert-manager, I expect it to renew the certs automatically, and I shouldn't need to restart my services every time a cert expires.
How can we reproduce it (as minimally and precisely as possible)? Please include values file.
statefulset:
  replicas: 3
storage:
  persistentVolume:
    enabled: true
    size: 40Gi
Anything else we need to know?
No response
Which are the affected charts?
No response
Chart Version(s)
$ helm -n <redpanda-release-namespace> list
NAME       NAMESPACE            REVISION   UPDATED                                STATUS   CHART             APP VERSION
redpanda   cybotrade-redpanda   5          2023-11-15 10:53:19.741789 +0000 UTC   failed   redpanda-5.6.17   v23.2.12
Cloud provider
I am using AWS EKS
I can confirm this issue; I see the same behavior in my environment. Every 3 months the redpanda pod becomes unhealthy (readiness probe failed) because of an invalid cert. I have to manually recreate both redpanda-default-root-certificate and redpanda-default-cert to make it work.
Can you send me a quick snippet showing the duration of your redpanda-default-root-certificate cert? There may be something we are not taking into account.
Which version of the redpanda chart are you using? I have checked the CAs, both the root and the certs, and all are set by default to expire in 5 years.
This appears to be an unfortunate side effect of cert-manager's "CA Bootstrapping" suggestion. See https://github.com/cert-manager/cert-manager/issues/5851 for more details.
You can force cert-manager to update the ca.crt by deleting the secrets it creates. Be warned, there are some security caveats to this approach that are documented in the linked cert-manager issue.
As of helm chart 5.6.5, the CA created for the issuer will be valid for 5 years.
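The workaround described above (deleting the secrets so that cert-manager reissues them) could be scripted roughly as follows. This is a sketch under stated assumptions, not an official procedure: the namespace and secret names are the ones quoted in this issue, the helper names run and rotate_certs are hypothetical, and the script defaults to a dry run that only prints the kubectl commands. The ordering (CA first, then the leaf certificate) is also an assumption, chosen so that the leaf is reissued by the new CA.

```shell
#!/usr/bin/env bash
# Sketch: force cert-manager to reissue the CA and leaf certificate by
# deleting their Secrets. Names come from this issue; adjust for your release.

NAMESPACE="${NAMESPACE:-cybotrade-redpanda}"
ROOT_CERT="${ROOT_CERT:-redpanda-default-root-certificate}"
LEAF_CERT="${LEAF_CERT:-redpanda-default-cert}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only; 0 = really run them

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

rotate_certs() {
  # 1. Delete the CA Secret so a fresh self-signed CA is issued.
  run kubectl -n "$NAMESPACE" delete secret "$ROOT_CERT" &&
  run kubectl -n "$NAMESPACE" wait --for=condition=Ready \
      "certificate/$ROOT_CERT" --timeout=120s &&
  # 2. Delete the leaf Secret so it is reissued and carries the new ca.crt.
  run kubectl -n "$NAMESPACE" delete secret "$LEAF_CERT" &&
  run kubectl -n "$NAMESPACE" wait --for=condition=Ready \
      "certificate/$LEAF_CERT" --timeout=120s
  # Note: cert-manager may take a moment to notice a missing Secret, so
  # confirm the result afterwards with `kubectl get cert`.
}

rotate_certs
```

Set DRY_RUN=0 only after reviewing the printed commands and the security caveats in the linked cert-manager issue. Clients that trust the old ca.crt will need the new one from the recreated secret.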
This problem still occurs. Is there anything that can be done on Redpanda's side, or is it a cert-manager-only issue?
@ArturKokoszka Updating to the newest version of the helm chart (and then forcing cert-manager to update the ca.crt) will increase the lifetime to 5 years.
The inability to perform an automatic rotation is a limitation of the "CA Bootstrapping" method within cert-manager but we did opt to deploy it that way.
Changing the default to something friendlier may have some unfortunate ramifications for existing installations so we'll need to think that through thoroughly. I will say, it's difficult to have a default TLS solution that "just works". Every option comes with some degree of caveats. To you and the users that 👍 'd your comment, what behavior are you most interested in? Do you need TLS to work and be secure by default or do you just need TLS to exist and never break in the default installation?
@chrisseto In our case, we're running into renewal issues on certificates where there seems to be some date misalignment. I've not done any thorough investigation and just thought that this issue may be somewhat related.
To add some context, we seem to get this odd situation:
kubectl describe certificate kafka-default-cert
outputs:
Not After: 2024-05-13T14:31:04Z
Not Before: 2024-02-13T14:31:04Z
Renewal Time: 2024-04-13T14:31:04Z
So that looks fine, but the actual certificate:
kubectl get secret kafka-default-cert -o jsonpath={.data."ca\.crt"} | base64 --decode | openssl x509 -text -noout
outputs:
Validity
Not Before: Jan 14 09:19:15 2024 GMT
Not After : Apr 13 09:19:15 2024 GMT
As shown above, the Renewal Time is after the Not After of the ca.crt, which means the ca.crt expires before the renewal happens. We currently need to run cmctl renew kafka-default-cert to fix this when it happens.
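The mismatch described above can be checked mechanically by comparing the expiry of the ca.crt stored in the secret with the renewal time reported on the Certificate. The sketch below assumes GNU date, kubectl, and openssl, and assumes the Secret has the same name as the Certificate (as it does in this thread); the function names are hypothetical. The cluster-facing helper is only defined, not invoked, so the demo at the bottom runs with the values quoted in this comment.

```shell
#!/usr/bin/env bash
# Sketch: detect a ca.crt that expires before the Certificate's renewal time.
# Assumes GNU date for parsing both openssl-style and ISO 8601 timestamps.

# Convert a timestamp to Unix seconds.
to_epoch() { date -u -d "$1" +%s; }

# ca_expires_before_renewal CA_NOT_AFTER RENEWAL_TIME
# Exit status 0 if the CA expires strictly before the renewal time.
ca_expires_before_renewal() {
  [ "$(to_epoch "$1")" -lt "$(to_epoch "$2")" ]
}

# check_secret NAMESPACE NAME  (needs cluster access; not run by the demo)
check_secret() {
  local ns="$1" name="$2" ca_end renewal
  # Expiry of the CA certificate bundled in the Secret.
  ca_end=$(kubectl -n "$ns" get secret "$name" -o jsonpath='{.data.ca\.crt}' \
    | base64 --decode | openssl x509 -noout -enddate | cut -d= -f2)
  # Renewal time that cert-manager scheduled for the Certificate.
  renewal=$(kubectl -n "$ns" get certificate "$name" \
    -o jsonpath='{.status.renewalTime}')
  if ca_expires_before_renewal "$ca_end" "$renewal"; then
    echo "WARNING: ca.crt expires ($ca_end) before renewal ($renewal)"
  else
    echo "OK: ca.crt ($ca_end) outlives renewal ($renewal)"
  fi
}

# Demo with the values quoted in this comment.
if ca_expires_before_renewal "Apr 13 09:19:15 2024 GMT" "2024-04-13T14:31:04Z"; then
  echo "ca.crt expires before the scheduled renewal"
fi
```

Against a live cluster the helper would be called as, for example, check_secret <namespace> kafka-default-cert.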
This could be a misconfiguration on our side, but I haven't dug into this issue much yet. Anyhow, here are the Redpanda values / settings we have in relation to certs:
tls:
  enabled: true
  certs:
    default:
      caEnabled: true
      duration: 2160h
@chrisseto I'm still facing the same issue in both of my environments. I just had to delete both secrets and it's working again. To answer your question, we definitely need a TLS solution that is stable and never breaks.