helm-charts
Setup cluster with TLS using a K8s secret doesn't work
What happened?
My network is internal to my EKS cluster, so I cannot use cert-manager to create a Let's Encrypt certificate. Instead, I used certbot to create the certificates for my brokers with a DNS challenge, which I completed manually.
I then put the certificate files into a K8s secret, as the documentation describes, and started a cluster that references the secret in the external listeners section.
What did you expect to happen?
I expected the cluster to consume the secret's contents and configure TLS on the brokers correctly. Instead, the cluster didn't come up, and the pod logs reported that there is no such file as ca.crt. The certificate was issued by Let's Encrypt, a public CA, so I should not need to specify a ca.crt file.
I also tried to work around this by creating a generic secret containing the Let's Encrypt chain (CA.crt) that certbot provided with the certificate, and referencing that generic secret instead, but again the cluster won't come up.
Essentially, I'm not able to set up a TLS-enabled cluster in any mode other than using a self-signed certificate issued by cert-manager.
How can we reproduce it (as minimally and precisely as possible)? Please include values file.
apiVersion: cluster.redpanda.com/v1alpha2
kind: Redpanda
metadata:
  name: redpanda
spec:
  chartRef:
    chartVersion: 5.9.6
  clusterSpec:
    external:
      enabled: true
      domain: redpanda.example.com
      type: NodePort
    tls:
      enabled: true
      certs:
        external:
          secretRef:
            name: rp-data-tls
    auth:
      sasl:
        enabled: true
        users:
          - name: superuser
            password: secretpassword
    storage:
      persistentVolume:
        enabled: true
        storageClass: gp2 # just an example
Anything else we need to know?
No response
Which are the affected charts?
No response
Chart Version(s)
5.9.6
Cloud provider
JIRA Link: K8S-394
Could you share the logs from Redpanda when it refuses to start up, along with the redacted contents of your secret? Really, just the keys would be fine.
Just saw now that you replied - of course!
Let's start with the secret. I'm creating a TLS-type secret using the method given in the documentation, like so:
kubectl create secret tls rp-data-tls \
  --cert=certs/tls.crt \
  --key=certs/tls.key \
  --namespace redpanda-data
The files are the actual certificate which I've made with certbot - I copied those files and renamed them from the following available files that I got when using certbot:
I renamed the cert.pem to tls.crt and privkey.pem to tls.key and just referenced those files here
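As a sanity check on which certbot file ended up in tls.crt: certbot's cert.pem normally holds only the leaf certificate, while fullchain.pem also carries the intermediate chain. A minimal shell sketch for counting the certificates in a PEM file follows; the function name count_pem_certs is illustrative and not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: count the PEM certificate blocks in a file.
# One block suggests a leaf-only file (like certbot's cert.pem);
# more than one suggests a chain is included (like fullchain.pem).
count_pem_certs() {
    # -c prints the number of matching lines; -e keeps the leading
    # dashes of the pattern from being parsed as options.
    grep -c -e '-----BEGIN CERTIFICATE-----' "$1"
}
```

It could be run as `count_pem_certs certs/tls.crt` against the file passed to `--cert`.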
When the nodes try to come up, their logs complain that ca.crt is not available (as if Redpanda is expecting it), even though this certificate was issued by a public CA (Let's Encrypt, using the certbot CLI for macOS).
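To confirm which keys the secret actually holds, and that ca.crt is absent, something like the following could be used. This is only a sketch: secret_keys and secret_has_key are hypothetical helper names, and the secret name and namespace in the usage line are the ones from the kubectl command above.

```shell
#!/bin/sh
# Hypothetical helper: print the data keys of a Secret, one per line.
# Arguments: <secret-name> <namespace>
secret_keys() {
    kubectl get secret "$1" --namespace "$2" \
        -o go-template='{{range $k, $v := .data}}{{$k}}{{"\n"}}{{end}}'
}

# Hypothetical helper: succeed only if the Secret has the given key.
# Arguments: <secret-name> <namespace> <key>
secret_has_key() {
    # -x matches whole lines, -F treats the key as a literal string
    # (so the dot in "ca.crt" is not a regex wildcard).
    secret_keys "$1" "$2" | grep -qxF -e "$3"
}
```

For a secret created with `kubectl create secret tls`, `secret_keys rp-data-tls redpanda-data` should list only tls.crt and tls.key, and `secret_has_key rp-data-tls redpanda-data ca.crt` should fail.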
The private key redacted content is:
-----BEGIN PRIVATE KEY-----
M
[...REDACTED]
88
-----END PRIVATE KEY-----
and the tls.crt which I've used looks something like this:
-----BEGIN CERTIFICATE-----
MIIEHTCCA6KgAwIBAgISA5tlmfIeVm/VqHhsg/lLFB+pMAoGCCqGS
CzAJBgNVBAYTAlVTMRYwFAYDVQQKEw1MZXQncyBFbmNyeXB0
[...REDACTED]
th4iVJfjNpZlf7U1VWEoBhYwqU4Mb7rolmPEN4gptQcBSOgqEOh18
ALBwQumX89lBh5dGtNGgCbgBPNVd6HMyqSKF7ob4IBnbbg7UR
Pg==
-----END CERTIFICATE-----
My YAML for creating the cluster is the following:
apiVersion: cluster.redpanda.com/v1alpha2
kind: Redpanda
metadata:
  name: redpanda
spec:
  chartRef:
    chartVersion: 5.9.9
  clusterSpec:
    nodeSelector:
      purpose: "red-panda-cluster"
    external:
      enabled: true
      domain: redpanda-data.k8s.staging.mycompanydomain.com
      type: NodePort
    tls:
      enabled: true
      certs:
        external:
          secretRef:
            name: rp-data-tls
    auth:
      sasl:
        enabled: true
        users:
          - name: superuser
            password: secretpassword
    storage:
      persistentVolume:
        enabled: true
        storageClass: csi-driver-ebs-gp3
INFO 2024-11-10 12:46:40,982 [shard 0:main] cluster - Using index based node ID {2}
INFO 2024-11-10 12:46:40,983 [shard 0:main] main - application.cc:2611 - Starting Redpanda with node_id 2, cluster UUID {nullopt}
INFO 2024-11-10 12:46:40,983 [shard 0:main] raft - coordinated_recovery_throttle.cc:126 - Starting recovery throttle, rate: 104857600
INFO 2024-11-10 12:46:40,983 [shard 0:main] cluster - producer_state_manager.cc:45 - Started producer state manager
INFO 2024-11-10 12:46:40,983 [shard 0:main] main - application.cc:1583 - Partition manager started
INFO 2024-11-10 12:46:40,983 [shard 0:main] security - authorizer.h:273 - Registered superuser account: type {user} name {kubernetes-controller}
INFO 2024-11-10 12:46:40,983 [shard 0:main] security - authorizer.h:273 - Registered superuser account: type {user} name {superuser}
INFO 2024-11-10 12:46:40,983 [shard 0:main] main - application.cc:1671 - Archiver service setup, cloud_storage_enabled: false, legacy_upload_mode_enabled: true
INFO 2024-11-10 12:46:40,983 [shard 0:main] resource_mgmt - storage.cc:182 - Setting new target log data size 11.711GiB. Disk size 19.518GiB reservation percent 25 target percent {80} bytes {nullopt}
INFO 2024-11-10 12:46:40,987 [shard 0:main] rpc - server.cc:284 - vectorized internal rpc protocol - Stopping 1 listeners
INFO 2024-11-10 12:46:40,987 [shard 0:main] rpc - server.cc:296 - vectorized internal rpc protocol - Shutting down 0 connections
INFO 2024-11-10 12:46:40,987 [shard 0:main] resource_mgmt - storage.cc:88 - Stopping disk space manager service
INFO 2024-11-10 12:46:40,987 [shard 0:main] auditing - audit_log_manager.cc:792 - Shutting down audit log manager
INFO 2024-11-10 12:46:40,987 [shard 0:main] auditing - audit_log_manager.cc:563 - stop() invoked on audit_sink
INFO 2024-11-10 12:46:40,987 [shard 0:main] auditing - audit_log_manager.cc:600 - Setting auditing enabled state to: false
INFO 2024-11-10 12:46:40,987 [shard 0:main] auditing - audit_log_manager.cc:654 - Ignored update to audit_enabled(), auditing is already disabled
INFO 2024-11-10 12:46:40,988 [shard 0:main] raft - coordinated_recovery_throttle.cc:134 - Stopping recovery throttle
INFO 2024-11-10 12:46:40,988 [shard 0:main] kvstore - kvstore.cc:125 - Stopping kvstore: dir /var/lib/redpanda/data/redpanda/kvstore/0_0
INFO 2024-11-10 12:46:41,022 [shard 0:main] main - application.cc:461 - Shutdown complete.
ERROR 2024-11-10 12:46:41,022 [shard 0:main] main - application.cc:487 - Failure during startup: std::__nested<std::runtime_error> (Could not read trust file /etc/tls/certs/external/ca.crt): std::__1::__fs::filesystem::filesystem_error (error system:2, filesystem error: open failed: No such file or directory ["/etc/tls/certs/external/ca.crt"])