Support certificate chains when using custom certificates on the transport layer
Operator Version
2.16.1
K8s Cluster Details
version: 1.30
distribution: Amazon EKS
Facts
The ECK operator logs the following error when creating an Elasticsearch resource whose TLS certificates are obtained by cert-manager from Let's Encrypt:
only expected one PEM formated CA certificate in <namespace>/<secret-name>
The relevant log event looks like so:
{
"log.level": "error",
"@timestamp": "2025-03-25T10:20:43.840Z",
"log.logger": "manager.eck-operator",
"message": "Reconciler error",
"service.version": "2.16.1+1f74bdd9",
"service.type": "eck",
"ecs.version": "1.4.0",
"controller": "elasticsearch-controller",
"object": {
"name": "eck-qs",
"namespace": "elasticsearch-clusters"
},
"namespace": "elasticsearch-clusters",
"name": "eck-qs",
"reconcileID": "ac688d7a-3448-4fae-87cd-5b6ae5a16e8d",
"error": "only expected one PEM formated CA certificate in elasticsearch-clusters/eck-qs-tls",
"errorCauses": [
{
"error": "only expected one PEM formated CA certificate in elasticsearch-clusters/eck-qs-tls",
"errorVerbose": "only expected one PEM formated CA certificate in elasticsearch-clusters/eck-qs-tls\ngithub.com/elastic/cloud-on-k8s/v2/pkg/controller/common/certificates.parseCAFromSecret\n\t/go/src/github.com/elastic/cloud-on-k8s/pkg/controller/common/certificates/ca_secret.go:56\ngithub.com/elastic/cloud-on-k8s/v2/pkg/controller/common/certificates.ParseCustomCASecret\n\t/go/src/github.com/elastic/cloud-on-k8s/pkg/controller/common/certificates/ca_secret.go:32\ngithub.com/elastic/cloud-on-k8s/v2/pkg/controller/elasticsearch/certificates/transport.ReconcileOrRetrieveCA\n\t/go/src/github.com/elastic/cloud-on-k8s/pkg/controller/elasticsearch/certificates/transport/ca.go:77\ngithub.com/elastic/cloud-on-k8s/v2/pkg/controller/elasticsearch/certificates.ReconcileTransport\n\t/go/src/github.com/elastic/cloud-on-k8s/pkg/controller/elasticsearch/certificates/reconcile.go:112\ngithub.com/elastic/cloud-on-k8s/v2/pkg/controller/elasticsearch/driver.(*defaultDriver).Reconcile\n\t/go/src/github.com/elastic/cloud-on-k8s/pkg/controller/elasticsearch/driver/driver.go:234\ngithub.com/elastic/cloud-on-k8s/v2/pkg/controller/elasticsearch.(*ReconcileElasticsearch).internalReconcile\n\t/go/src/github.com/elastic/cloud-on-k8s/pkg/controller/elasticsearch/elasticsearch_controller.go:298\ngithub.com/elastic/cloud-on-k8s/v2/pkg/controller/elasticsearch.(*ReconcileElasticsearch).Reconcile\n\t/go/src/github.com/elastic/cloud-on-k8s/pkg/controller/elasticsearch/elasticsearch_controller.go:186\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Reconcile\n\t/root/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:116\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).reconcileHandler\n\t/root/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:303\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).processNextWorkItem\n\t/root/go/pkg/mod/sigs.k8s.io/[email 
protected]/pkg/internal/controller/controller.go:263\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Start.func2.2\n\t/root/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:224\nruntime.goexit\n\t/usr/lib/go/src/runtime/asm_amd64.s:1700"
}
],
"error.stack_trace": "sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).reconcileHandler\n\t/root/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:316\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).processNextWorkItem\n\t/root/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:263\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Start.func2.2\n\t/root/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:224"
}
Impact
This prevents Elasticsearch from being deployed when using TLS certificates issued by cert-manager with Let's Encrypt.
Per Elastic's documentation, tls.crt can contain a certificate chain. However, the ECK operator enforces a stricter requirement, rejecting secrets with more than one PEM-formatted certificate.
Details
We're trying to create an Elasticsearch resource (apiVersion: elasticsearch.k8s.elastic.co/v1) with the following spec:
auth:
  disableElasticUser: true
  fileRealm:
  - secretName: quickstart-file-realm-users
http:
  service:
    metadata: {}
    spec: {}
  tls:
    certificate:
      secretName: eck-qs-tls
    selfSignedCertificate:
      disabled: true
monitoring:
  logs: {}
  metrics: {}
nodeSets:
- config:
    node.store.allow_mmap: false
  count: 3
  name: default
remoteClusterServer: {}
transport:
  service:
    metadata: {}
    spec: {}
  tls:
    certificate:
      secretName: eck-qs-tls
    certificateAuthorities: {}
    selfSignedCertificates:
      disabled: true
updateStrategy:
  changeBudget: {}
version: 8.17.3
The certificate secret being referred to in the spec above is generated by a Certificate resource controlled by cert-manager, and the certificate is issued by Let's Encrypt.
dnsNames:
- <our ES's DNS>
duration: 2160h0m0s
issuerRef:
  kind: ClusterIssuer
  name: letsencrypt
privateKey:
  algorithm: RSA
  encoding: PKCS1
  size: 2048
renewBefore: 360h0m0s
secretName: eck-qs-tls
subject:
We're able to verify that the relevant secret gets created, and has these two keys:
kubectl get secrets -n elasticsearch-clusters eck-qs-tls -o yaml | yq -r '.data | keys'
- tls.crt
- tls.key
tls.crt is a chain of certificates, and it looks like so:
-----BEGIN CERTIFICATE-----
<PEM>
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
<PEM>
-----END CERTIFICATE-----
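A quick way to confirm how many certificates a bundle like this contains is to count its PEM blocks. The sketch below is only an illustration using the Go standard library; the helper name countPEMCertificates is hypothetical and the payloads are placeholders, not real certificates.

```go
package main

import (
	"encoding/pem"
	"fmt"
)

// countPEMCertificates (hypothetical helper) returns the number of
// CERTIFICATE blocks found in a PEM bundle such as tls.crt.
func countPEMCertificates(bundle []byte) int {
	count := 0
	for {
		var block *pem.Block
		// pem.Decode returns the next block and the remaining input.
		block, bundle = pem.Decode(bundle)
		if block == nil {
			return count
		}
		if block.Type == "CERTIFICATE" {
			count++
		}
	}
}

func main() {
	// Placeholder payloads: structurally valid PEM, not real certificates.
	chain := []byte("-----BEGIN CERTIFICATE-----\nAAAA\n-----END CERTIFICATE-----\n" +
		"-----BEGIN CERTIFICATE-----\nBBBB\n-----END CERTIFICATE-----\n")
	fmt.Println(countPEMCertificates(chain))
}
```

Any count above one means the secret holds a chain, which is the case the operator rejects on the transport layer.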
The keys in this secret are per the requirements stated here.
Digging Around
On digging around, I found this Go function:
func parseCAFromSecret(s corev1.Secret, keyFileName string, crtFileName string) (*CA, error) {
	// Validate private key
	key, exist := s.Data[keyFileName]
	if !exist {
		return nil, pkgerrors.Errorf("can't find private key %s in %s/%s", keyFileName, s.Namespace, s.Name)
	}
	privateKey, err := ParsePEMPrivateKey(key)
	if err != nil {
		return nil, pkgerrors.Wrapf(err, "can't parse private key %s in %s/%s", keyFileName, s.Namespace, s.Name)
	}
	// Validate CA certificate
	cert, exist := s.Data[crtFileName]
	if !exist {
		return nil, pkgerrors.Errorf("can't find certificate %s in %s/%s", crtFileName, s.Namespace, s.Name)
	}
	pubKeys, err := ParsePEMCerts(cert)
	if err != nil {
		return nil, pkgerrors.Wrapf(err, "can't parse CA certificate %s in %s/%s", crtFileName, s.Namespace, s.Name)
	}
	if len(pubKeys) != 1 {
		return nil, pkgerrors.Errorf("only expected one PEM formated CA certificate in %s/%s", s.Namespace, s.Name)
	}
	return NewCA(privateKey, pubKeys[0]), nil
}
This is the block that results in that error being logged:
if len(pubKeys) != 1 {
	return nil, pkgerrors.Errorf("only expected one PEM formated CA certificate in %s/%s", s.Namespace, s.Name)
}
I think this is the complete opposite of the documentation on this matter, which states that tls.crt can be a certificate or a chain.
I initially thought this was a bug, but the problem stems from the fact that you are configuring the same TLS secret for the transport layer that you use for the HTTP layer of Elasticsearch:
transport:
  service:
    metadata: {}
    spec: {}
  tls:
    certificate:
      secretName: eck-qs-tls
Our documentation on the usage in this location says:
You can use a Kubernetes secret to provide your own CA instead of the self-signed certificate that ECK will then use to create node certificates for transport connections. The CA certificate must be stored in the secret under ca.crt and the private key must be stored under ca.key.
We currently do not support chains in the custom CAs that you can configure there. The API documentation you found for the TLSOptions applies only to the HTTP layer. The correct documentation for the transport layer is here.
The "Appears in" section of the API reference is intended to help figure out which configuration applies where.
To fix your setup, either remove the transport.tls.certificate section and let ECK use self-signed certificates on the transport layer (which should normally not matter to you or any of your ES clients), or configure a secret containing ca.crt (a single certificate, no chain) and ca.key.
We currently do not support chains in the custom CAs that you can configure there
Why does TransportTLSOptions not support chains? Is there a specific reason?
I don't think there is a specific reason other than the current limitations in the code. The transport layer cert handling was originally not designed to be user-customisable, so this restriction seemed acceptable at the time. We can look into loosening this restriction.
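To make the idea of loosening the restriction concrete, here is one possible shape for a chain-tolerant check. This is a sketch under assumptions and not the operator's implementation: the function selectSigningCert is hypothetical, and it assumes the certificate to sign with is the one whose public key matches the provided private key, with the remaining certificates kept as the rest of the chain. The demo uses two unrelated self-signed certificates as stand-ins for a real chain.

```go
package main

import (
	"crypto"
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"errors"
	"fmt"
	"math/big"
	"time"
)

// publicKeyEqualer is satisfied by the standard library public key types.
type publicKeyEqualer interface {
	Equal(x crypto.PublicKey) bool
}

// selectSigningCert (hypothetical) returns the certificate in the bundle whose
// public key matches signer, plus all other certificates in the bundle.
func selectSigningCert(bundle []byte, signer crypto.Signer) (*x509.Certificate, []*x509.Certificate, error) {
	var match *x509.Certificate
	var rest []*x509.Certificate
	for {
		var block *pem.Block
		block, bundle = pem.Decode(bundle)
		if block == nil {
			break
		}
		if block.Type != "CERTIFICATE" {
			continue
		}
		cert, err := x509.ParseCertificate(block.Bytes)
		if err != nil {
			return nil, nil, err
		}
		pub, ok := cert.PublicKey.(publicKeyEqualer)
		if match == nil && ok && pub.Equal(signer.Public()) {
			match = cert
			continue
		}
		rest = append(rest, cert)
	}
	if match == nil {
		return nil, nil, errors.New("no certificate in the bundle matches the private key")
	}
	return match, rest, nil
}

// selfSignedPEM builds a throwaway self-signed certificate for the demo.
func selfSignedPEM(cn string) ([]byte, *ecdsa.PrivateKey, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: cn},
		NotBefore:             time.Now().Add(-time.Hour),
		NotAfter:              time.Now().Add(24 * time.Hour),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, nil, err
	}
	return pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der}), key, nil
}

func main() {
	signingPEM, signingKey, err := selfSignedPEM("demo-signing-ca")
	if err != nil {
		panic(err)
	}
	otherPEM, _, err := selfSignedPEM("demo-other-ca")
	if err != nil {
		panic(err)
	}
	bundle := append(append([]byte{}, signingPEM...), otherPEM...)
	ca, rest, err := selectSigningCert(bundle, signingKey)
	if err != nil {
		panic(err)
	}
	fmt.Println(ca.Subject.CommonName, len(rest))
}
```

Matching on the key rather than on position in the file avoids depending on the order in which an issuer writes the chain, though a real change would also need to decide how the remaining certificates are distributed to the nodes.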