strimzi-kafka-operator
Python >= 3.13 clients fail to connect with self-signed TLS certs due to VERIFY_X509_STRICT
We are using Strimzi Kafka with authentication.type: tls and self-signed certificates.
Clients running on Python 3.12 and earlier have been able to connect without issues. However, after upgrading to Python 3.13 or later, connection attempts fail with the following error:
[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: Missing Authority Key Identifier (_ssl.c:1020)
This appears to be caused by a change introduced in Python 3.13, where ssl.create_default_context() now includes the VERIFY_X509_STRICT flag by default:
https://docs.python.org/3/whatsnew/3.13.html#ssl
Note: VERIFY_X509_STRICT may reject pre-RFC 5280 or malformed certificates that the underlying OpenSSL implementation might otherwise accept.
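A quick way to confirm whether a given interpreter is affected is to inspect the flags on a default context. This is a minimal sketch using only the standard `ssl` module; the helper name `strict_x509_enabled_by_default` is made up for illustration.

```python
import ssl
import sys


def strict_x509_enabled_by_default() -> bool:
    """Return True if ssl.create_default_context() sets VERIFY_X509_STRICT."""
    ctx = ssl.create_default_context()
    # verify_flags is a bit field; test the strict bit explicitly.
    return bool(ctx.verify_flags & ssl.VERIFY_X509_STRICT)


if __name__ == "__main__":
    # Prints the interpreter version next to the detected default.
    print(sys.version_info[:2], strict_x509_enabled_by_default())
```

Running this under both the old and the new interpreter shows whether the default changed in the environment where the client is deployed.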
A workaround is to disable the flag manually:
import ssl
ctx = ssl.create_default_context()
ctx.verify_flags &= ~ssl.VERIFY_X509_STRICT
However, this is not ideal because it reduces the level of certificate validation. The issue is not related to encryption but to strict RFC 5280 compliance, in particular the absence of an Authority Key Identifier in the CA certificate.
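If the workaround above has to be used in the meantime, it can at least be confined to one helper so that everything else stays at the defaults. The sketch below assumes the cluster CA is available as a PEM file; the function name `build_relaxed_context` and the `cafile` argument are illustrative, not part of any Strimzi or Kafka client API.

```python
import ssl
from typing import Optional


def build_relaxed_context(cafile: Optional[str] = None) -> ssl.SSLContext:
    """Default client context with only VERIFY_X509_STRICT cleared.

    cafile is a hypothetical path to the cluster CA certificate in PEM
    format; when omitted, the system trust store is used.
    """
    ctx = ssl.create_default_context(cafile=cafile)
    # Clear only the strict bit; hostname checking and CERT_REQUIRED stay on.
    ctx.verify_flags &= ~ssl.VERIFY_X509_STRICT
    return ctx
```

The returned context can be handed to any client that accepts a pre-built `ssl.SSLContext`. Chain verification and hostname checking still apply; only the additional strict X.509 checks are skipped.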
As more teams begin migrating to Python ≥3.13, this is becoming a more pressing and widespread issue.
Please consider updating the certificate generation process in Strimzi to produce RFC 5280-compliant certificates, or at least provide an option (e.g., via feature gate) to enable such compliance when needed.
Thanks in advance!
I found out that this problem is caused by the broker certificates, which currently have no Authority Key Identifier (AKI). When Python connects to Kafka, it checks that the broker certificates are signed by the same CA and that the Subject Key Identifier (SKI) of the CA equals the AKI of the broker certificate.
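The relationship described here can be sketched with the standard library alone. The example assumes the key identifier is derived with method 1 from RFC 5280, Section 4.2.1.2 (a SHA-1 hash of the subject public key bits), which is common but not the only allowed derivation. The function names and the sample key bytes are hypothetical.

```python
import hashlib
import hmac
from typing import Optional


def key_identifier(subject_public_key: bytes) -> bytes:
    """SHA-1 over the public key bits (RFC 5280, 4.2.1.2, method 1)."""
    return hashlib.sha1(subject_public_key).digest()


def aki_matches_issuer(leaf_aki: Optional[bytes], issuer_ski: bytes) -> bool:
    """True when the leaf carries an AKI equal to the issuer's SKI."""
    if leaf_aki is None:
        # A missing AKI is the situation reported in this issue.
        return False
    return hmac.compare_digest(leaf_aki, issuer_ski)


if __name__ == "__main__":
    ca_ski = key_identifier(b"hypothetical-ca-public-key-bits")
    print(aki_matches_issuer(ca_ski, ca_ski))  # broker cert with AKI
    print(aki_matches_issuer(None, ca_ski))    # broker cert without AKI
```

In a real certificate the two identifiers live in the `authorityKeyIdentifier` and `subjectKeyIdentifier` extensions; the sketch only models the comparison, not the parsing.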
I had to look into RFC 5280 to get more information about the AKI. AFAIU it is optional rather than mandatory, but your Python client seems to check it by default. I also looked at how the Java clients behave: the Java trust manager appears to run this validation mostly through the truststore (it checks whether the CA used to sign the certificate is in the store and therefore trusted), so the AKI is not used there. Anyway, I am not against adding it if it helps other clients to work. Of course, it must not break compatibility with other clients, but since this validation is optional, they would just skip it without looking at the AKI extension. I also left a comment on the corresponding PR. Finally, I will leave it to the others to share their opinion on this.
If we move forward with this issue and the corresponding PR, we should also check that cert-manager issues certificates that include the AKI. I couldn't find anything explicit in the cert-manager documentation, so it should probably be verified. @katheris, is there anything you already know about this from your work with cert-manager for Strimzi?
I wonder if this should have a feature gate to introduce it gradually, because adding it could also cause issues for someone else. A FG would make it optional at first and give others more time to adjust if needed. I'm not really a big expert on TLS, so I'm not sure how likely it is to cause problems somewhere. But given the number of environments, old clients, Java versions, OS versions etc., it is pretty hard for us to test it.
Guys, @ppatierno @scholzj, you are asking for opposite things. Right now I have a gradual approach: only broker and CC certificates are affected, as @scholzj asks. But @ppatierno suggested applying the AKI to all certificates. Personally I prefer the current way and, if that is fine, I will do a PR for a newer version covering the other certificate types.
I am not sure this change really needs a FG. It adds an AKI field to the servers' certificates, and it is then up to the client side to use this field to validate the issuer. I read that Java, for example, doesn't use it but just leverages the truststore (so the issuer is fine if it's in the truststore). It seems that the Python client validates it, and of course we don't know about other clients. But AFAICS the validation happens on the client side, so having the field helps clients that use it for validation and should not break clients that don't. This is my understanding.
> But @ppatierno suggested to apply AKI to all certificates. Personally I prefer current way and if it is fine,
I was suggesting (on the PR) to add it to the self-signed CA as well (where SKI will be the same as AKI) for consistency but I can live without it.
> I will do a PR for a newer version for other certificate types.
Wdym? Which certificate types are you talking about?
I don't think we are asking for opposite things. I agree with Paolo on doing this for all certificates.
What I suggest (at least for consideration) is that this should be introduced through a feature gate to make sure it is opt-in first, then opt-out and only at the end it is enabled permanently for everyone.
PS: Please keep in mind that sometimes things need to be discussed and agreed on first. This normally happens during the issue triage on the community call.
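The opt-in, then opt-out, then permanent rollout suggested above can be sketched as a small state model. Everything here is illustrative: the stage names, the gate name `UseAuthorityKeyIdentifier`, and the `+Name`/`-Name` override string are assumptions for the example, not an existing Strimzi feature gate.

```python
from enum import Enum


class Stage(Enum):
    ALPHA = "alpha"  # opt-in: disabled unless explicitly enabled
    BETA = "beta"    # opt-out: enabled unless explicitly disabled
    GA = "ga"        # permanently enabled, overrides are ignored


def gate_enabled(name: str, stage: Stage, config: str = "") -> bool:
    """Resolve a gate from its stage and a '+Name,-Name' override string."""
    if stage is Stage.GA:
        return True
    enabled = stage is Stage.BETA
    for item in (part.strip() for part in config.split(",")):
        if item == f"+{name}":
            enabled = True
        elif item == f"-{name}":
            enabled = False
    return enabled


if __name__ == "__main__":
    gate = "UseAuthorityKeyIdentifier"  # hypothetical gate name
    print(gate_enabled(gate, Stage.ALPHA))              # default in opt-in phase
    print(gate_enabled(gate, Stage.ALPHA, f"+{gate}"))  # user opts in
    print(gate_enabled(gate, Stage.BETA, f"-{gate}"))   # user opts out
```

Each phase gives users with older clients or unusual environments a release cycle to report problems before the behaviour becomes the default.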
> Wdym? Which certificate types are you talking about?
I was talking about client certificates. But now I see you suggest adding the AKI to the self-signed CA and making this feature optional. I need some time to investigate how to do this.
> This normally happens during the issue triage on the community call.
Should I wait for the result of the next discussion before proceeding?
@ppatierno I tried to add the AKI to the CA cert, but it doesn't work on Mac. I did some research, and the RFC confirms that omitting it is allowed in this case: “There is one exception; where a CA distributes its public key in the form of a ‘self-signed’ certificate, the authority key identifier MAY be omitted.” (RFC 3280, Section 4.2.1.1)
@o-afanasenko I know but it says "MAY be omitted". Not working on Mac is not a reason for not having it. I can't see Kafka clusters or clients running on Mac in production imho. I would investigate why it's not working on Mac and if there is anything you need to do.
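The rule being debated here (AKI expected on issued certificates, optional on a self-signed CA) can be written down as a small predicate. This is only a sketch of the RFC wording quoted above, not of what any particular TLS library enforces. The `CertInfo` type is hypothetical, and comparing subject and issuer names is a simplification that detects self-issued rather than strictly self-signed certificates.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CertInfo:
    """Hypothetical, simplified view of a certificate."""
    subject: str
    issuer: str
    authority_key_id: Optional[bytes] = None


def aki_requirement_met(cert: CertInfo) -> bool:
    """AKI must be present unless the certificate is self-issued."""
    self_issued = cert.subject == cert.issuer
    return self_issued or cert.authority_key_id is not None


if __name__ == "__main__":
    root = CertInfo("CN=cluster-ca", "CN=cluster-ca")
    broker = CertInfo("CN=broker-0", "CN=cluster-ca")
    print(aki_requirement_met(root))    # self-issued CA, AKI may be omitted
    print(aki_requirement_met(broker))  # issued cert without AKI
```

Under this reading, adding the AKI to the self-signed CA is a consistency choice rather than a requirement, while the broker certificates are the ones that need it.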
> This normally happens during the issue triage on the community call.
> Should I wait for the result of the next discussion before proceeding?
Yes, I think we should first clarify how to proceed with this, so that you don't have to update the PR again and again every time someone has a different opinion. If there is general agreement to use a feature gate, it would likely also require a proposal (https://github.com/strimzi/proposals).
> I know but it says "MAY be omitted". Not working on Mac is not a reason for not having it. I can't see Kafka clusters or clients running on Mac in production imho. I would investigate why it's not working on Mac and if there is anything you need to do.
I guess it depends on what does not work there. Assuming we are talking about OpenSSL adding the AKI to the CA, I would guess that if it does not work on MacOS it won't work on Linux (assuming it is really OpenSSL that is not working and not LibreSSL). If it does work on Linux, then I guess the only concern would be the related unit tests, but I think we skipped them on MacOS before, so we might be able to deal with it the same way again.
However, there are absolutely Kafka clients in production use on MacOS. And likely even much more development. So assuming the AKI CA does not work in any clients on MacOS, I would say it is a blocker and the whole issue would be way more complicated.
> I would guess that if it does not work on MacOS it won't work on Linux
Not a MacOS user, why this assumption?
> However, there are absolutely Kafka clients in production use on MacOS. And likely even much more development. So assuming the AKI CA does not work in any clients on MacOS, I would say it is a blocker and the whole issue would be way more complicated.
AFAIU, the error that @o-afanasenko is facing on MacOS is about generating the self-signed CA certificate with the AKI. So it looks like the Kube cluster is running somewhere on MacOS and the operator gets the error when generating the cert. It's not about clients validating. Also, I'm curious about "there are absolutely Kafka clients in production use on MacOS": where does a statement like this come from? I can only see clients on MacOS during development.
I got an empty AKI on MacOS when generating the CA cert and a test failed, but I don't think this is a blocker, because before() contains a guard that skips all SSL tests on MacOS (I commented it out for local development):
Assumptions.assumeTrue(System.getProperty("os.name").contains("nux"));
Certificate generation is executed on Linux in k8s and is not connected to MacOS or to any clients.
I am trying to add the AKI to the CA certificate, which is redundant IMHO. I just check the test results in CI/CD instead of debugging locally.
When you discuss this problem, I think the main question is where the AKI should be added: to CA certificates, client certificates, or broker certificates only (for broker certs I have already finished, and that part is required to close this issue).
> I would guess that if it does not work on MacOS it won't work on Linux
> Not a MacOS user, why this assumption?
I would expect OpenSSL to do the same on all operating systems it supports.
On the topic of cert-manager, there doesn't seem to be any support currently, but I've asked some of the project members whether they think it's something that might get added in future. I'll report back if an issue for it gets raised.
Triaged on 29.5.2025: @o-afanasenko are you still doing some kind of investigation around this? @katheris did you get some info from the cert-manager community?
This is already supported in upstream cert-manager, at least in 1.15+, for X.509 certificates issued by the in-tree cert-manager Issuers (Self-Signed, CA). For other Issuers, results may vary because cert-manager-controller itself does not generate the certs in those cases.
Quick example to reproduce the field presence on py3.13:
$ oc create -f -
apiVersion: v1
kind: Namespace
metadata:
  name: sandbox
---
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: selfsigned-issuer
spec:
  selfSigned: {}
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: my-selfsigned-ca
  namespace: sandbox
spec:
  isCA: true
  commonName: my-selfsigned-ca
  secretName: root-secret
  privateKey:
    algorithm: ECDSA
    size: 256
  issuerRef:
    name: selfsigned-issuer
    kind: ClusterIssuer
    group: cert-manager.io
---
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: my-ca-issuer
  namespace: sandbox
spec:
  ca:
    secretName: root-secret
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: child-cert
  namespace: sandbox
spec:
  isCA: true
  commonName: service-a.default.svc
  secretName: service-a-cert
  issuerRef:
    name: my-ca-issuer
    kind: Issuer
    group: cert-manager.io
$ oc get secret -n sandbox service-a-cert -o json | jq '.data["tls.crt"]' -r | base64 -d > service-a-cert.pem
$ pyenv local 3.13
$ python3
Python 3.13.2 (main, Jun 3 2025, 00:41:39) [Clang 17.0.0 (clang-1700.0.13.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import ssl
>>> ssl._ssl._test_decode_cert("service-a-cert.pem")
{'subject': ((('commonName', 'service-a.default.svc'),),), 'issuer': ((('commonName', 'my-selfsigned-ca'),),), 'version': 3, 'serialNumber': '8659D93461EFC03235B35D7F57C8120A', 'notBefore': 'Jun 2 19:07:08 2025 GMT', 'notAfter': 'Aug 31 19:07:08 2025 GMT'}
>>> from cryptography import x509
>>> from cryptography.hazmat.backends import default_backend
>>> with open("service-a-cert.pem", "rb") as cert_file:
...     cert_data = cert_file.read()
...
>>> cert = x509.load_pem_x509_certificate(cert_data, default_backend())
>>> aki = cert.extensions.get_extension_for_class(x509.AuthorityKeyIdentifier).value
>>> aki
<AuthorityKeyIdentifier(key_identifier=b'9\xfe\xc9t\xcf\xb4\x8fL\xf2\xd4\xdb\x97\xcf\x1e\xe2\xf5\x9c\xcf\x86\x97', authority_cert_issuer=None, authority_cert_serial_number=None)>
Yes as @swghosh stated, it looks like it's already supported in cert-manager, so no concerns there. Thanks @swghosh
@katheris I tried to add the AKI to CA certificates, but there is no easy way to do it (I would either have to add a lot of code just to support the AKI for the CA, or add the AKI to all certificates). So for now I think this PR is enough to fix the basic problem reported here. What do you mean by "no concerns"? Is my PR ready to be merged or declined?
> What do you mean by "no concerns"? Is my PR ready to merge or to decline?
@o-afanasenko my comment was in reference to whether this would impact the work I am doing to integrate cert-manager with Strimzi. The comments about whether your PR is ready will be added to the pull request directly.
Triaged on 26.6.2025: We should keep it open. @katheris, could you please have a look at the PR from @o-afanasenko? Thanks!
Triaged on 10.7.2025: we agreed that this issue needs a proposal to better discuss whether the AKI is needed just for broker certificates or whether it's better to cover all the certificates in the cluster, including the CA and user ones. It could potentially need a feature gate to enable the AKI, and we should discuss whether the rollout would happen on upgrade or on the next generation of certificates. @o-afanasenko are you still interested in working on it and the proposal?
@ppatierno Yes, I am interested in working on it. @katheris I am not sure how to write a proposal. Could you give me a link with examples, please?
@o-afanasenko you can find all the Strimzi proposals in this repo: https://github.com/strimzi/proposals. You can start writing one from the template, or look at how the others are structured.