confluent-kafka-python
SSL communication failure and segmentation fault in version 2.0.2 and above
Description
I am facing an issue when trying to communicate with Kafka over SSL via the Admin Client.
Configuration: {'bootstrap.servers': 'X.X.X.X:X', 'security.protocol': 'ssl', 'ssl.ca.location': 'ca-cert-path'}
confluent-kafka-python 1.9.2 works perfectly, but the same setup breaks when I upgrade to any higher version: 2.0.2, 2.1.1, 2.2.0, or 2.3.0. It's worth noting that confluent-dotnet versions 2.1.1, 2.2.0, and 2.3.0 all work perfectly with the exact same configuration and certificate.
%3|1702380036.110|FAIL|rdkafka#producer-1| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Broker did not provide a certificate (after 6ms in state SSL_HANDSHAKE)
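One way to check, independently of librdkafka, whether the broker presents a certificate is a plain TLS handshake from Python's standard library. The following is a minimal sketch, not part of the original report: `probe_tls` is a hypothetical helper name, and it uses the OpenSSL build that Python itself links against, which may differ from the one bundled with the confluent-kafka wheel, so a successful probe only rules out the broker side.

```python
import socket
import ssl


def probe_tls(host, port, cafile=None, timeout=5.0):
    """Attempt a TLS handshake and report what the server presented."""
    context = ssl.create_default_context(cafile=cafile)
    # Hostname checking is relaxed because the report connects by IP address;
    # verification of the certificate chain against the CA file stays enabled.
    context.check_hostname = False
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host) as tls:
                cipher_name, _protocol, _bits = tls.cipher()
                return {
                    "ok": True,
                    "tls_version": tls.version(),
                    "cipher": cipher_name,
                    "has_peer_cert": bool(tls.getpeercert()),
                }
    except (ssl.SSLError, OSError) as exc:
        # Covers refused connections, timeouts, and handshake failures.
        return {"ok": False, "error": f"{type(exc).__name__}: {exc}"}
```

Calling `probe_tls(broker_host, 9096, cafile="ca-cert-path")` with the real host and CA path would show the negotiated cipher and whether a peer certificate was received.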
Debug logs: The client reports that the broker didn't provide a certificate, but the same setup works with confluent-kafka-python 1.9.2 and with each confluent-dotnet version mentioned above. I have replaced the actual broker IPs with the keyword broker_ip in the debug logs below.
%7|1702383547.987|BROKER|rdkafka#producer-1| [thrd:app]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Added new broker with NodeId -1
%7|1702383547.987|CONNECT|rdkafka#producer-1| [thrd:app]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Selected for cluster connection: bootstrap servers added (broker has 0 connection attempt(s))
%7|1702383547.987|BRKMAIN|rdkafka#producer-1| [thrd::0/internal]: :0/internal: Enter main broker thread
%7|1702383547.987|BRKMAIN|rdkafka#producer-1| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Enter main broker thread
%7|1702383547.987|CONNECT|rdkafka#producer-1| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Received CONNECT op
%7|1702383547.987|STATE|rdkafka#producer-1| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Broker changed state INIT -> TRY_CONNECT
%7|1702383547.987|CONNECT|rdkafka#producer-1| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: broker in state TRY_CONNECT connecting
%7|1702383547.987|STATE|rdkafka#producer-1| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Broker changed state TRY_CONNECT -> CONNECT
%7|1702383547.987|INIT|rdkafka#producer-1| [thrd:app]: librdkafka v2.3.0 (0x20300ff) rdkafka#producer-1 initialized (builtin.features gzip,snappy,ssl,sasl,regex,lz4,sasl_plain,sasl_scram,plugins,zstd,sasl_oauthbearer,http,oidc, STRIP STATIC_LINKING GCC GXX PKGCONFIG INSTALL GNULD LIBDL PLUGINS ZLIB SSL ZSTD CURL HDRHISTOGRAM SYSLOG SNAPPY SOCKEM SASL_SCRAM SASL_OAUTHBEARER OAUTHBEARER_OIDC CRC32C_HW, debug 0x4002)
%7|1702383547.988|CONNECT|rdkafka#producer-1| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Connecting to ipv4#broker-ip:9096 (ssl) with socket 15
%7|1702383547.988|CONNECT|rdkafka#producer-1| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Connected to ipv4#broker-ip:9096
%7|1702383547.988|STATE|rdkafka#producer-1| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Broker changed state CONNECT -> SSL_HANDSHAKE
%7|1702383547.994|FAIL|rdkafka#producer-1| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Broker did not provide a certificate (after 6ms in state SSL_HANDSHAKE) (_SSL)
%3|1702383547.994|FAIL|rdkafka#producer-1| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Broker did not provide a certificate (after 6ms in state SSL_HANDSHAKE)
%7|1702383547.994|STATE|rdkafka#producer-1| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Broker changed state SSL_HANDSHAKE -> DOWN
%7|1702383547.994|STATE|rdkafka#producer-1| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Broker changed state DOWN -> INIT
%7|1702383548.012|BROKER|rdkafka#producer-2| [thrd:app]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Added new broker with NodeId -1
%7|1702383548.012|BRKMAIN|rdkafka#producer-2| [thrd::0/internal]: :0/internal: Enter main broker thread
%7|1702383548.012|BRKMAIN|rdkafka#producer-2| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Enter main broker thread
%7|1702383548.012|CONNECT|rdkafka#producer-2| [thrd:app]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Selected for cluster connection: bootstrap servers added (broker has 0 connection attempt(s))
%7|1702383548.013|CONNECT|rdkafka#producer-2| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Received CONNECT op
%7|1702383548.013|STATE|rdkafka#producer-2| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Broker changed state INIT -> TRY_CONNECT
%7|1702383548.013|CONNECT|rdkafka#producer-2| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: broker in state TRY_CONNECT connecting
%7|1702383548.013|STATE|rdkafka#producer-2| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Broker changed state TRY_CONNECT -> CONNECT
%7|1702383548.013|INIT|rdkafka#producer-2| [thrd:app]: librdkafka v2.3.0 (0x20300ff) rdkafka#producer-2 initialized (builtin.features gzip,snappy,ssl,sasl,regex,lz4,sasl_plain,sasl_scram,plugins,zstd,sasl_oauthbearer,http,oidc, STRIP STATIC_LINKING GCC GXX PKGCONFIG INSTALL GNULD LIBDL PLUGINS ZLIB SSL ZSTD CURL HDRHISTOGRAM SYSLOG SNAPPY SOCKEM SASL_SCRAM SASL_OAUTHBEARER OAUTHBEARER_OIDC CRC32C_HW, debug 0x4002)
%7|1702383548.013|CONNECT|rdkafka#producer-2| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Connecting to ipv4#broker_ip:9096 (ssl) with socket 19
%7|1702383548.013|CONNECT|rdkafka#producer-2| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Connected to ipv4#broker_ip:9096
%7|1702383548.013|STATE|rdkafka#producer-2| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Broker changed state CONNECT -> SSL_HANDSHAKE
%7|1702383548.014|CREATETOPICS|rdkafka#producer-2| [thrd:main]: CREATETOPICS worker called in state initializing: Success
%7|1702383548.014|ADMIN|rdkafka#producer-2| [thrd:main]: CREATETOPICS: looking up controller
%7|1702383548.014|CONNECT|rdkafka#producer-2| [thrd:main]: Not selecting any broker for cluster connection: still suppressed for 48ms: lookup controller
%7|1702383548.018|FAIL|rdkafka#producer-2| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Broker did not provide a certificate (after 5ms in state SSL_HANDSHAKE) (_SSL)
%3|1702383548.018|FAIL|rdkafka#producer-2| [thrd:ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap]: ssl://cpkafkaittest.kafka.service.consul:9096/bootstrap: Broker did not provide a certificate (after 5ms in state SSL_HANDSHAKE)
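Long debug logs like the one above are easier to compare across client versions when reduced to state transitions and error-level failures. The sketch below assumes the line layout visible in these logs (`%level|timestamp|facility|client| [thrd:...]: message`); the names `parse_log` and `summarize` are hypothetical helpers, and lines that do not start with `%` are treated as hard-wrapped continuations of the previous line, which is also an assumption.

```python
import re

# Layout inferred from the log lines in this report, not from a formal spec.
LOG_RE = re.compile(
    r"^%(?P<level>\d)\|(?P<ts>\d+\.\d+)\|(?P<facility>[A-Z_]+)\|"
    r"(?P<client>[^|]+)\| \[thrd:(?P<thread>.*?)\]: (?P<message>.*)$"
)
STATE_RE = re.compile(r"Broker changed state (\w+) -> (\w+)")


def parse_log(text):
    """Split raw log text into one record (dict) per log line."""
    records = []
    for line in text.splitlines():
        match = LOG_RE.match(line)
        if match:
            records.append(match.groupdict())
        elif records and line.strip():
            # Assumed to be the tail of a hard-wrapped line: glue it back on.
            records[-1]["message"] += line
    return records


def summarize(records):
    """Return (state transitions, error-level FAIL messages)."""
    transitions, failures = [], []
    for rec in records:
        state = STATE_RE.search(rec["message"])
        if rec["facility"] == "STATE" and state:
            transitions.append((rec["client"], state.group(1), state.group(2)))
        elif rec["facility"] == "FAIL" and rec["level"] == "3":
            failures.append((rec["client"], rec["message"]))
    return transitions, failures
```

Running `summarize(parse_log(log_text))` on logs from 1.9.2 and from 2.3.0 would show at which state the two versions diverge.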
I do not suspect the OpenSSL issue mentioned in https://github.com/confluentinc/confluent-kafka-python/issues/1521, as the cipher used is TLS_AES_256_GCM_SHA384, so I don't think this is a weak-cipher issue. Also, confluent-dotnet uses the same librdkafka for versions 2.1.1, 2.2.0, and 2.3.0, which has OpenSSL >3.0, and it works fine there with the same certificate. Just to rule it out, I tried setting ssl.providers=default,legacy, but then I encountered a segmentation fault with every confluent-kafka-python version >=2.0.2:
Python error: Segmentation fault
Current thread 0x00007fdd9b69df00 (most recent call first):
File "/opt/bitnami/python/lib/python3.8/site-packages/confluent_kafka/admin/__init__.py", line 122 in __init__
I saw a related issue, https://github.com/confluentinc/confluent-kafka-python/issues/1547, where it's mentioned that this is fixed, but I still see the same segmentation fault.
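Because a segmentation fault kills the interpreter, it can help to run the crashing call in a child process with `faulthandler` enabled and inspect the exit status from the parent. This is a minimal sketch under that assumption: `run_isolated` and `CRASH_SNIPPET` are hypothetical names, and the placeholder values from the configuration above must be replaced with real ones before the snippet does anything useful.

```python
import subprocess
import sys


def run_isolated(snippet, timeout=60):
    """Run a Python snippet in a child interpreter with faulthandler enabled."""
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", snippet],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stdout, proc.stderr


# Mirrors the failing configuration from this report; placeholders are kept as-is.
CRASH_SNIPPET = """
from confluent_kafka.admin import AdminClient
AdminClient({
    'bootstrap.servers': 'X.X.X.X:X',
    'security.protocol': 'ssl',
    'ssl.ca.location': 'ca-cert-path',
    'ssl.providers': 'default,legacy',
})
"""
```

On POSIX systems a negative return code from `run_isolated(CRASH_SNIPPET)` means the child was terminated by that signal number, and the captured stderr holds the faulthandler traceback.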
Any help is highly appreciated.
How to reproduce
- Use confluent-kafka-python 2.1.1 or a higher version.
- Create an Admin client using the configuration: {'bootstrap.servers': 'X.X.X.X:X', 'security.protocol': 'ssl', 'ssl.ca.location': 'ca-cert-path'}
- Try performing an Admin operation, such as checking whether a topic exists via the validate option (e.g. client.create_topics(topic_name, validate_only=True)), creating a topic, or fetching metadata.
- The certificate error shown in the logs above is encountered.
- Update the configuration to include ssl.providers=default,legacy
- Re-run the logic and the segmentation fault is encountered.
The problem exists only in confluent-kafka-python >=2.0.2. The same setup works fine with confluent-kafka-python 1.9.2 and with confluent-dotnet versions >=2.1.1.
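The reproduction steps above can be sketched as a small script. This is an illustration rather than the exact code from the report: `build_admin_config` and `validate_topic` are hypothetical helper names, the partition and replication values are arbitrary, and `create_topics` is called with a list of `NewTopic` objects, which is what the Admin API expects.

```python
def build_admin_config(bootstrap_servers, ca_location, ssl_providers=None):
    """Build the Admin client configuration used in the steps above."""
    conf = {
        "bootstrap.servers": bootstrap_servers,
        "security.protocol": "ssl",
        "ssl.ca.location": ca_location,
    }
    if ssl_providers:
        # Adding this setting is the step that triggers the reported crash.
        conf["ssl.providers"] = ssl_providers
    return conf


def validate_topic(conf, topic_name, timeout=30):
    """Ask the broker to validate topic creation without creating the topic."""
    # Imported lazily so build_admin_config works without the package installed.
    from confluent_kafka.admin import AdminClient, NewTopic

    client = AdminClient(conf)
    # Partition and replication values are arbitrary choices for this sketch.
    new_topic = NewTopic(topic_name, num_partitions=1, replication_factor=1)
    futures = client.create_topics([new_topic], validate_only=True)
    # create_topics returns a dict of topic name -> future; result() raises on error.
    futures[topic_name].result(timeout=timeout)
```

Calling `validate_topic(build_admin_config('X.X.X.X:X', 'ca-cert-path'), 'some-topic')` with real values corresponds to the certificate-error step, and passing `ssl_providers='default,legacy'` corresponds to the segmentation-fault step.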
Checklist
Please provide the following information:
- [x] confluent-kafka-python and librdkafka version (confluent_kafka.version() and confluent_kafka.libversion()): >=2.0.2
- [x] Apache Kafka broker version: 3.5.1
- [x] Client configuration: {'bootstrap.servers': 'X.X.X.X:X', 'security.protocol': 'ssl', 'ssl.ca.location': 'ca-cert-path'}
- [x] Operating system: Linux Photon
- [x] Provide client logs (with 'debug': '..' as necessary)
- [ ] Provide broker log excerpts
- [x] Critical issue
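Since the behaviour differs by client and interpreter version, the exact versions asked for in the checklist can be collected in one place. The sketch below is an assumption about what is useful to report: `environment_report` is a hypothetical helper, and `ssl.OPENSSL_VERSION` describes the OpenSSL that Python links against, which is not necessarily the one bundled with librdkafka.

```python
import platform
import ssl


def environment_report():
    """Collect interpreter, OpenSSL, and client library versions."""
    report = {
        "python": platform.python_version(),
        "python_openssl": ssl.OPENSSL_VERSION,
        "platform": platform.platform(),
    }
    try:
        import confluent_kafka
    except ImportError:
        # Package not installed in this environment.
        report["confluent_kafka"] = None
        report["librdkafka"] = None
    else:
        # Both calls return a (version string, version int) tuple.
        report["confluent_kafka"] = confluent_kafka.version()[0]
        report["librdkafka"] = confluent_kafka.libversion()[0]
    return report
```

Printing `environment_report()` on both a working and a failing interpreter gives a side-by-side view of what differs.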
Hi @pranavrth
Will you be able to provide some guidance on this?
I'm experiencing a similar issue on Python 3.8.13 when using 'security.protocol': 'SSL' in the producer configuration.
import confluent_kafka
producer = confluent_kafka.Producer({'security.protocol': 'SSL'})
print(producer)
There is no issue with Python 3.9.13, though.
Thanks @WaxWell-Bison for the suggestion. Updating the Python version works for us as well; however, it only works when I use Python >= 3.12.0.
Is this issue still happening?