
APIVERSION_QUERY timeout in first connection with TLSv1.3 kafka broker

Open ZiqianXu opened this issue 1 year ago • 0 comments

Read the FAQ first: https://github.com/confluentinc/librdkafka/wiki/FAQ

Do NOT create issues for questions, use the discussion forum: https://github.com/confluentinc/librdkafka/discussions

Description

When the Kafka broker is restricted to TLSv1.3 with ssl.enabled.protocols=TLSv1.3, librdkafka fails on the first ApiVersionRequest: the request times out in state APIVERSION_QUERY after about 10 seconds, the broker is marked DOWN, and a second connection to the broker has to be initiated.

Logs received from librdkafka through a custom RdKafka::EventCb:

log event: ssl://1.0.0.2:9093/bootstrap: Connected (#1)
log event: ssl://1.0.0.2:9093/bootstrap: Updated enabled protocol features +ApiVersion to ApiVersion
log event: ssl://1.0.0.2:9093/bootstrap: Broker changed state SSL_HANDSHAKE -> APIVERSION_QUERY
log event: Broadcasting state change
log event: ssl://1.0.0.2:9093/bootstrap: Sent ApiVersionRequest (v3, 40 bytes @ 0, CorrId 1)
log event: Topic packet-analysis-sessions metadata information unknown
log event: Topic packet-analysis-sessions partition count is zero: should refresh metadata
log event: Cluster connection already in progress: refresh unavailable topics
log event: Hinted cache of 1/1 topic(s) being queried
...
log event: Topic packet-analysis-sessions metadata information unknown
log event: Topic packet-analysis-sessions partition count is zero: should refresh metadata
log event: Cluster connection already in progress: refresh unavailable topics
log event: Hinted cache of 1/1 topic(s) being queried
log event: Skipping metadata refresh of 1 topic(s): refresh unavailable topics: no usable brokers
log event: Not selecting any broker for cluster connection: still suppressed for 49ms: no cluster connection
log event: ssl://1.0.0.2:9093/bootstrap: Timed out ApiVersionRequest in flight (after 10010ms, timeout #0)
log event: ssl://1.0.0.2:9093/bootstrap: ApiVersionRequest failed: Local: Timed out: probably due to broker version < 0.10 (see api.version.request configuration) (after 10010ms in state APIVERSION_QUERY) (_TRANSPORT)
log event: ssl://1.0.0.2:9093/bootstrap: ApiVersionRequest failed: Local: Timed out: probably due to broker version < 0.10 (see api.version.request configuration) (after 10010ms in state APIVERSION_QUERY)
log event: ssl://1.0.0.2:9093/bootstrap: Updated enabled protocol features -ApiVersion to
log event: ssl://1.0.0.2:9093/bootstrap: Broker changed state APIVERSION_QUERY -> DOWN
log event: Broadcasting state change
error event: 1/1 brokers are down
error event: ssl://1.0.0.2:9093/bootstrap: ApiVersionRequest failed: Local: Timed out: probably due to broker version < 0.10 (see api.version.request configuration) (after 10010ms in state APIVERSION_QUERY)

How to reproduce

In Kafka server.properties:

ssl.protocol=TLS
ssl.enabled.protocols=TLSv1.3
ssl.truststore.location=<path>/kafka.server.truststore.jks
ssl.truststore.password=test1234
ssl.keystore.location=<path>/kafka.server.keystore.jks
ssl.keystore.password=test1234
ssl.key.password=test1234

Note: changing to ssl.enabled.protocols=TLSv1.2 does not trigger the issue, so it seems to be specific to TLSv1.3.
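The client configuration is not included above. As a rough sketch only, client properties along the following lines (passed to RdKafka::Conf::set()) should produce comparable debug output against such a broker. The CA path is a placeholder and the debug contexts are an assumption. The two api.version.* entries restate what I believe are the librdkafka defaults, which would match the roughly 10 second timeout in the logs.

```
# Hypothetical client settings; not taken from the original report.
bootstrap.servers=1.0.0.2:9093
security.protocol=ssl
# Placeholder path to the CA certificate that signed the broker certificate.
ssl.ca.location=<path>/ca-cert.pem
# Debug contexts assumed to cover the state changes and requests seen in the logs.
debug=broker,topic,metadata,feature,protocol,security
# Believed defaults, listed only to make the 10 s ApiVersionRequest timeout explicit.
api.version.request=true
api.version.request.timeout.ms=10000
```

With debug=security enabled, the handshake lines logged before the SSL_HANDSHAKE -> APIVERSION_QUERY transition may help confirm which TLS version was actually negotiated.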

Checklist

IMPORTANT: We will close issues where the checklist has not been completed.

Please provide the following information:

  • [x] librdkafka version (release number or git tag): 2.1.1
  • [x] Apache Kafka version: 2.8.2
  • [x] Operating system: CentOS9

ZiqianXu, Jan 11 '24 23:01