
Make connections.max.idle.ms configurable

Open PSanetra opened this issue 1 year ago • 3 comments

The Azure documentation recommends setting connections.max.idle.ms to 180000: https://learn.microsoft.com/en-us/azure/event-hubs/apache-kafka-configurations#producer-and-consumer-configurations

That option is exposed in the Confluent.Kafka.ClientConfig class via the ConnectionsMaxIdleMs property, but it does not seem possible to configure it through this library.
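
For reference, here is a minimal sketch of how the property would be set when using Confluent.Kafka directly, outside the Functions extension. The broker address and group id are placeholders I made up for illustration; only ConnectionsMaxIdleMs and the 180000 value come from the text above.

```csharp
using Confluent.Kafka;

// Sketch only: this configures the Confluent.Kafka client directly.
// It does not show any setting of the Functions extension itself.
var config = new ConsumerConfig
{
    // Placeholder values (assumptions, not taken from this issue).
    BootstrapServers = "<namespace>.servicebus.windows.net:9093",
    GroupId = "example-group",

    // Corresponds to connections.max.idle.ms; 180000 is the value
    // recommended in the Azure documentation linked above.
    ConnectionsMaxIdleMs = 180000
};
```

ConsumerConfig derives from ClientConfig, so the same property would apply to a ProducerConfig as well.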

PSanetra avatar Dec 06 '22 10:12 PSanetra

@PSanetra Could you please share your use case? We need to understand whether this issue is a blocker for you.

shrohilla avatar Dec 06 '22 15:12 shrohilla

@shrohilla I am not sure anymore, but can you ask someone at Microsoft why they recommend that setting?

PSanetra avatar Dec 06 '22 16:12 PSanetra

This thread provides some context around what this configuration does and why setting it under 180000 is recommended in Azure: https://github.com/confluentinc/confluent-kafka-dotnet/issues/1544.

This thread provides some more background on the issues that were previously experienced with Kafka and timeouts on idle TCP connections: https://github.com/confluentinc/librdkafka/issues/3109

The expected benefit of implementing this in Azure Functions would be fewer error logs of the kind emitted when the Kafka client in the Functions extension tries to communicate over an idle connection that the Azure network has already closed. They commonly look like this and are frequently caused by metadata requests on idle topics: "Libkafka: [thrd:GroupCoordinator]: GroupCoordinator: xxx.servicebus.windows.net:9093: 1 request(s) timed out: disconnect (after 61269172ms in state UP, 1 identical error(s) suppressed)".

These errors don't cause Functions to fail, because the underlying Kafka library simply opens a new connection when they occur. However, implementing this configuration in the Functions extension would reduce the number of noisy timeout logs written by the Kafka client.

Some other related threads and issues: https://github.com/confluentinc/confluent-kafka-dotnet/issues/1544 https://github.com/Azure/azure-functions-kafka-extension/issues/239 https://github.com/Azure/azure-functions-kafka-extension/issues/197

Gebumgar avatar Jan 11 '23 04:01 Gebumgar