azure-functions-kafka-extension

Azure Functions deployed in an AKS cluster re-read old messages from Azure Event Hubs.

Open vickithedeveloper opened this issue 2 years ago • 0 comments

Summary of the issue: our function app, deployed in AKS with the Kafka-based KEDA scaler, repeatedly reads old events, which causes duplicate processing. In some cases it continues to read and process old data even when there is no new input data. Over time this leads to a huge backlog.

Case 1: One message was read and processed 18 times by the function app instance deployed in AKS.

Case 2: In between processing new data, the function app reads and processes data that was enqueued a few days earlier.

In some of the experiments that we conducted, the volume of duplicates went as high as 48%.
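For anyone trying to reproduce the measurement, a duplicate ratio of this kind can be computed from the (partition, offset) pairs logged by the function. The snippet below is only an illustrative sketch, not the script used in the experiments. It assumes that each processed event is logged with its partition and offset, and that the ratio is taken over all processed events. The name `duplicate_ratio` and the sample data are hypothetical.

```python
from collections import Counter
from typing import Iterable, Tuple


def duplicate_ratio(records: Iterable[Tuple[int, int]]) -> float:
    """Return the fraction of processed records that repeat an
    already-seen (partition, offset) pair.

    Assumption: a (partition, offset) pair uniquely identifies an
    event within one topic.
    """
    counts = Counter(records)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Every occurrence beyond the first of a given pair is a duplicate.
    duplicates = total - len(counts)
    return duplicates / total


# Hypothetical log of processed events as (partition, offset) pairs.
processed = [(0, 100), (0, 101), (0, 100), (1, 7), (0, 100)]
print(f"{duplicate_ratio(processed):.0%}")  # prints "40%"
```

Here offset 100 on partition 0 was processed three times, so two of the five processed events count as duplicates.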

The host.json is as follows:

{
  "version": "2.0",
  "extensionBundle": {
    "id": "Microsoft.Azure.Functions.ExtensionBundle",
    "version": "[3.3.0, 4.0.0)"
  },
  "extensions": {
    "kafka": {
      "maxBatchSize": 128,
      "SubscriberIntervalInSeconds": 1,
      "ExecutorChannelCapacity": 1,
      "ChannelFullRetryIntervalInMs": 100,
      "AutoCommitIntervalMs": 20000,
      "AutoOffsetReset": "latest",
      "SocketKeepaliveEnable": "true",
      "prefetchCount": 256,
      "PYTHON_THREADPOOL_THREAD_COUNT": 32
    }
  },
  "logging": {
    "applicationInsights": {
      "samplingSettings": {
        "isEnabled": true,
        "excludedTypes": "Request"
      }
    }
  }
}

The function.json is as follows:

{
  "scriptFile": "__init__.py",
  "bindings": [
    {
      "type": "kafkaTrigger",
      "direction": "in",
      "name": "events",
      "protocol": "SASLSSL",
      "password": "KAFKA_SRC_EVENTHUB_PASSWORD",
      "topic": "eventhubone",
      "authenticationMode": "PLAIN",
      "cardinality": "MANY",
      "dataType": "binary",
      "consumerGroup": "consumergroupname",
      "username": "$ConnectionString",
      "BrokerList": "KAFKA_SRC_EVENTHUB_ENDPOINT"
    },
    {
      "type": "kafka",
      "direction": "out",
      "name": "outputMessage",
      "brokerList": "KAFKA_DEST_EVENTHUB_ENDPOINT",
      "topic": "eventhubtwo",
      "username": "$ConnectionString",
      "password": "KAFKA_DEST_EVENTHUB_PASSWORD",
      "protocol": "SASLSSL",
      "authenticationMode": "PLAIN"
    }
  ]
}

We need your help in solving this issue. We have tried experimenting with different values for the commit interval and the batch size, and all the experiments yield the same result.

vickithedeveloper, Aug 24 '22 09:08