kafkajs
consumer stopped consuming after ERROR [Connection] Connection error: read ECONNRESET
Describe the bug
We're using KafkaJS to consume and produce events to Azure Event Hubs through its Kafka API. We noticed that all of our pods on AKS had stopped consuming after logging the following errors:
ERROR [Connection] Response Fetch(key: 1, version: 6) {"timestamp":"2022-03-17T07:47:18.361Z","logger":"kafkajs","broker":"ESUT1EVENTHUBS01.servicebus.windows.net:9093","clientId":"microservices/claim/consumer","error":"The server experienced an unexpected error when processing the request","correlationId":8074,"size":217}
ERROR [Consumer] Crash: KafkaJSNonRetriableError: The server experienced an unexpected error when processing the request {"timestamp":"2022-03-17T07:47:18.362Z","logger":"kafkajs","groupId":"claim-service-group","stack":"KafkaJSNonRetriableError: The server experienced an unexpected error when processing the request\n at /home/node/app/node_modules/kafkajs/src/retry/index.js:53:18\n at runMicrotasks (
I was wondering whether I should catch the "read ECONNRESET" error and restart the client in my own code, or whether there is something in the library I could leverage. I know the library usually restarts on transient errors before giving up, and eventually throws ECONNRESET when it hits KafkaJSNumberOfRetriesExceeded.
Expected behavior
I expect the library to initiate a restart.
Observed behavior
After raising the error, the client did not crash; it stayed up without any activity (no messages were consumed).
Environment:
OS: alpine:0.11.0
KafkaJS version: 1.15.0
Kafka version: Azure Event Hub PAAS
node:12.22.10-alpine3.15
We saw the same thing on v1.16.0 and were initially listening for the consumer.crash event to start a reconnect. However, that event did not fire reliably for us, and we sometimes saw errors like "Failed to execute listener". We're now listening for the consumer.heartbeat event to check whether the consumer is healthy, and it has been working quite well.
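For anyone wanting to try the heartbeat approach, here is a minimal sketch of what such a check could look like. It is not the exact code from our service: `watchHeartbeat`, `maxSilenceMs`, `onStale` and `now` are hypothetical names made up for this example. It only assumes that the consumer exposes `consumer.on(...)` (which returns a function that removes the listener) and `consumer.events.HEARTBEAT`, as KafkaJS consumers do.

```javascript
// Hypothetical helper: tracks the last consumer.heartbeat event and reports
// the consumer as unhealthy once no heartbeat has been seen for maxSilenceMs.
function watchHeartbeat(consumer, options = {}) {
  const { maxSilenceMs = 30000, onStale = () => {}, now = Date.now } = options;

  // Start the clock at creation so a consumer that never heartbeats is caught too.
  let lastHeartbeat = now();

  // Each heartbeat event carries a millisecond `timestamp`.
  const removeListener = consumer.on(consumer.events.HEARTBEAT, (event) => {
    lastHeartbeat = event.timestamp;
  });

  const isHealthy = () => now() - lastHeartbeat <= maxSilenceMs;

  // Periodically check for silence and hand control to the caller if stale.
  const timer = setInterval(() => {
    if (!isHealthy()) onStale(now() - lastHeartbeat);
  }, maxSilenceMs);
  timer.unref(); // do not keep the process alive just for the watchdog

  return {
    isHealthy,
    stop() {
      clearInterval(timer);
      removeListener();
    },
  };
}
```

What happens in `onStale` is up to the application: it could disconnect and recreate the consumer, or `isHealthy()` could back a liveness probe so the orchestrator restarts the pod. The threshold should be comfortably larger than the configured `heartbeatInterval`, and larger than the longest expected message-processing time if the handler does not send heartbeats itself.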
Any update?
@mguay22 Below is the event I am getting in heartbeat. How can we say whether the connection is healthy or not?
{
id: 2,
type: 'consumer.heartbeat',
timestamp: 1685466252785,
payload: {
groupId: '<group-id>',
memberId: '<member-id>',
groupGenerationId: 1
}
}
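As far as I can tell, the payload above has no health flag in it. The signal is the event itself: it shows that a heartbeat was sent at `timestamp`, so one way to read it is "healthy if the last heartbeat is recent enough". A rough sketch of that idea follows; `looksHealthy` and `maxAgeMs` are hypothetical names for this example, not part of KafkaJS, and the right threshold depends on your own heartbeat and session settings.

```javascript
// Hypothetical check: treat the consumer as healthy if the most recent
// consumer.heartbeat event is no older than maxAgeMs.
function looksHealthy(lastHeartbeatEvent, maxAgeMs, now = Date.now()) {
  if (!lastHeartbeatEvent) return false; // no heartbeat seen yet
  return now - lastHeartbeatEvent.timestamp <= maxAgeMs;
}

// The event from this thread, used as sample input.
const sampleEvent = {
  id: 2,
  type: 'consumer.heartbeat',
  timestamp: 1685466252785,
  payload: { groupId: '<group-id>', memberId: '<member-id>', groupGenerationId: 1 },
};

// 10 seconds after the heartbeat, with a 30 second allowance.
console.log(looksHealthy(sampleEvent, 30000, sampleEvent.timestamp + 10000)); // true
// 60 seconds after the heartbeat, with the same allowance.
console.log(looksHealthy(sampleEvent, 30000, sampleEvent.timestamp + 60000)); // false
```

In practice you would store the latest event from the consumer.heartbeat listener and call the check from a timer or a health endpoint.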
I'm encountering the same issue. Is there any update on this? I am using kafkajs 2.2.4.