zmstone
Hi. I noticed in your logs that the generation number jumped from 39437 to 39439. It could be that generation 39438 had the other node elected as leader but...
yes, try with a larger session timeout.
ah, the sync group request has a hard-coded 5-second limit on waiting for the response. if it takes longer than 5 seconds for the other member to rejoin (typically...
this was done only for the join group request; the same should be done for the sync group request ``` %% send join group request and wait for response %% as long as...
try my branch in the PR?
it very much depends on how big your messages and message sets are, and on how long it takes for the subscriber to complete one batch. currently if it takes longer...
anything different in the logs?
Could you try a different `offset_commit_interval_seconds`? 5 is the same as the heartbeat rate, which makes it hard to tell which request triggered the 'unknown_member_id' error code.
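For reference, a minimal sketch of a group config in which the commit interval differs from the heartbeat rate, so the two requests no longer fire on the same schedule and are easier to tell apart in the logs. The keys are brod group coordinator config options; the values are illustrative assumptions, not recommendations.

```erlang
%% Illustrative values only (assumptions); tune them for your own setup.
GroupConfig =
  [ {heartbeat_rate_seconds, 5}          %% the heartbeat rate mentioned above
  , {offset_commit_interval_seconds, 7}  %% deliberately different from the heartbeat rate
  , {session_timeout_seconds, 30}        %% a larger session timeout, as suggested earlier
  ].
```

This list would be passed as the group config when starting the group subscriber.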
what's your Kafka version btw?
there is actually a (re)join failure log: https://github.com/klarna/brod/blob/3.10.0/src/brod_group_coordinator.erl#L513 If you do not see a log like the one below, it actually joined the group successfully ``` failed to join group reason ......