librdkafka
Revert 8e20e1ee (#4117) to fix hang in destruction of groupconsumer
We observed that destroying a groupconsumer would often hang waiting for the broker thread to exit. We tediously bisected the problem to the specific commit 8e20e1ee (the last commit before the v2.0.0rc1 tag). Only then did we find that a lot of people on GitHub were already complaining about that commit as introducing a resource leak: the commit adds a call to rd_kafka_toppar_keep that bumps the refcount of the toppar, and I don't immediately see a corresponding rd_kafka_toppar_destroy anywhere.
Reverting 8e20e1ee (as in this commit) does fix the hang in groupconsumer destruction which we were observing, so we've applied this patch to our downstream library.
Fixes #4486.
Hello @Quuxplusone, thanks for investigating this issue. The solution isn't to revert the commit: as you can see, there were failing tests that it fixed.
The rd_kafka_toppar_destroy is usually called here.
But that happens when the op is destroyed, so maybe there are cases where the BARRIER op isn't destroyed. I have found a similar refcnt issue in test 0113, subtest n_wildcard, but it happens only sporadically, and in that test a topic is deleted. Does it happen to you when a topic is deleted too?
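To make the suspected failure mode concrete, here is a minimal, self-contained sketch of the pattern under discussion. It is not librdkafka code: every name (toy_toppar_t, toy_op_t, toy_run and so on) is hypothetical, and the assumption that the op takes its reference when it is created and releases it only when it is destroyed is taken from the description above, not from the actual source.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted toppar; NOT librdkafka's type. */
typedef struct toy_toppar {
    int refcnt;
    int freed; /* set instead of calling free() so callers can inspect it */
} toy_toppar_t;

/* Take a reference (plays the role of a "keep" call). */
static toy_toppar_t *toy_toppar_keep(toy_toppar_t *tp) {
    tp->refcnt++;
    return tp;
}

/* Drop a reference; the object is released when the count reaches zero. */
static void toy_toppar_destroy(toy_toppar_t *tp) {
    assert(tp->refcnt > 0);
    if (--tp->refcnt == 0)
        tp->freed = 1;
}

/* Hypothetical op that holds a toppar reference for its whole lifetime. */
typedef struct toy_op {
    toy_toppar_t *tp;
} toy_op_t;

static toy_op_t *toy_op_new(toy_toppar_t *tp) {
    toy_op_t *op = malloc(sizeof(*op));
    if (!op)
        abort();
    op->tp = toy_toppar_keep(tp); /* reference taken here ... */
    return op;
}

static void toy_op_destroy(toy_op_t *op) {
    toy_toppar_destroy(op->tp);   /* ... and released only here */
    free(op);
}

/* Returns 1 if the toppar was released once its owner dropped its own
 * reference, 0 if a reference was still outstanding. */
static int toy_run(int destroy_op) {
    toy_toppar_t tp = { .refcnt = 1, .freed = 0 }; /* owner's reference */
    toy_op_t *op = toy_op_new(&tp);

    if (destroy_op)
        toy_op_destroy(op);       /* normal path: op releases its reference */

    toy_toppar_destroy(&tp);      /* owner drops its reference */
    int freed = tp.freed;

    if (!destroy_op)
        free(op); /* reclaim memory only; the reference is never released */
    return freed;
}
```

In this sketch, toy_run(1) reports that the toppar was released, while toy_run(0) leaves one reference outstanding. Anything that waits for the count to reach zero would then wait forever, which is the shape of hang described in this PR.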