
infinite Loop and Error

Open andysim3d opened this issue 8 years ago • 5 comments

Hey,

I am developing a tiny tool with the rdkafka-dotnet library. It works fine in most situations.

But I find an issue:

If I set the broker list to a non-existent broker and try to produce any message, it creates a task that gets trapped in an infinite loop trying to produce the message. Meanwhile, it raises the OnError event.

I can't terminate the task inside the OnError event handler, nor throw any exception from it, because my application would crash with an unhandled user exception. But if I don't handle it, it keeps trying to produce messages to that non-existent broker.

I think in Topic.cs we could set a maximum retry count (e.g. 3) so it won't get trapped in an infinite loop when a non-existent broker is assigned to the rdkafka-dotnet client.

andysim3d avatar Sep 21 '16 19:09 andysim3d

Hi Andrew, thanks for your bug report and pull request!

Produce will keep trying until message.timeout.ms passes. It's 5 minutes by default; I don't remember if that includes retries. So the maximum try time already exists, it's just a bit longer than you expect by default.

Could you check if that works for you? Maybe reduce message.timeout.ms if you don't want to wait that long.
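For reference, here is a minimal sketch of lowering that timeout. This assumes the rdkafka-dotnet TopicConfig indexer and a Producer.Topic overload that accepts a TopicConfig; the broker address and topic name are placeholders, and this requires a running broker to actually deliver anything:

```csharp
using System;
using System.Text;
using RdKafka;

class TimeoutExample
{
    static void Main()
    {
        // message.timeout.ms is a topic-level librdkafka setting; lowering it
        // makes undeliverable messages fail after ~1s instead of 5 minutes.
        var topicConfig = new TopicConfig();
        topicConfig["message.timeout.ms"] = "1000";

        using (var producer = new Producer("localhost:9092"))
        using (var topic = producer.Topic("test-topic", topicConfig))
        {
            // The returned Task should fault with a delivery error
            // once the timeout expires instead of retrying forever.
            topic.Produce(Encoding.UTF8.GetBytes("hello")).Wait();
        }
    }
}
```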

In the future there will also be a flag that will let you handle send errors yourself, see https://github.com/ah-/rdkafka-dotnet/issues/52

ah- avatar Sep 21 '16 22:09 ah-

Hi Ah-

message.timeout.ms only applies to messages the producer has already enqueued. In PR #63 you can see I replaced the infinite loop with a maximum retry count.

The problem is not caused by the message.timeout.ms configuration (and I just tried it; it does not solve my problem). It's because I use null as the broker, so its buffer size is 0. Once I try to produce a message, it calls into librdkafka. After checking the librdkafka code, I found it treats the "NOBUF" condition as "_QUEUE_FULL" (see here: https://github.com/edenhill/librdkafka/blob/8ac96f78756aa765e2263df8da5f93072aeb0552/src/rdkafka.c#L427 ). And in your logic, the produce method never exits the while loop while the error code is _QUEUE_FULL, so in my scenario the producing thread gets trapped in an infinite while loop.
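To illustrate the failure mode, here is a self-contained sketch (not the actual Topic.cs source): librdkafka maps the NOBUF condition to _QUEUE_FULL, so a loop that retries unconditionally on _QUEUE_FULL never exits when no broker buffer exists. A bounded retry count breaks the cycle:

```csharp
using System;

// QueueFull stands in for librdkafka's _QUEUE_FULL error code.
enum ErrorCode { NoError, QueueFull }

class ProduceLoopSketch
{
    // Stand-in for the native produce call; with a non-existent broker
    // (zero-capacity buffer) it reports QueueFull on every attempt.
    static ErrorCode TryProduce() => ErrorCode.QueueFull;

    static void Main()
    {
        const int maxRetries = 3; // a bounded retry count, as proposed
        int attempts = 0;
        while (TryProduce() == ErrorCode.QueueFull)
        {
            attempts++;
            if (attempts >= maxRetries)
            {
                Console.WriteLine($"Giving up after {attempts} attempts");
                break; // without this bound, the loop spins forever
            }
        }
    }
}
```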

I submitted PR #63 to fix this bug; please take a look.

andysim3d avatar Sep 22 '16 00:09 andysim3d

Hey Ah

My colleagues left it running with the timeout for several hours, but it was still looping.

I also tried changing message.timeout.ms to 1000 (1 second), but it still didn't work.

I'll check tomorrow whether the flag works for me. Thanks for your reply.


andysim3d avatar Sep 22 '16 03:09 andysim3d

Hi, thanks for checking. I'm busy today but will have a closer look tomorrow.

ah- avatar Sep 22 '16 06:09 ah-

Hi, from 0.9.2-ci-177 onwards you can use topic.Produce(data, blockIfQueueFull: false); to tell RdKafka that you would like to handle a full local queue yourself. In that case it will throw an RdKafkaException with error _QUEUE_FULL and not deliver the message.
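A sketch of how that flag might be used, assuming the RdKafka Producer/Topic API from this library; the broker address and topic name are placeholders, and the exact exception details should be checked against the release:

```csharp
using System;
using System.Text;
using RdKafka;

class QueueFullExample
{
    static void Main()
    {
        using (var producer = new Producer("localhost:9092"))
        using (var topic = producer.Topic("test-topic"))
        {
            try
            {
                // blockIfQueueFull: false makes a full (or zero-capacity)
                // local queue surface immediately as an exception instead
                // of blocking or looping inside Produce.
                topic.Produce(Encoding.UTF8.GetBytes("hello"),
                              blockIfQueueFull: false);
            }
            catch (RdKafkaException e)
            {
                // Error code should be _QUEUE_FULL; decide here whether
                // to retry later, drop the message, or alert.
                Console.WriteLine("Local queue full: " + e.Message);
            }
        }
    }
}
```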

ah- avatar Nov 09 '16 23:11 ah-