librdkafka

Cannot consume when a topic has more than 3 partitions: "Broker: Not coordinator" error

Open mdorier opened this issue 8 months ago • 4 comments

I am not sure whether this is a librdkafka issue or a Kafka issue but (1) I'm using the default server.properties file from Kafka and (2) everything works fine when using Kafka's kafka-console-consumer.sh instead of librdkafka, so I suspect a problem with librdkafka (or with the way I use it).

I set up one Kafka broker (version 2.13-3.8.0) using Kafka's config/kraft/server.properties file, without changing anything in it. In a first application, I use librdkafka to create a topic with 4 partitions (replication factor 1) and produce 1 million events in each partition.

In a second application, I then use librdkafka to create a consumer and consume from partition 0. In practice I will have as many consumers as partitions, but I get the same problem regardless of the number of consumers or which partition I consume from, so let's keep it simple. The consumer application does the following:

  • Create a configuration and set "bootstrap.servers", "group.id", "enable.auto.commit" (to "false"; changing this doesn't affect the result), and "auto.offset.reset" (to "earliest"; leaving this unset doesn't affect the result either);
  • Create a consumer instance using rd_kafka_new(RD_KAFKA_CONSUMER, conf, nullptr, 0);
  • Call rd_kafka_poll_set_consumer(consumer);
  • Use rd_kafka_topic_partition_list_new, rd_kafka_topic_partition_list_add, and rd_kafka_assign to assign partition 0 to the consumer;
  • Start a loop of rd_kafka_message_t *msg = rd_kafka_consumer_poll(consumer, 100);

This works fine when the topic is set up with 1, 2, or 3 partitions, but with 4 partitions, the first message received has its "err" field set, with this error message:

Failed to fetch committed offsets for 0 partition(s) in group "my_consumer": Broker: Not coordinator

Then all subsequent calls to rd_kafka_consumer_poll return NULL.

Here is what I tried so far:

  • Using kafka-console-consumer.sh --bootstrap-server [...] --topic my_topic --partition 0 --offset earliest works fine, so I can consume from my topic's partition 0 using Kafka's kafka-console-consumer.sh;
  • Since kafka-console-consumer.sh was telling me I cannot specify both a --group and a --partition, I tried removing the "group.id" from the config in my C++ code, but I'm getting an error in the messages saying that the group ID hasn't been specified;
  • I also tried not calling rd_kafka_assign, to see if the consumer would get assigned to any partition(s) by default, but I'm not receiving any message (rd_kafka_consumer_poll returns NULL).

Again, the code works fine with 1, 2, and 3 partitions, but not with 4. Any idea what can be happening?

mdorier, Feb 07 '25 11:02