Concurrent messages consumption for different topics in Kafka microservice

Open sirmonin opened this issue 9 months ago • 3 comments
Is there an existing issue that is already proposing this?

  • [x] I have searched the existing issues

Is your feature request related to a problem? Please describe it

This has been previously discussed in https://github.com/nestjs/nest/issues/12703. Basically, a kafkajs consumer calls its eachMessage handler for all messages sequentially, even if the consumer is subscribed to multiple topics. Nest.js creates only one consumer per microservice, which drastically limits Kafka throughput for message handlers that use async logic.

E.g. a consumer is subscribed to topic-A and topic-B. The topic-A message handler makes an async request that takes 1 second to resolve. The topic-B message handler has no async logic and resolves fast. If you have 1000 messages in each topic, consumption of topic-A messages can completely block consumption of topic-B messages, even though the two are supposed to be independent.
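To put rough numbers on this, here is a toy model of the scenario above. It does not use kafkajs; it only adds up handler durations under the two scheduling schemes. The 1 ms cost for the fast handler and the worst-case ordering (all topic-A messages fetched ahead of topic-B) are assumptions made for illustration.

```typescript
// Toy model, not kafkajs: computes when the last message of each topic
// finishes, in milliseconds of simulated time.
type Msg = { topic: string; costMs: number };

// One consumer: each handler must finish before the next message starts,
// regardless of which topic the message belongs to.
function finishTimesSingleConsumer(batch: Msg[]): Map<string, number> {
  const lastFinish = new Map<string, number>();
  let clock = 0;
  for (const msg of batch) {
    clock += msg.costMs;
    lastFinish.set(msg.topic, clock);
  }
  return lastFinish;
}

// One consumer per topic: each topic advances on its own clock.
function finishTimesPerTopicConsumer(batch: Msg[]): Map<string, number> {
  const clocks = new Map<string, number>();
  for (const msg of batch) {
    clocks.set(msg.topic, (clocks.get(msg.topic) || 0) + msg.costMs);
  }
  return clocks;
}

// Assumed worst case: 1000 slow topic-A messages (1000 ms each) are queued
// ahead of 1000 fast topic-B messages (assumed 1 ms each).
const batch: Msg[] = [
  ...Array.from({ length: 1000 }, (): Msg => ({ topic: "topic-A", costMs: 1000 })),
  ...Array.from({ length: 1000 }, (): Msg => ({ topic: "topic-B", costMs: 1 })),
];

const single = finishTimesSingleConsumer(batch);
const perTopic = finishTimesPerTopicConsumer(batch);

console.log("single consumer, topic-B done at", single.get("topic-B"), "ms");
console.log("per-topic consumers, topic-B done at", perTopic.get("topic-B"), "ms");
```

Under these assumptions topic-B finishes after 1,001,000 ms with a single consumer and after 1,000 ms with its own consumer, while topic-A takes 1,000,000 ms either way.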

Describe the solution you'd like

In kafkajs, you're supposed to create a consumer for each separate topic to ensure proper per-topic concurrency. Each created consumer should be subscribed to a single topic. I suggest using a map of consumer instances, keyed by topic, instead of a single consumer instance.
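A minimal sketch of the map-of-consumers idea follows. The `ConsumerLike` and `ConsumerFactory` types are local stand-ins modeled loosely on the kafkajs calls `connect()`, `subscribe({ topic })` and `run({ eachMessage })`; they are not imported from kafkajs, and `startPerTopicConsumers` is a hypothetical helper, not part of Nest. Giving each topic its own group id (`baseGroupId-topic`) is also an assumption of this sketch.

```typescript
// Local stand-in types (assumption): only the calls this sketch needs.
type Handler = (payload: {
  topic: string;
  partition: number;
  message: { value: unknown };
}) => Promise<void>;

interface ConsumerLike {
  connect(): Promise<void>;
  subscribe(opts: { topic: string }): Promise<void>;
  run(opts: { eachMessage: Handler }): Promise<void>;
}

type ConsumerFactory = (opts: { groupId: string }) => ConsumerLike;

// Hypothetical helper: creates one consumer per topic and returns them
// in a map keyed by topic name.
async function startPerTopicConsumers(
  createConsumer: ConsumerFactory,
  baseGroupId: string,
  handlers: Record<string, Handler>,
): Promise<Map<string, ConsumerLike>> {
  const consumers = new Map<string, ConsumerLike>();
  for (const [topic, handler] of Object.entries(handlers)) {
    // Assumption: a separate group id per topic, so that each consumer
    // subscribes to exactly one topic within its own group.
    const consumer = createConsumer({ groupId: `${baseGroupId}-${topic}` });
    await consumer.connect();
    await consumer.subscribe({ topic });
    await consumer.run({ eachMessage: handler });
    consumers.set(topic, consumer);
  }
  return consumers;
}
```

With kafkajs, the factory could plausibly be `(opts) => kafka.consumer(opts)` for an existing `Kafka` instance, possibly behind a thin adapter if the types do not line up exactly. Each consumer then runs its own fetch loop, so a slow handler on one topic no longer holds up the others.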

Teachability, documentation, adoption, migration strategy

At the moment all logic is centered around that single consumer, and so are the methods that work with it. The major change is that methods such as getConsumer() from the context would require a topic or correlationId argument in order to look up the consumer in the map.

As an alternative, if each consumer has its own context, getConsumer() can simply be bound to that particular consumer, which would not require any additional API changes.
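The two options could look roughly like this. All names here (`ConsumerHandle`, `SharedContext`, `TopicBoundContext`) are hypothetical and only illustrate the shape of the change; none of them is the current Nest API.

```typescript
// Hypothetical placeholder for whatever consumer object the context exposes.
interface ConsumerHandle {
  id: string;
}

// Option 1: one shared context; getConsumer() needs a topic argument
// to pick the right consumer out of the map.
class SharedContext {
  private readonly consumers: Map<string, ConsumerHandle>;

  constructor(consumers: Map<string, ConsumerHandle>) {
    this.consumers = consumers;
  }

  getConsumer(topic: string): ConsumerHandle {
    const consumer = this.consumers.get(topic);
    if (!consumer) {
      throw new Error(`No consumer registered for topic "${topic}"`);
    }
    return consumer;
  }
}

// Option 2: one context per consumer; getConsumer() keeps its
// argument-free signature because the context is already bound.
class TopicBoundContext {
  private readonly topic: string;
  private readonly consumer: ConsumerHandle;

  constructor(topic: string, consumer: ConsumerHandle) {
    this.topic = topic;
    this.consumer = consumer;
  }

  getTopic(): string {
    return this.topic;
  }

  getConsumer(): ConsumerHandle {
    return this.consumer;
  }
}
```

Option 2 keeps existing handler code that calls getConsumer() unchanged, at the cost of creating one context object per consumer.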

What is the motivation / use case for changing the behavior?

I have a project that heavily relies on Kafka. Most of my topics have fast handlers, but a couple of handlers perform async operations and are therefore much slower. The fast handlers get blocked while the slow handlers simply wait for a response.

I saw that there were plans to migrate to node-rdkafka. However, the problem is that kafkajs still has many more features than any librdkafka client. I use the kafkajs administration features, in particular partition reassignment, which is not available in other Kafka client libraries. It is therefore too early to give up on kafkajs; there is simply no match for it.

sirmonin avatar Feb 09 '25 18:02 sirmonin

You can read topics concurrently with a scheme in which M consumers handle N topics.

Sample:

https://github.com/nestjs/nest/issues/11298#issuecomment-1513409737

This is not a pretty solution, but I think it is the one available to you right now.

maxbronnikov10 avatar Feb 10 '25 22:02 maxbronnikov10

Hi, @kamilmysliwiec!

I’d like to understand if this issue is acknowledged as a problem that needs to be addressed. Do you plan to look into it yourself?

I’m open to investigating and working on a fix, but I’d rather not spend time on it if it’s likely to be considered irrelevant or won’t be accepted. Let me know your thoughts!

sirmonin avatar Feb 25 '25 09:02 sirmonin

Hi, @sirmonin

I think you’ll find this comment useful as a reference:

https://github.com/nestjs/nest/issues/12703#issuecomment-2272861098

gkdis6 avatar Mar 07 '25 14:03 gkdis6