'High' CPU Usage when Consumers/Producers have an invalid topic name with no explicit error

Open Insomniak47 opened this issue 3 years ago • 2 comments

Description

Version: Confluent.Kafka 1.7.0

Context: We're running a number of consumer threads on lean containers on k8s with registered retry topics. One of our services had a space in the retry topic's name. With ~30 topics registered and no events being processed or created, we were using >700 millicores. I noticed the space and corrected it, which brought the idle usage down to zero. I've reproduced it locally, though my computer is much better specced than the EC2 boxes we're using for EKS, so the effect looks smaller there.

With 20 producers and 28 consumers, my desktop (24 vcores) saw 7% utilization with no events being processed, vs 0% when the space was removed. I'm not 100% sure about the confounding factors, since in a minimal test app it's between 1-4% in release mode.

This is only really a problem because the only error that surfaces is "topic does not exist" (on publish), though I figure the CPU usage might be due to the cost of the error handling loop as well. Just wanted to bring it to your attention.
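Since the client gave no explicit error for the bad name, a cheap guard before subscribing or producing would have caught it at startup. Below is a minimal sketch, not part of Confluent.Kafka: `IsValidTopicName` is a hypothetical helper and the topic names are made up. It assumes the broker-side naming rules (ASCII letters, digits, `.`, `_` and `-`, at most 249 characters, and not `.` or `..`).

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not a Confluent.Kafka API): checks a name against the
// assumed Kafka topic naming rules before it is handed to a consumer/producer.
static bool IsValidTopicName(string name)
{
    // Reject null/empty names and names over the 249 character limit.
    if (string.IsNullOrEmpty(name) || name.Length > 249) return false;

    // "." and ".." are reserved even though they only use legal characters.
    if (name == "." || name == "..") return false;

    // Only ASCII letters, digits, '.', '_' and '-'. \z anchors at the true end
    // of the string, so a trailing newline is rejected as well.
    return Regex.IsMatch(name, @"^[a-zA-Z0-9._-]+\z");
}

// Made-up topic names: the second one has the same kind of stray space.
foreach (var topic in new[] { "orders.retry", "orders retry" })
{
    Console.WriteLine($"'{topic}' valid: {IsValidTopicName(topic)}");
}
```

Calling something like this on every configured topic name at service startup, and failing fast when it returns false, turns a silent idle CPU cost into an immediate configuration error.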

How to reproduce

Create a number of consumers subscribed to topics with a space in the name, then observe the difference in CPU usage.

Neither has space (lots of consumers/producers): CONSUMER_FALSE_PRODUCER_FALSE

Consumer only has space: CONSUMER_TRUE_PRODUCER_FALSE

Both have space: CONSUMER_TRUE_PRODUCER_TRUE

Checklist

Please provide the following information:

  • [ ] A complete (i.e. we can run it), minimal program demonstrating the problem. No need to supply a project file.
  • [x] Confluent.Kafka nuget version.
  • [ ] Apache Kafka version.
  • [x] Client configuration.
  • [ ] Operating system.
  • [ ] Provide logs (with "debug" : "..." as necessary in configuration).
  • [ ] Provide broker log excerpts.
  • [ ] Critical issue.

Insomniak47 • Sep 21 '21 14:09

Can you reproduce this with Debug: "all" and provide the logs?

edenhill • Sep 21 '21 19:09

@Insomniak47 This has been inactive for a while, not sure if later versions of the library have helped.

nhaq-confluent • Mar 12 '24 11:03