jaeger
[Bug]: Kafka partition lag grows unevenly across partitions between the Jaeger collector and the ingester
What happened?
I have a Jaeger collector sending traces to the Jaeger ingester through a Kafka topic with 3 partitions, each consumed by one replica of the ingester. Whenever I generate load on the system, the Kafka consumer lag grows, which is expected. The problem is that the lag on one partition grows much faster than on the others.
Attaching an image of the lag trend for each partition below.
Steps to reproduce
- Send traces from the Jaeger collector to the ingester via Kafka.
- Make sure the Kafka topic consumed by the ingester has multiple partitions.
- Monitor the lag using Kafka exporter.
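To make the monitoring step concrete, here is a minimal sketch of how per-partition lag and its unevenness could be quantified. It assumes lag is the log end offset minus the consumer group's committed offset for each partition. The `PartitionOffsets` type, the `lag_skew` helper, and the sample offsets are hypothetical and only illustrate the calculation; they are not taken from Kafka exporter or from my cluster.

```python
from dataclasses import dataclass


@dataclass
class PartitionOffsets:
    """Offsets for one partition (hypothetical container for illustration)."""
    partition: int
    log_end_offset: int    # latest offset written to the partition
    committed_offset: int  # last offset committed by the consumer group


def partition_lag(p: PartitionOffsets) -> int:
    # Lag is the number of messages written but not yet committed as consumed.
    return max(p.log_end_offset - p.committed_offset, 0)


def lag_skew(parts: list[PartitionOffsets]) -> tuple[dict[int, int], float]:
    """Return the lag per partition and the ratio of the largest lag to the mean lag.

    A ratio close to 1.0 means lag is spread evenly; a larger ratio means
    one partition is falling behind more than the others.
    """
    if not parts:
        raise ValueError("at least one partition is required")
    lags = {p.partition: partition_lag(p) for p in parts}
    mean = sum(lags.values()) / len(lags)
    skew = max(lags.values()) / mean if mean > 0 else 1.0
    return lags, skew


# Made-up sample values, not measurements from this report.
sample = [
    PartitionOffsets(partition=0, log_end_offset=1000, committed_offset=900),
    PartitionOffsets(partition=1, log_end_offset=1000, committed_offset=950),
    PartitionOffsets(partition=2, log_end_offset=5000, committed_offset=1000),
]
lags, skew = lag_skew(sample)
print(lags, round(skew, 2))
```

Tracking this ratio over time next to the raw lag values should show whether the imbalance stays constant or keeps widening under load.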
Expected behavior
I expect the lag to grow uniformly across partitions instead of it being unevenly distributed.
Relevant log output
No response
Screenshot
Additional context
No response
Jaeger backend version
v1.57.0
SDK
No response
Pipeline
OTEL Collector --> Jaeger Collector --> Kafka --> Jaeger Ingester --> Clickhouse
Storage backend
Clickhouse
Operating system
Linux
Deployment model
Kubernetes
Deployment configs
No response