strimzi-kafka-operator

Kafka Exporter reporting some wrong values when no data traffic

Open ppatierno opened this issue 5 years ago • 7 comments

The Kafka Exporter dashboard we currently provide bases its main fields on a specific exporter metric, kafka_consumergroup_current_offset, which the exporter does not expose until there is real data exchange between clients (having a consumer joined to a consumer group is not enough; a producer sending messages is needed as well). This means that when the user looks at the dashboard while no data is being exchanged, it looks like this:

[Screenshot-20200302154210-1904x970: Kafka Exporter dashboard with its fields showing 0]

It means that even if you have topics (and therefore partitions), as long as no data is exchanged between clients the values stay at 0, which doesn't reflect the real state of the cluster (the same goes for replicas, in-sync replicas, ...). I guess this dashboard was designed to work on its own, without relying on metrics exported by the Kafka brokers, so that even if you create a Kafka resource without a metrics section you still get some metrics from the Kafka Exporter dashboard. Was that done on purpose? I think we should rely on the integration with Kafka metrics and use some broker metrics to fill fields like topics, replicas and so on, in order to avoid these 0 values in the exporter dashboard. Of course, this means the exporter dashboard works properly only if you deploy the Kafka resource with metrics enabled.

ppatierno • Mar 02 '20 16:03

@scholzj @tombentley @stanlyDoge wdyt?

ppatierno • Mar 02 '20 16:03

I do not think this should be changed. This shows the view from the consumers' perspective. If you haven't consumed a single message with a given consumer group from a given partition, I do not think it needs to care about it. IIRC it also provides additional information which is not available from the Kafka metrics themselves, which is what makes the top row valuable and why it was added in the first place. So you can either mix information from different sources, which is a bad idea, or remove it completely and thus lose some important information.

Also, from my experience, this has nothing to do with no data being exchanged at a given moment. You just haven't consumed a single message for long enough that your __consumer_offsets topic is empty. So this is not how it looks when your consumer crashes and stops consuming data. This is the state when you basically haven't consumed a single message.

scholzj • Mar 02 '20 16:03

But if you have 10 topics with some partitions in your cluster and you go to the Kafka Exporter dashboard, you see 0 topics and 0 partitions, which is actually wrong. Those values are populated only once you start exchanging data, so that the metric kafka_consumergroup_current_offset is exposed (the exporter doesn't expose this metric until some data is exchanged). Of course the values stay there even when you don't consume data for a long time, but they only show up in the first place once you start exchanging data. Tbh I see an inconsistency: I know that I have 10 topics in the cluster and, for example, 40 partitions in total, but the dashboard shows 0.
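To make the mismatch concrete, here is a minimal Python sketch, assuming the dashboard derives its topic count from the distinct `topic` labels on kafka_consumergroup_current_offset samples. The helper `topics_seen` and both scrape bodies are hypothetical and hand-written for illustration; they are not real exporter output and not the dashboard's actual query.

```python
import re

# Matches a labelled sample line in Prometheus text exposition format,
# e.g. metric_name{label="value",...} 42
SAMPLE_RE = re.compile(r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def topics_seen(exposition: str, metric: str) -> set:
    """Hypothetical helper: distinct `topic` labels carried by `metric`."""
    topics = set()
    for line in exposition.splitlines():
        match = SAMPLE_RE.match(line.strip())
        if match and match.group("name") == metric:
            labels = dict(LABEL_RE.findall(match.group("labels")))
            if "topic" in labels:
                topics.add(labels["topic"])
    return topics


METRIC = "kafka_consumergroup_current_offset"

# Assumed scrape body before any consumer group has committed an offset:
# the metric of interest is simply absent.
idle_scrape = """\
kafka_brokers 3
"""

# Assumed scrape body after one group has consumed from one topic.
active_scrape = idle_scrape + """\
kafka_consumergroup_current_offset{consumergroup="billing",partition="0",topic="orders"} 42
"""

print(len(topics_seen(idle_scrape, METRIC)))    # 0, however many topics exist
print(len(topics_seen(active_scrape, METRIC)))  # 1, only the consumed topic
```

Under that assumption, the count reflects the topics that consumer groups have committed offsets for, not the topics that exist in the cluster.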

ppatierno • Mar 02 '20 16:03

Triaged on 2.8.2022: @ppatierno will recheck if this issue still exists.

scholzj • Aug 02 '22 15:08

Just tested with Strimzi 0.30.0 and the "issue" is still there. My point is the same as in my latest comment: inconsistency. You know you have topics on the cluster, but because no data has been exchanged the dashboard shows 0, due to how and when Kafka Exporter exposes the values through its metrics. Anyway, this seems to be how Kafka Exporter works: it doesn't expose these metrics until some data is flowing.

ppatierno • Aug 04 '22 09:08

@ppatierno How much does this have in common with #6137?

scholzj • Aug 04 '22 11:08

Tbh I didn't see any timeout when a topic had been created but there was no __consumer_offsets topic, but I certainly didn't see any metrics (so a blank dashboard). Maybe we could consider them related in the sense that without __consumer_offsets in place, Kafka Exporter seems to be "unstable".

ppatierno • Aug 04 '22 12:08

Triaged on 18.8.2022:

  • We should improve the docs to note the Kafka Exporter behavior. @PaulRMellor and @ppatierno will have a look at it.
  • We should check if the Grafana dashboard can be modified to not show 0 when there are no metrics but show something like n/a. @mimaison and @ppatierno will have a look.
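
The second item comes down to telling "no series returned" apart from "a series whose value is 0". A minimal Python sketch of that distinction follows; `panel_value` is a hypothetical helper written for illustration, not Grafana code, and it assumes the panel shows the sum of the returned samples.

```python
from typing import Optional, Sequence


def panel_value(samples: Optional[Sequence[float]]) -> str:
    """Hypothetical rendering rule for a single-stat panel.

    An empty or missing result means the metric was not exposed at all,
    so it is rendered as "n/a" instead of being collapsed to 0.
    """
    if not samples:
        return "n/a"
    return str(int(sum(samples)))


print(panel_value([]))      # n/a  (metric absent)
print(panel_value([0.0]))   # 0    (metric present, genuinely zero)
print(panel_value([4, 4]))  # 8
```

In Grafana itself this substitution would be configured on the panel rather than coded; the sketch only shows the intended behavior.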

scholzj • Aug 18 '22 14:08