strimzi-kafka-operator

[Bug]: Kafka Exporter Grafana dashboard too long URL error

Open jesusmah opened this issue 6 months ago • 1 comments

Bug Description

When there are more than a handful of topics and consumer groups (around 30 of each is enough), the Kafka Exporter Grafana dashboard does not work because the API call executed behind the panels results in a URL that is too long:

image

The reason is that selecting the "All" option for consumer groups and topics includes every one of them explicitly as a parameter in the URL:

image

Instead, the regular expression behind those panels should be something like ".*" so that it does not explicitly write all consumer groups and topics as parameters in the URL.
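To illustrate the difference in size, here is a minimal Python sketch comparing the URL-encoded query for an "All" selection that is expanded into every value against the same query using ".*". The metric name kafka_consumergroup_lag, the label names, the (a|b|c) expansion format, and the topic and group names are assumptions made for illustration; they are not taken from the dashboard in question.

```python
from urllib.parse import urlencode


def build_query(groups_matcher: str, topics_matcher: str) -> str:
    """Build a PromQL expression with regex matchers (metric and labels are assumed)."""
    return (
        "sum(kafka_consumergroup_lag{"
        f'consumergroup=~"{groups_matcher}",topic=~"{topics_matcher}"'
        "}) by (consumergroup, topic)"
    )


def expand_all(values: list[str]) -> str:
    """Mimic an 'All' selection that lists every value as a regex alternation."""
    return "(" + "|".join(values) + ")"


# Hypothetical names: 30 topics and 30 consumer groups, as in the steps to reproduce.
topics = [f"orders-service-topic-{i:03d}" for i in range(30)]
groups = [f"orders-service-consumer-group-{i:03d}" for i in range(30)]

# Query string with every value written out explicitly.
expanded = urlencode({"query": build_query(expand_all(groups), expand_all(topics))})

# Query string with a catch-all regular expression instead.
wildcard = urlencode({"query": build_query(".*", ".*")})

print(f"expanded: {len(expanded)} characters")
print(f"wildcard: {len(wildcard)} characters")
```

The expanded form grows with every topic and consumer group, and a dashboard issues one such request per panel, while the ".*" form stays the same size no matter how many exist.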

The workaround I managed to put together was to duplicate the dashboard, remove consumer group and topic as dashboard variables, and remove them from the regular expressions behind the panels. Without those parameters, the API call retrieves the data for all consumer groups and topics. However, this means keeping a duplicate dashboard just to show all consumer groups and topics.

Steps to reproduce

  1. Import the Kafka Exporter Grafana dashboard for a Strimzi cluster that contains around 30 topics and 30 consumer groups.
  2. Select "All" for consumer groups and topics variables at the top of the dashboard (I think this is the default)
  3. See the errors in the dashboard.

Expected behavior

I would expect the dashboard to work fine and display the data it is meant to regardless of the number of consumer groups and topics just like the dashboard I managed to create as a workaround.

Strimzi version

0.36.1

Kubernetes version

OpenShift 4.12.18

Installation method

Operator

Infrastructure

VMware

Configuration files and logs

No response

Additional context

No response

jesusmah avatar Jan 10 '24 09:01 jesusmah

Triaged on a community call on 11.1.2024: This needs to be investigated further, and we would need to find a better solution than having two dashboards.

scholzj avatar Jan 11 '24 08:01 scholzj

I can work on it if help is still needed

maciej-tatarski avatar Mar 12 '24 09:03 maciej-tatarski

If you need help with Grafana, @maciej-tatarski is the man 💯

daniel-orlov avatar Mar 12 '24 09:03 daniel-orlov

I checked it with Grafana v10.3.3 and Prometheus v2.50.1 with 1000+ topics, and the issue doesn't seem to appear: image

If it is still appearing, I would suggest changing the default value for "All" to .* instead of leaving it blank, like: image.

What version of components (grafana, prometheus) are you using @jesusmah?

Should I try with an older Grafana version? I also think we should bring all the dashboards up to date, as there are plenty of deprecated graph types in use.

maciej-tatarski avatar Mar 12 '24 13:03 maciej-tatarski

> I also think we should bring all the dashboards up to date, as there are plenty of deprecated graph types in use.

I think we want to make sure the dashboards work with the Grafana versions with permissive open-source licenses.

scholzj avatar Mar 12 '24 14:03 scholzj

You mean pre AGPL?

maciej-tatarski avatar Mar 12 '24 14:03 maciej-tatarski

Yes. I think that was Grafana 7? 8? Something like that.

scholzj avatar Mar 12 '24 14:03 scholzj

Sure I'll test it tomorrow with grafana 8 (https://grafana.com/licensing/)

maciej-tatarski avatar Mar 12 '24 14:03 maciej-tatarski

I guess the link suggests 8 is already AGPL, so we would ideally keep things working with 7.

scholzj avatar Mar 12 '24 14:03 scholzj

@scholzj I tested it with Grafana 7.5.7 and I think the most straightforward fix is to just change allValue to .* as mentioned in the PR https://github.com/strimzi/strimzi-kafka-operator/pull/9817
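For anyone who wants to try the allValue change on an exported copy of a dashboard, here is a rough Python sketch. It assumes the dashboard JSON keeps its variables under templating.list, each with name, includeAll and allValue fields, and that the variables are named consumergroup and topic. Those names and the sample dashboard are illustrative assumptions, and this is not the content of the PR.

```python
import copy
import json


def set_all_value(dashboard: dict, variable_names: set[str], all_value: str = ".*") -> dict:
    """Return a copy of the dashboard with allValue set on the named variables."""
    patched = copy.deepcopy(dashboard)
    for variable in patched.get("templating", {}).get("list", []):
        # Only touch variables that actually offer an "All" option.
        if variable.get("name") in variable_names and variable.get("includeAll"):
            variable["allValue"] = all_value
    return patched


# Hypothetical, heavily trimmed dashboard model used only for illustration.
example = {
    "title": "Example dashboard",
    "templating": {
        "list": [
            {"name": "consumergroup", "includeAll": True, "allValue": ""},
            {"name": "topic", "includeAll": True, "allValue": None},
            {"name": "kubernetes_namespace", "includeAll": False},
        ]
    },
}

patched = set_all_value(example, {"consumergroup", "topic"})
print(json.dumps(patched["templating"]["list"], indent=2))
```

With allValue set, selecting "All" should send the single regular expression rather than the full list of values, which is the behaviour discussed above.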

maciej-tatarski avatar Mar 12 '24 14:03 maciej-tatarski

That sounds good. I can also confirm that it seems to work in my environment as well.

scholzj avatar Mar 12 '24 15:03 scholzj

I am using Grafana 7.5.12 and Prometheus 2.39.1. Thanks.

jesusmah avatar Mar 12 '24 17:03 jesusmah