Update RabbitMQ Dashboards to support latest Grafana versions

Open coro opened this issue 3 years ago • 0 comments

Proposed Changes

Grafana unofficially supports the current and the previous major version (n-1); at the time of writing, the latest release is 9.0.6. This PR does the following:

  • Updates minimum Grafana versions on the dashboards to 8.3.4
    • This is not 8.0.0 because a handful of bugs in earlier releases prevented some fields from working
  • Migrates any dashboard panels on deprecated types to their new types
    • Graph (old) -> Time Series or Histogram (beta)
    • Table (old) -> Table
  • Fixes the distribution link graphs to work in Grafana 8 and 9
    • There was a bug in how the y-axis was rendered, meaning the entries for a peer were conflated
    • e.g. A->B and C->B were shown in the same row, so only half of the true number of links was displayed
  • Fixes the RabbitMQ-Overview graph not displaying correctly on Grafana 9
    • Several graphs had the decimal option for the y-axis set to -1, which causes console errors in Grafana 9
  • Vastly simplifies the process of updating the dashboards
    • The dashboards stored in this repo now match the ones we upload to Grafana one-to-one, rather than requiring us to inject fields like __inputs unnecessarily
    • The dashboards are now exported directly from Grafana using its Export functionality
    • Also adds a README.md with a proposed process for future updates
  • Tweaks the 'latency distribution' graphs to use histogram formats instead
    • This improves their readability compared with the old graph style
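
As a rough illustration of the checks implied by the list above, here is a minimal Python sketch that scans an exported dashboard JSON for the three things this PR cleans up: deprecated panel types, `decimals` set to -1, and a leftover `__inputs` field. It is not part of this PR. The function name `lint_dashboard` is hypothetical, and the panel type identifiers `graph` and `table-old` are my assumption of how Graph (old) and Table (old) appear in the JSON, so verify them against your own exports.

```python
# Assumption: type identifiers Grafana uses for the deprecated panels.
DEPRECATED_PANEL_TYPES = {"graph", "table-old"}


def iter_panels(panels):
    """Yield every panel, including panels nested inside row panels."""
    for panel in panels:
        yield panel
        yield from iter_panels(panel.get("panels", []))


def walk(node, path=""):
    """Yield (path, key, value) for every key in nested dicts and lists."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            yield child, key, value
            yield from walk(value, child)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from walk(item, f"{path}[{index}]")


def lint_dashboard(dashboard):
    """Return a list of human-readable problems found in a dashboard dict."""
    problems = []
    # Dashboards in the repo should no longer carry injected __inputs.
    if "__inputs" in dashboard:
        problems.append("__inputs is present at the top level")
    # Panels still using a deprecated type need migrating.
    for panel in iter_panels(dashboard.get("panels", [])):
        if panel.get("type") in DEPRECATED_PANEL_TYPES:
            title = panel.get("title", "<untitled>")
            problems.append(f"panel {title!r} uses deprecated type {panel['type']!r}")
    # A decimals value of -1 anywhere is the setting that broke Grafana 9.
    for path, key, value in walk(dashboard):
        if key == "decimals" and value == -1:
            problems.append(f"{path} is set to -1")
    return problems


if __name__ == "__main__":
    # Tiny made-up dashboard for demonstration only.
    sample = {
        "panels": [
            {"title": "Old graph", "type": "graph", "yaxes": [{"decimals": -1}]},
            {"title": "New graph", "type": "timeseries"},
        ]
    }
    for problem in lint_dashboard(sample):
        print(problem)
```

To run it against a real file, load the export with `json.load` and pass the resulting dict to `lint_dashboard`; an empty list means none of the three issues were found.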

Types of Changes

What types of changes does your code introduce to this project? Put an x in the boxes that apply

  • [ ] Bug fix (non-breaking change which fixes issue #NNNN)
  • [ ] New feature (non-breaking change which adds functionality)
  • [ ] Breaking change (fix or feature that would cause an observable behavior change in existing systems)
  • [ ] Documentation improvements (corrections, new content, etc)
  • [x] Cosmetic change (whitespace, formatting, etc)
  • [ ] Build system and/or CI

Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask on the mailing list. We're here to help! This is simply a reminder of what we are going to look for before merging your code.

  • [ ] I have read the CONTRIBUTING.md document
  • [ ] I have signed the CA (see https://cla.pivotal.io/sign/rabbitmq)
  • [ ] I have added tests that prove my fix is effective or that my feature works
  • [ ] All tests pass locally with my changes
  • [ ] If relevant, I have added necessary documentation to https://github.com/rabbitmq/rabbitmq-website
  • [ ] If relevant, I have added this change to the first version(s) in release-notes that I expect to introduce it

Further Comments

I have tested this on Grafana versions 8.3.4 and 9.0.6, both through the make targets in the repo and on a separate Kubernetes cluster running the kube-prometheus stack.

This is part of a wider change to update the dashboards in this issue. The next step is to migrate the metrics from aggregated metrics to the more accurate global metrics as per https://github.com/rabbitmq/rabbitmq-server/pull/3127#issuecomment-1153617314

coro avatar Aug 05 '22 16:08 coro

I think it is safe to merge this one. The README script works, there are no errors on the dashboards, they look clean, and the native Grafana export produces exactly the same JSON as in the repo (ignoring the version bump, of course).

ikavgo avatar Aug 24 '22 19:08 ikavgo

@mergifyio backport v3.11.x v3.10.x v3.9.x

michaelklishin avatar Aug 25 '22 01:08 michaelklishin

backport v3.11.x v3.10.x v3.9.x

✅ Backports have been created

mergify[bot] avatar Aug 25 '22 01:08 mergify[bot]