grafana-dashboards-kubernetes
Cluster variable
Thanks for the very nice dashboards.
One thing that seems to be missing is a `cluster` variable. With multiple clusters, it is useful to limit the scope to a single cluster: a multi-select variable accepting *All*, with the queries adding `cluster=~"$cluster"`.
Hi @tesharp, you can achieve this by using the Datasource variable; just configure each of your clusters as datasources.
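For reference, a datasource variable in a flat dashboard JSON looks roughly like the sketch below (abbreviated field set; Grafana adds more fields when you save). Each Prometheus datasource you configure, one per cluster, then shows up in that dropdown:

```json
{
  "name": "datasource",
  "label": "Datasource",
  "type": "datasource",
  "query": "prometheus"
}
```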
Hi @dotdc, first off, I really think this is great stuff. Came here from your Medium post, and I'm glad some of us are showing that FR people can do stuff too :)
Anyway, even though you're right about datasources, there is the use case of people running a Prometheus federation, or tools like Thanos (even Grafana Mimir, but I haven't tried it), to achieve global monitoring of several clusters from a single point. In the case of Thanos, which I'm familiar with, all the data ends up in the same "bucket", so our Grafana instances usually point to one datasource to visualize several clusters.
I was about to open an issue before finding this one. I'm keen to propose a PR adding a cluster variable and let you decide whether or not you want to integrate it, wdyt?
Thank you @cebidhem!
I understand the need and also used the `$cluster` variable in the past.
The problem is that the `cluster` label is not available by default; I guess you have a custom rule that adds it.
If I add the cluster variable and label to my dashboards, they will not work without it, and that will break the dashboards for most users.
I don't think you can create a PromQL query with an optional label. If any of you know a way of doing this, I'm really interested. If you have other ideas to solve this, we can discuss them here.
That makes complete sense. I'll probably fork the project and add the variable on our side, at least for now.
I'd be happy to include this feature if you make a tool or script that injects the necessary variables and labels into the dashboards.
This way, everyone will benefit from your work, and it will also be easier for you to keep up to date.
We could use GitHub Actions to publish a set of new dashboards, maybe prefixed with `multicluster-`, with a `cluster` variable in the queries.
That way, it would be possible to have both types of dashboards in the same repo, and even publish both versions on Grafana. But at the very least, they would be available here to be consumed raw by a Grafana configuration.
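To make the idea concrete, such a workflow could look something like this sketch; the script path and the auto-commit action are illustrative choices, not something this repo ships:

```yaml
# Hypothetical workflow: regenerate the multicluster- dashboards on every push.
name: generate-multicluster-dashboards
on:
  push:
    branches: [master]
jobs:
  generate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Hypothetical generator script (see the Python sketch further down this thread).
      - run: python3 scripts/inject_cluster_variable.py
      # Example third-party action that commits the regenerated files back.
      - uses: stefanzweifel/git-auto-commit-action@v5
        with:
          commit_message: "chore: regenerate multicluster dashboards"
```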
I've never really worked with Actions as I'm mostly using GitLab, but I could give it a try.
I guess a quite simple working solution could be a shell script doing some `sed`.
That was the idea: if you create a tool that can inject the labels & variables into the dashboards, I will make a special release for them.
@dotdc how about using jsonnet? I know that https://github.com/grafana/grafonnet-lib doesn't support Grafana 9, but I can try to build the dashboard generation on bare jsonnet, without grafonnet.
I can prepare the changes if you are fine with the approach.
Hi @k1rk,
The only requirement I have is that the source must remain flat JSON Grafana dashboards. If you can provide a script, in any language, that generates a copy of the dashboards and injects the missing cluster label into them, that would be the perfect solution for me.
Let me know.
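To sketch what such a script could look like, here is a minimal Python version. Everything in it is an assumption rather than this repo's tooling: the `dashboards/` layout, the `multicluster-` output prefix borrowed from the idea above, the exact fields of the template variable, and the deliberately naive brace-based query rewrite:

```python
#!/usr/bin/env python3
"""Hypothetical sketch: copy each dashboard and inject a `cluster` variable."""
import json
import re
from pathlib import Path

# Example template variable; a real dashboard may need more fields.
CLUSTER_VAR = {
    "name": "cluster",
    "label": "Cluster",
    "type": "query",
    "query": "label_values(kube_node_info, cluster)",
    "refresh": 2,
}

def add_cluster_matcher(expr: str) -> str:
    # Naive sed-style injection: put the matcher after every opening selector
    # brace, e.g. up{job="x"} -> up{cluster=~"$cluster", job="x"}.
    # Bare metric names (no braces) would still need separate handling.
    return re.sub(r"\{", '{cluster=~"$cluster", ', expr)

def walk(node):
    # Recursively rewrite every "expr" field (the PromQL of each panel target).
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "expr" and isinstance(value, str):
                node[key] = add_cluster_matcher(value)
            else:
                walk(value)
    elif isinstance(node, list):
        for item in node:
            walk(item)

for src in Path("dashboards").glob("k8s-*.json"):
    dash = json.loads(src.read_text())
    # Prepend the cluster variable to the dashboard's template list.
    dash.setdefault("templating", {}).setdefault("list", []).insert(0, CLUSTER_VAR)
    walk(dash)
    out = src.with_name(f"multicluster-{src.name}")
    out.write_text(json.dumps(dash, indent=2) + "\n")
    print(f"wrote {out}")
```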
Did anyone manage to script this? Would love to add the functionality for multi-cluster support.
Excellent dashboards btw, thanks!
I have the same issue too.
Hi @keith-e-munro,
I don't think anyone did it yet, but we can discuss the topic further if you're willing to do it.
I personally use one datasource per cluster, even when using Thanos, so I don't have to rely on the cluster variable.
> Thank you @cebidhem! I understand the need and also used the `$cluster` variable in the past. The problem is that the `cluster` label is not available by default [...] I don't think you can create a PromQL query with an optional label, if any of you know a way of doing this, I'm really interested.
First of all, very nice set of dashboards 👏.
Not sure if I have all the context, but non-existing labels should not break the query, at least not in recent versions of Prometheus, e.g. > 2.45.0 (I could not find whether this was a fix at some point). So adding a variable with an empty default value and a label matcher like `cluster="$cluster"` to the dashboard queries will work, since the metric value is still returned when the label is not defined (`{cluster=""}` matches it).
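A quick way to check this behavior yourself; a sketch assuming a Prometheus reachable on localhost:9090 whose series carry no `cluster` label:

```python
import json
import urllib.parse
import urllib.request

# An empty-string matcher also matches series that lack the label entirely,
# which is what makes an optional $cluster variable possible.
params = urllib.parse.urlencode({"query": 'up{cluster=""}'})
url = f"http://localhost:9090/api/v1/query?{params}"
with urllib.request.urlopen(url) as resp:
    for series in json.load(resp)["data"]["result"]:
        print(series["metric"], series["value"])
```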
This is really interesting @mihaico, just tested on 2.46.0 and it works! I'm pretty sure it wasn't the case before, so I would be really interested to know when it changed, or if I just didn't see the elephant in the room :sweat_smile:
We could pick this solution if the change is old enough, otherwise I would wait a little because 2.45.0 will probably be a bit too recent for most users.
I am not sure whether this is related, but I am experiencing an issue where the dashboard picks up non-Kubernetes nodes (in this case, an OpenWrt node-exporter). It also messes up metrics like CPU usage, averaging them together as if they came from a k8s node.
It incorrectly shows the other, non-Kubernetes node.
As I use kube-prometheus-stack, I am adding the OpenWrt node-exporter with the Prometheus Operator CRDs. Could it be that it is picked up as a k8s node because it is defined as a service endpoint?
```yaml
kind: Service
apiVersion: v1
metadata:
  name: openwrt-prometheus-metrics
  labels:
    app: openwrt-prometheus-metrics
spec:
  type: ClusterIP
  ports:
    - name: metrics
      port: 9100
      targetPort: 9100
---
kind: Endpoints
apiVersion: v1
metadata:
  name: openwrt-prometheus-metrics
subsets:
  - addresses:
      - ip: 10.1.20.1
    ports:
      - name: metrics
        port: 9100
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: openwrt-servicemonitor
spec:
  selector:
    matchLabels:
      app: openwrt-prometheus-metrics
  endpoints:
    - port: metrics
      interval: 30s
      path: /metrics
  namespaceSelector:
    any: true
```
Hi @AndrisJrs,
There is no differentiating label for Kubernetes nodes here. I would recommend running a dedicated Prometheus instance for your cluster.
In order to keep the issue on topic, please open a dedicated issue if you want to discuss it further.
@dotdc I retested this on Prometheus 2.0.0 (`docker run --rm -ti -p9090:9090 prom/prometheus:v2.0.0`) and `cluster=""` still works there. From my point of view, this should unblock the feature request.
Sorry for the delay, I know this has been asked for quite some time.
@jkroepke thank you for this input, I think it's safe enough to give it a shot. I'll convert the global view and share it at the beginning of next week so you can test it and let me know if it works on your various setups.
Just added the cluster variable to `k8s-views-global.json` in this PR: https://github.com/dotdc/grafana-dashboards-kubernetes/pull/78
If anyone has time to try it, please let me know if it works on your side.
I gave some feedback on the PR.
Thank you @jkroepke, just made the corresponding change :+1: Let me know if you notice anything else.
Just did the namespaces, nodes and pods views (https://github.com/dotdc/grafana-dashboards-kubernetes/pull/82). I'll wait for more feedback before doing the same for the remaining dashboards.
How can I add this additional label in Prometheus Operator? `cluster` isn't a default label.
Hi @AndrisJrs, you can add this label using `static_configs` or `relabel_configs`, but only if you need it.
This addition was made for Thanos users (or similar) using a single datasource for multiple clusters (global queries).
All PromQL queries should work without this label, let me know if it's not the case on your setup.
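For a concrete starting point, here is a minimal sketch of both options for a plain Prometheus scrape config; the job name, target, and cluster value are placeholders:

```yaml
scrape_configs:
  - job_name: example-node
    static_configs:
      - targets: ["10.1.20.1:9100"]
        labels:
          cluster: my-cluster   # static label attached to every target of this job
    # Alternatively, set the label with relabeling:
    relabel_configs:
      - target_label: cluster
        replacement: my-cluster
```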
Hi @dotdc,
I looked into #82. Since multi-value is dropped, the query can be optimized to `cluster="$cluster"` instead of `cluster=~"$cluster"`. Not sure if it really has an impact.
True, made the change in #84
Hi @dotdc, thanks for the tip. It works fine without the label. I wanted to see whether it would fix the non-Kubernetes node-exporters leaking into the dashboard, as I reported earlier in this thread. Works perfectly after adding relabelings. Thank you 👍
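For reference, one way to express such a relabeling with the Prometheus Operator CRDs is on the ServiceMonitor endpoint. This is a hypothetical sketch, not necessarily the exact fix used here, and the `openwrt` value is a placeholder:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: openwrt-servicemonitor
spec:
  selector:
    matchLabels:
      app: openwrt-prometheus-metrics
  endpoints:
    - port: metrics
      relabelings:
        # Give the external exporter its own cluster value, so it no longer
        # matches dashboards filtered with cluster="$cluster".
        - targetLabel: cluster
          replacement: openwrt
```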
Just merged https://github.com/dotdc/grafana-dashboards-kubernetes/pull/90 with the missing dashboards. :tada:
Please open a new issue if you see any bugs related to this.
Thank you all for your ideas!
:tada: This issue has been resolved in version 1.1.0 :tada:
The release is available on GitHub.
Your semantic-release bot :package::rocket: