redpanda
grafana dashboards: don't require namespace to be "kafka"
There's nothing else in the dashboards requiring that metric to have this label.
This broke that part of the dashboard for a redpanda cluster we deployed in a namespace != "kafka".
Initially introduced in 68e21f3edd2d48f07030bf7f4680c1572ae0ada4.
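For illustration, the fix amounts to dropping the hardcoded label matcher from dashboard queries of roughly this shape (a sketch, not the literal dashboard JSON; the metric name is the one discussed below, and the count() aggregation is illustrative):

```promql
# before: only series whose redpanda-internal namespace label is "kafka" are counted
count(vectorized_storage_log_partition_size{namespace="kafka"})

# after: the panel works regardless of what the namespace label carries
count(vectorized_storage_log_partition_size)
```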
Is there any concern that this will now count all partitions in the cluster, including those from internal storage partitions?
Namespace is the k8s namespace, no?
I don't think so. I think the kafka namespace being changed in this PR is the namespace we use internally in our prometheus metrics labels. @twmb @0x5d am I way off base here?
@dotnwat is right, this is an internal label. We could look into changing the label's name if it causes friction with kubernetes deployments.
That said, I guess the quickest solution is to rename the label in the prometheus scrape config to something different (or to change the k8s namespace label, but I'm sure that one is used more in most clusters :) )
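Something along these lines in the scrape config could do the rename (an untested sketch against a plain Prometheus scrape_configs entry; the target address and the kafka_namespace label name are just examples):

```yaml
scrape_configs:
- job_name: redpanda
  static_configs:
  - targets: ['redpanda-0:9644']  # illustrative; wherever redpanda exposes /metrics
  metric_relabel_configs:
  # Copy the redpanda-internal "namespace" label to a non-clashing name...
  - source_labels: [namespace]
    target_label: kafka_namespace
  # ...then drop the original so it can't collide with the k8s pod namespace label.
  - action: labeldrop
    regex: namespace
```

Dashboards would then have to reference kafka_namespace instead, which is exactly the compatibility concern raised below.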
> This broke that part of the dashboard for a redpanda cluster we deployed in a namespace != "kafka".
@flokli can you provide some detail on how it broke and why this fixes it?
> That said, I guess the quickest solution is to rename the label in the prometheus scrape config to something different (or to change the k8s namespace label, but I'm sure that one is used more in most clusters :) )
Yeah, I'd say namespace is pretty much a reserved word; in k8s environments it's used for the namespace of the pod being scraped, at least when using grafana-agent-operator and prometheus-operator. From the config there:
```yaml
relabel_configs:
- source_labels:
  - job
  target_label: __tmp_prometheus_job_name
- action: keep
  regex: app-foo
  source_labels:
  - __meta_kubernetes_pod_label_name
- action: keep
  regex: http
  source_labels:
  - __meta_kubernetes_pod_container_port_name
- source_labels:
  - __meta_kubernetes_namespace
  target_label: namespace
- source_labels:
  - __meta_kubernetes_service_name
  target_label: service
- source_labels:
  - __meta_kubernetes_pod_name
  target_label: pod
- source_labels:
  - __meta_kubernetes_pod_container_name
  target_label: container
- regex: (.+)
  replacement: $1
  source_labels:
  - __meta_kubernetes_pod_label_app
  target_label: app
- regex: (.+)
  replacement: $1
  source_labels:
  - __meta_kubernetes_pod_label_name
  target_label: name
- replacement: default/app-foo-metrics
  target_label: job
- replacement: http
  target_label: endpoint
- action: hashmod
  modulus: 1
  source_labels:
  - __address__
  target_label: __tmp_hash
- action: keep
  regex: 0
  source_labels:
  - __tmp_hash
```
There are a lot of kubernetes-specific dashboards out there that also rely on that convention, so it's not likely to change any time soon.
If the dashboard needs to filter vectorized_storage_log_partition_size to only get those series, maybe renaming that label inside redpanda to something that doesn't clash with these label names (kafka_namespace, maybe) would be the right call?
Instead of changing it in the source code we could rename it via relabelling in the prometheus scrape config, yes. But the redpanda-operator doesn't install the scrape configs (only the other helm chart does), and people running redpanda outside k8s who scrape redpanda metrics manually wouldn't be aware of any relabelling they need to do, so their dashboards would end up looking different.
This should probably be changed in redpanda itself.
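Concretely, a rename inside redpanda like the one suggested above would change the exposed series along these lines (the topic/partition labels and the value are made up for illustration; only the namespace label is the point):

```
# today
vectorized_storage_log_partition_size{namespace="kafka",topic="foo",partition="0"} 1048576
# after renaming the internal label
vectorized_storage_log_partition_size{kafka_namespace="kafka",topic="foo",partition="0"} 1048576
```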
Thanks for the detail, @flokli. This is definitely out of my area of expertise, so I'll defer further to @0xdiba @twmb @0x5d. I guess my only concern is whether we need a plan for upgrading already-deployed grafana dashboards after the namespace label changes.
Any update?
Hey @flokli, we are discussing what a migration could look like and how breaking a change like this would be.
Grafana dashboards are now maintained in the redpanda-data/observability repo.