
Wrong template variable in some prometheus cluster rules

Open Wain13 opened this issue 1 year ago • 2 comments

https://github.com/cloudnative-pg/charts/blob/fd5eff94b986797100155ad4638555cda3fb5823/charts/cluster/prometheus_rules/cluster-offline.yaml#L7

https://github.com/cloudnative-pg/charts/blob/fd5eff94b986797100155ad4638555cda3fb5823/charts/cluster/prometheus_rules/cluster-ha-critical.yaml#L7

https://github.com/cloudnative-pg/charts/blob/fd5eff94b986797100155ad4638555cda3fb5823/charts/cluster/prometheus_rules/cluster-ha-warning.yaml#L7

All of the above rules reference the cluster name using {{ $labels.job }}, which is not expanded in the file and therefore renders as a blank value when the alert fires. They expand correctly if changed to {{ .namespace }}/{{ .cluster }}, in line with the other Prometheus rules.

Wain13, Jun 21 '24 22:06

That's odd, because .labels is provided here:

https://github.com/cloudnative-pg/charts/blob/fd5eff94b986797100155ad4638555cda3fb5823/charts/cluster/templates/prometheus-rule.yaml#L11-L29

itay-grudev, Jul 30 '24 10:07

I think the problem is just with the CNPGClusterOffline query:

The count() aggregation here doesn't return any of the labels from the underlying cnpg_collector_up metric, which is why the alert description ends up with no labels. The rest of the alerts are fine.
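
To illustrate the label-dropping behaviour, here is a minimal PromQL sketch. It is not the chart's actual expression: the bare cnpg_collector_up selector and the choice of `job` as the grouping label are assumptions made for illustration only.

```promql
# Aggregating without a grouping clause collapses all matching series
# into a single result with an empty label set, so a template such as
# $labels.job has nothing to expand to.
count(cnpg_collector_up)

# Aggregating with a grouping clause keeps the listed labels on the result
# (assumption: the series carry a `job` label).
count by (job) (cnpg_collector_up)
```

Note that the grouped form returns no result at all when no series match, so it cannot carry labels in the "no instances up" case either. That is one reason a value rendered by Helm, such as {{ .namespace }}/{{ .cluster }}, may be the more practical choice for this particular alert.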

itay-grudev, Jul 30 '24 10:07

Hi, @Wain13. I'm Dosu, and I'm helping the charts team manage their backlog. I'm marking this issue as stale.

Issue Summary

  • The issue involves incorrect template variables in Prometheus cluster rules.
  • {{ $labels.job }} is used instead of {{ .namespace }}/{{ .cluster }}, causing blank alert values.
  • @itay-grudev identified the issue with the CNPGClusterOffline query.
  • The count() aggregation does not return labels from the cnpg_collector_up metric.

Next Steps

  • Please confirm if this issue is still relevant to the latest version of the charts repository by commenting here.
  • If there is no further activity, the issue will be automatically closed in 7 days.

Thank you for your understanding and contribution!

dosubot[bot], Apr 09 '25 16:04