loki icon indicating copy to clipboard operation
loki copied to clipboard

fix: loki-operational.libsonnet

Open QuentinBisson opened this issue 1 year ago • 6 comments

This PR fixes the loki-operational.libsonnet mixin in case the value of per_cluster_label starts with cluster_.

For historical reasons, we named the cluster label cluster_id so generation actually replaces all cluster_id with cluster_id_id in this dashboard because the replace is incorrect

What this PR does / why we need it:

This PR fixes the loki operational dashboard when the cluster label starts with cluster_

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Checklist

  • [ ] Reviewed the CONTRIBUTING.md guide (required)
  • [ ] Documentation added
  • [ ] Tests updated
  • [ ] Title matches the required conventional commits format, see here
  • [ ] Changes that require user attention or interaction to upgrade are documented in docs/sources/setup/upgrade/_index.md
  • [ ] For Helm chart changes bump the Helm chart version in production/helm/loki/Chart.yaml and update production/helm/loki/CHANGELOG.md and production/helm/loki/README.md. Example PR
  • [ ] If the change is deprecating or removing a configuration option, update the deprecated-config.yaml and deleted-config.yaml files respectively in the tools/deprecated-config-checker directory. Example PR

QuentinBisson avatar Apr 25 '24 13:04 QuentinBisson

@cstyan I would really appreciate the review :)

QuentinBisson avatar Apr 25 '24 13:04 QuentinBisson

@QuentinBisson sorry, I'm failing to visualize in my head what the new value of the label replace would end up inserting as the label, can you help me out here with a more concrete example of what happens currently vs what we should be doing?

cstyan avatar Apr 26 '24 18:04 cstyan

Sure, so the issue is that if we set the per_cluster_label now to cluster_id then all occurrences of cluster=$cluster are perfectly replaced to cluster_id=$cluster but when the englobing replace them again then this résults in cluster_id_id=$cluster. Limiting thé englobing to cluster_job only renames thé recording rules to cluster_id_job which is what this replace was for

QuentinBisson avatar Apr 26 '24 18:04 QuentinBisson

Sure, so the issue is that if we set the per_cluster_label now to cluster_id then all occurrences of cluster=$cluster are perfectly replaced to cluster_id=$cluster but when the englobing replace them again then this résults in cluster_id_id=$cluster.

Okay this makes sense, thanks :+1:

Limiting thé englobing to cluster_job only renames thé recording rules to cluster_id_job which is what this replace was for

can this not cause issues in other places though where we have just cluster? I supposed the earlier replace directives handle those cases?

cstyan avatar Apr 26 '24 20:04 cstyan

The previous one should be enough but i'll run it an test. From what I understood, there are 4 use cases:

  • cluster=$cluster
  • cluster=~$cluster
  • cluster in templating or by clauses which i'm or sure about
  • cluster_job recording rules. Let me generate this on monday to assert all cluster labels are fixed. This dashboard importing raw json is not the easiest

QuentinBisson avatar Apr 27 '24 10:04 QuentinBisson

@cstyan I checked and there is no other use cases for cluster apart from those 3 replace. The one used in the templating section is managed via the addCluster libsonnet function. We could keep the replace of cluster_ only and move it as the first replace but that would not really change much in the end

QuentinBisson avatar Apr 29 '24 10:04 QuentinBisson