cortex-tools cortextool prepare produces invalid query

When running cortextool prepare on a query such as:

(sum by (node, resource) (kube_node_status_capacity{}))
          * on(node) group_left(cluster, nodepool) nodepool:node:{} < %(threshold)0.2f
      )

It'll produce

(sum by (node, resource, cluster) (kube_node_status_capacity{}))
          * on(node,cluster) group_left(cluster, nodepool) nodepool:node:{} < %(threshold)0.2f
      )

But this is invalid; Prometheus rejects it with: could not parse expression: 1:301: parse error: label "cluster" must not occur in ON and GROUP clause at once
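The parse error comes from PromQL's rule that a label may not appear in both the on(...) list and the group_left(...) list of the same vector match. As a rough illustration of a guard against that, here is a minimal Go sketch. The helper names `conflictingLabels` and `addToOn` are hypothetical and are not taken from cortextool's source; the sketch only shows the check, not how cortextool actually rewrites queries.

```go
package main

import "fmt"

// conflictingLabels returns the labels that appear in both the on(...) list
// and the group_left/group_right(...) list of a vector match. PromQL rejects
// a query when this list is non-empty.
// Hypothetical helper for illustration; not part of cortextool.
func conflictingLabels(on, include []string) []string {
	seen := make(map[string]bool, len(include))
	for _, l := range include {
		seen[l] = true
	}
	var conflicts []string
	for _, l := range on {
		if seen[l] {
			conflicts = append(conflicts, l)
		}
	}
	return conflicts
}

// addToOn appends label to the on(...) list only when that keeps the match
// valid: the label must not already be in on, and must not be in include.
// Hypothetical helper for illustration; not part of cortextool.
func addToOn(on, include []string, label string) []string {
	for _, l := range on {
		if l == label {
			return on
		}
	}
	for _, l := range include {
		if l == label {
			return on
		}
	}
	out := append([]string{}, on...)
	return append(out, label)
}

func main() {
	// The rewritten query from the report:
	// on(node, cluster) group_left(cluster, nodepool)
	fmt.Println(conflictingLabels(
		[]string{"node", "cluster"},
		[]string{"cluster", "nodepool"},
	))

	// With the guard, "cluster" is not added to on(node), because
	// group_left already lists it.
	fmt.Println(addToOn([]string{"node"}, []string{"cluster", "nodepool"}, "cluster"))
}
```

This only avoids the parse error. Whether the aggregation label should be added to the `by (...)` clause or the match at all for such a query is a separate question.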

Jul 01 '21 10:07 gotjosh

I had another instance of this. The following query intends to derive the cluster label from the kube_node_annotations query. The grouping should not occur on the first part of the query.

Intended query:

100 *
sum by (instance_id, nat_gateway_name, project_id) (
  stackdriver_gce_instance_compute_googleapis_com_nat_port_usage
) /
sum by (instance_id, nat_gateway_name, project_id) (
  stackdriver_gce_instance_compute_googleapis_com_nat_allocated_ports
)
* on(instance_id) group_left(node, cluster)
count by (instance_id, node, cluster) (
  label_replace(
   kube_node_annotations{annotation_container_googleapis_com_instance_id!=""},
   'instance_id', '$1',
   'annotation_container_googleapis_com_instance_id', '(.*)'
  )
)  > 90

This forced me to remove the cluster label from group_left(), eventually rendering a wrong query:


100 *
sum by(instance_id, nat_gateway_name, project_id, cluster) (
  stackdriver_gce_instance_compute_googleapis_com_nat_port_usage
) /
sum by(instance_id, nat_gateway_name, project_id, cluster) (
  stackdriver_gce_instance_compute_googleapis_com_nat_allocated_ports
)
* on(instance_id, cluster) group_left(node)
count by(instance_id, node, cluster) (
  label_replace(
   kube_node_annotations{annotation_container_googleapis_com_instance_id!=""},
   "instance_id", "$1",
   "annotation_container_googleapis_com_instance_id", "(.*)"
  )
)  > 90

Mar 04 '22 10:03 Duologic

Perhaps we can have a HeadComment (for ex: # cortextool: skip rule aggregation) to indicate the aggregation should not be applied.
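A minimal Go sketch of how such a directive could be detected, assuming the rule's head comment is already available as a string (for example from the `HeadComment` field of a `gopkg.in/yaml.v3` node). The function name `skipAggregation` and the exact directive wording are assumptions based on this proposal; nothing here is an existing cortextool feature.

```go
package main

import (
	"fmt"
	"strings"
)

// skipDirective is the marker proposed in this thread. The exact wording is
// an assumption; cortextool is not confirmed to support it.
const skipDirective = "cortextool: skip rule aggregation"

// skipAggregation reports whether a rule's head comment carries the
// directive. headComment is the raw comment text preceding a rule, possibly
// several lines that each start with '#'.
// Hypothetical helper for illustration; not part of cortextool.
func skipAggregation(headComment string) bool {
	for _, line := range strings.Split(headComment, "\n") {
		// Drop surrounding whitespace and the leading '#' characters.
		text := strings.TrimSpace(strings.TrimLeft(strings.TrimSpace(line), "#"))
		if text == skipDirective {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(skipAggregation("# cortextool: skip rule aggregation"))
	fmt.Println(skipAggregation("# an unrelated comment"))
}
```

A rule carrying the comment would then be left untouched by prepare, while all other rules keep the current behaviour.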

wdyt?

Mar 04 '22 10:03 Duologic

Elaborating on @Duologic's comments: the query is still wrong for us because the on(instance_id, **cluster**) part causes issues. The metrics were generated in different clusters, so we don't want to join on that label.

Apr 13 '22 11:04 dohnto
