
Be able to define role selectors in ServiceMonitor and PodMonitor

Open torrescd opened this issue 2 years ago • 15 comments

What is missing?

A parameter to define the selectors config in the generated kubernetes_sd_configs.

Documentation: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#kubernetes_sd_config


# Optional label and field selectors to limit the discovery process to a subset of available resources.
# See https://kubernetes.io/docs/concepts/overview/working-with-objects/field-selectors/
# and https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/ to learn more about the possible
# filters that can be used. The endpoints role supports pod, service and endpoints selectors.
# The pod role supports node selectors when configured with `attach_metadata: {node: true}`.
# Other roles only support selectors matching the role itself (e.g. node role can only contain node selectors).

# Note: When making decision about using field/label selector make sure that this
# is the best approach - it will prevent Prometheus from reusing single list/watch
# for all scrape configs. This might result in a bigger load on the Kubernetes API,
# because per each selector combination there will be additional LIST/WATCH. On the other hand,
# if you just want to monitor small subset of pods in large cluster it's recommended to use selectors.
# Decision, if selectors should be used or not depends on the particular situation.
[ selectors:
  [ - role: <string>
    [ label: <string> ]
    [ field: <string> ] ]]
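
For illustration, a hand-written Prometheus scrape config using this option might look like the sketch below. The job name, namespace, and label value are placeholders chosen for the example; they are not taken from this issue.

```yaml
# Sketch only: job_name, namespace and label value are hypothetical.
scrape_configs:
  - job_name: example-app
    kubernetes_sd_configs:
      - role: endpoints
        namespaces:
          names:
            - default
        # Filtering happens in the LIST/WATCH call against the Kubernetes API,
        # so Endpoints objects without this label are never discovered.
        selectors:
          - role: endpoints
            label: "app=example-app"
```

With a selector like this, non-matching objects should not show up as dropped targets, because Prometheus never sees them in the first place.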

I'd like to be able to define them for ServiceMonitor CRDs, but this probably applies to more use cases.

Why do we need it?

Prometheus allows it

torrescd avatar Feb 16 '23 16:02 torrescd

Hello, @torrescd I'm not sure I fully understand your vision for adding kubernetes_sd_configs to ServiceMonitor. Could you please provide a bit more detail on your use case?

Also, I'm not sure if you are aware, but we are now implementing the ScrapeConfig CR, which might fit your use case: https://github.com/prometheus-operator/prometheus-operator/pull/5335 based on the design doc https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/proposals/202212-scrape-config.md

JoaoBraveCoding avatar Feb 16 '23 17:02 JoaoBraveCoding

Hi Joao, thank you for your answer.

I don't think ScrapeConfig is needed for my use case.

See this function:

https://github.com/prometheus-operator/prometheus-operator/blob/e93e2ed39a029ec056019c73d34dce0bf252343c/pkg/prometheus/promcfg.go#L1109

This line is where kubernetes_sd_configs is set up for the ServiceMonitor CRD:

https://github.com/prometheus-operator/prometheus-operator/blob/e93e2ed39a029ec056019c73d34dce0bf252343c/pkg/prometheus/promcfg.go#L1145

There we can configure namespaces and apiServer, but not selectors.
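
As a rough sketch of the gap, the scrape config generated for a ServiceMonitor looks approximately like the following. The job name, namespace, and label are hypothetical, and the exact output varies by operator version. The ServiceMonitor's selector is applied through a keep relabeling rather than through selectors, so every endpoint in the listed namespaces is still discovered.

```yaml
# Approximate shape of the generated config (assumption, not copied from a cluster).
- job_name: serviceMonitor/default/example-app/0
  kubernetes_sd_configs:
    - role: endpoints
      namespaces:
        names:
          - default
      # No "selectors:" key here: discovery lists all Endpoints in the namespace.
  relabel_configs:
    # The ServiceMonitor's matchLabels become a keep rule applied after discovery;
    # targets that do not match end up as dropped targets.
    - source_labels:
        - __meta_kubernetes_service_label_app
        - __meta_kubernetes_service_labelpresent_app
      regex: (example-app);true
      action: keep
```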

Regards!

torrescd avatar Feb 16 '23 18:02 torrescd

@torrescd it would help if you can describe your use case with more details :)

simonpasquier avatar Feb 17 '23 13:02 simonpasquier

Yes!

Basically, we want to avoid having dropped targets.

If you have several endpoints (or several ingresses) and you only want to target one of them, it would be nice to filter them at "discovery time".

I plan to make a PR to make my point clear.

torrescd avatar Feb 17 '23 13:02 torrescd

You should be able to achieve this with the relabelings field?

simonpasquier avatar Feb 17 '23 15:02 simonpasquier

Hi! I won't be able to make a PR, but with this hardcoded change I am able to discover only the ingresses/endpoints with the desired label. I don't see how to achieve that with relabelings.

torrescd avatar Feb 18 '23 11:02 torrescd

IIUC when you scrape the targets of a Service, they will have the labels of the k8s resource, and then you can configure relabelings to only keep the metrics that come from targets that have labels you want. But I'll let @simonpasquier intervene if I'm missing something

JoaoBraveCoding avatar Feb 20 '23 11:02 JoaoBraveCoding

I have tested relabelings on the ServiceMonitor, and they seem to work at ingestion time, not at discovery time (the targets still appear as undefined).

I am attaching the slack post and the image of the desired setup.

Regards!

image

torrescd avatar Feb 20 '23 12:02 torrescd

spec.relabelings in the service monitor definition is your friend. Adapted from the latest version of https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/user-guides/running-exporters.md#relabeling

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  labels:
    team: frontend
spec:
  selector:
    matchLabels:
      app: example-app
  endpoints:
  - port: web
    relabelings:
      # keep only targets for which the endpoint and pod objects have a "metrics" label
      - sourceLabels:
          - __meta_kubernetes_pod_labelpresent_metrics
          - __meta_kubernetes_endpoints_labelpresent_metrics
        regex: "true,true"
        action: keep

simonpasquier avatar Feb 20 '23 13:02 simonpasquier

Hi @simonpasquier, thank you for your answer.

As I said before, relabelings only help me select the desired targets; they don't prevent the others from being discovered. I still have undefined/dropped targets, which is what I want to avoid.

Regards!

torrescd avatar Feb 20 '23 13:02 torrescd

I'm not sure I understand why seeing dropped targets isn't desired.

simonpasquier avatar Feb 20 '23 14:02 simonpasquier

The documentation quoted in the first comment gives some rationale:

On the other hand, if you just want to monitor small subset of pods in large cluster it's recommended to use selectors.

torrescd avatar Feb 20 '23 14:02 torrescd

I'm not disagreeing with the feature request but the comment mostly applies if you have a big number of pods in a namespace and/or a service monitor that spans many namespaces. In most cases, I don't expect to see big perf/usage differences with and without selectors (but I can be wrong of course!).

simonpasquier avatar Feb 20 '23 14:02 simonpasquier

/assign

yp969803 avatar May 12 '24 17:05 yp969803

RBAC is one consideration here. If Prometheus only has permission to see the targets it should be scraping, but tries to enumerate all of them then drop them via label rewrite, it'll spam errors to the log.

ringerc avatar Aug 07 '24 03:08 ringerc

Today I am looking at a Prometheus with 125 PodMonitor / ServiceMonitors defined, about 8000 targets in total, which is regularly using 60 cores to label and drop targets. So using selectors to reduce that load would be a great enhancement.

It seems that #6580 is adding new config; I was expecting an option to use the existing selectors.

bboreham avatar Nov 01 '24 10:11 bboreham

Fixed in https://github.com/prometheus-operator/prometheus-operator/pull/7086 and https://github.com/prometheus-operator/prometheus-operator/pull/7185

slashpai avatar Feb 06 '25 16:02 slashpai