prometheus-operator
Be able to define role selectors in ServiceMonitor and PodMonitor
What is missing?
A parameter to define the selectors config in the generated kubernetes_sd_configs.
Documentation here: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#kubernetes_sd_config
# Optional label and field selectors to limit the discovery process to a subset of available resources.
# See https://kubernetes.io/docs/concepts/overview/working-with-objects/field-selectors/
# and https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/ to learn more about the possible
# filters that can be used. The endpoints role supports pod, service and endpoints selectors.
# The pod role supports node selectors when configured with `attach_metadata: {node: true}`.
# Other roles only support selectors matching the role itself (e.g. node role can only contain node selectors).
# Note: When making decision about using field/label selector make sure that this
# is the best approach - it will prevent Prometheus from reusing single list/watch
# for all scrape configs. This might result in a bigger load on the Kubernetes API,
# because per each selector combination there will be additional LIST/WATCH. On the other hand,
# if you just want to monitor small subset of pods in large cluster it's recommended to use selectors.
# Decision, if selectors should be used or not depends on the particular situation.
[ selectors:
  [ - role: <string>
    [ label: <string> ]
    [ field: <string> ] ]]
I'm willing to define them for ServiceMonitor CRDs, but this probably applies to more use cases.
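To illustrate, this is roughly the scrape config I have in mind being generated (the job name, namespace and label value below are only examples):

scrape_configs:
- job_name: example-app
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - default
    # only discover objects that carry this label, instead of listing
    # everything and dropping unwanted targets later via relabeling
    selectors:
    - role: endpoints
      label: "metrics=enabled"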
Why do we need it?
Prometheus allows it
Hello, @torrescd I'm not sure I fully understand your vision for adding kubernetes_sd_configs to ServiceMonitor. Could you please provide a bit more detail on your use case?
Also, I'm not sure if you are aware, but we are now implementing the ScrapeConfig CR, which might fit your use case: https://github.com/prometheus-operator/prometheus-operator/pull/5335 (based on the design doc https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/proposals/202212-scrape-config.md)
Hi Joao, thank you for your answer.
I think ScrapeConfig is not needed for my use case.
See this function:
https://github.com/prometheus-operator/prometheus-operator/blob/e93e2ed39a029ec056019c73d34dce0bf252343c/pkg/prometheus/promcfg.go#L1109
In this line we set up and configure kubernetes_sd_configs for the ServiceMonitor CRD:
https://github.com/prometheus-operator/prometheus-operator/blob/e93e2ed39a029ec056019c73d34dce0bf252343c/pkg/prometheus/promcfg.go#L1145
There we can configure namespaces and apiServer, but not selectors.
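For context, here is a simplified sketch of the kubernetes_sd_configs section that gets generated there today (not the exact operator output; the namespace name is a placeholder). There is simply no place for a selectors block:

kubernetes_sd_configs:
- role: endpoints
  namespaces:
    names:
    - my-namespace
  # apiServer, TLS and authentication settings can also be configured here,
  # but a selectors list cannot be expressed via the ServiceMonitor spec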
Regards!
@torrescd it would help if you could describe your use case in more detail :)
Yes!
Basically, we want to avoid having dropped targets.
If you have several endpoints (or several ingresses) and you only want to target one of them, it would be nice to filter them out at "discovery time".
I plan to make a PR to make my point clear.
You should be able to achieve this with the relabelings field?
Hi! I won't be able to get around to a PR, but with this hardcoded change I am able to discover only the ingresses/endpoints with the desired label; I don't see how to achieve that with relabelings.
IIUC when you scrape the targets of a Service, they will have the labels of the k8s resource, and then you can configure relabelings to only keep the metrics that come from targets that have labels you want. But I'll let @simonpasquier intervene if I'm missing something
I have tested relabelings on the ServiceMonitor, and they seem to work at ingestion time, not at discovery time (targets still appear as undefined)
I am attaching the Slack post and the image of the desired setup.
Regards!

spec.relabelings in the service monitor definition is your friend. Adapted from the latest version of https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/user-guides/running-exporters.md#relabeling
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  labels:
    team: frontend
spec:
  selector:
    matchLabels:
      app: example-app
  endpoints:
  - port: web
    relabelings:
    # keep only targets for which the endpoint and pod objects have a "metrics" label
    - sourceLabels:
      - __meta_kubernetes_pod_labelpresent_metrics
      - __meta_kubernetes_endpoints_labelpresent_metrics
      regex: "true,true"
      action: keep
Hi @simonpasquier, thank you for your answer.
As I said before, relabelings only help me select the desired targets, but they don't avoid the discovery of the others (I still have undefined/dropped targets, which is what I want to avoid)
Regards!
I'm not sure I understand why seeing dropped targets isn't desired.
In the first comment there is some rationale quoted from the documentation:
On the other hand, if you just want to monitor small subset of pods in large cluster it's recommended to use selectors.
I'm not disagreeing with the feature request, but the comment mostly applies if you have a large number of pods in a namespace and/or a service monitor that spans many namespaces. In most cases, I don't expect to see big perf/usage differences with or without selectors (but I could be wrong, of course!).
/assign
RBAC is one consideration here. If Prometheus only has permission to see the targets it should be scraping, but tries to enumerate all of them and then drop them via label rewriting, it'll spam errors to the log.
Today I am looking at a Prometheus with 125 PodMonitor / ServiceMonitors defined, about 8000 targets in total, which is regularly using 60 cores to label and drop targets. So using selectors to reduce that load would be a great enhancement.
It seems that #6580 is adding new config; I was expecting an option to use the existing selectors.
Fixed in https://github.com/prometheus-operator/prometheus-operator/pull/7086 and https://github.com/prometheus-operator/prometheus-operator/pull/7185
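For anyone landing here later: based on those PRs, the feature appears to be exposed through a new field on the ServiceMonitor/PodMonitor specs (selectorMechanism, if I read the PRs correctly; please double-check the API reference for the exact field name and accepted values). A sketch of how it would be used:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
spec:
  # assumption based on the PRs above: switches target filtering from
  # relabeling to kubernetes_sd_configs selectors; verify before relying on it
  selectorMechanism: RoleSelector
  selector:
    matchLabels:
      app: example-app
  endpoints:
  - port: web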