Support Prometheus scrape body_size_limit in ServiceMonitor
Component(s)
ServiceMonitor
What is missing? Please describe.
Prometheus-operator has no direct support for setting the scrape body_size_limit (the limit on the uncompressed response body of a scrape) in ServiceMonitor configurations.
Only global settings are supported:
- enforcedBodySizeLimit
- bodySizeLimit
Would it be possible to set this limit within the ServiceMonitor configuration?
Describe alternatives you've considered.
Thanks!
Environment Information.
Kubernetes Version:
Prometheus-Operator Version: kube-prometheus main
Currently the only option for specifying bodySizeLimit is in the Prometheus CR. I looked back at the PR that introduced this field (https://github.com/prometheus-operator/prometheus-operator/pull/4275#discussion_r713737514): it was not added at the monitor level because there were few use cases for it.
Do you have any particular requirement to have this in ServiceMonitor?
Yes. In my monitoring system, more than 200 services require metric collection. Collection is currently sharded, but practical challenges have emerged:
1: Metrics overload causing Prometheus OOM: Certain services expose a high volume of metrics, which can cause Prometheus to run out of memory.
2: OOM due to service anomalies or traffic surges: Some services, when they hit bugs or sudden traffic spikes, can also cause Prometheus to run out of memory.
3: ServiceMonitor body size limitation: Certain k8s components, such as kube-apiserver, may expose large self-metrics. The global enforcedBodySizeLimit is set to 20MB, which may cause the collection of some k8s self-metrics to fail.
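To make the third point concrete, here is a minimal Python sketch of how a body size limit rejects an oversized scrape. It is not Prometheus code: `parse_size` and `scrape_allowed` are hypothetical helpers, the 25 MiB body is an invented figure, and the sketch assumes size strings use 1024-based multiples (so `20MB` means 20 × 1024 × 1024 bytes).

```python
import re

# Assumption: sizes are 1024-based, so "20MB" means 20 * 1024**2 bytes.
_UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3}


def parse_size(text: str) -> int:
    """Convert a size string such as '20MB' into a byte count."""
    match = re.fullmatch(r"(\d+)(B|KB|MB|GB)", text.strip())
    if match is None:
        raise ValueError(f"unsupported size: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def scrape_allowed(body_bytes: int, limit: str) -> bool:
    """Return True if an uncompressed body of body_bytes fits within limit."""
    return body_bytes <= parse_size(limit)


# Hypothetical example: a 25 MiB body checked against a global 20MB limit.
large_body = 25 * 1024**2
fits_global = scrape_allowed(large_body, "20MB")
```

With a single global limit, raising it enough for one large target also raises it for every other target, which is what motivates a per-monitor setting.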
Considering these issues, we would like to customize the job configuration for each ServiceMonitor so that it can include the body_size_limit parameter. We acknowledge the initial concern about setting a standard limit of 10MB, but we also see practical challenges with certain services such as kube-apiserver or RD service. A per-ServiceMonitor body size limit would address these concerns effectively.
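One way to picture the request is the following Python sketch, which resolves an effective limit for each monitor. Everything in it is hypothetical: the `effective_limit` helper, the monitor names other than kube-apiserver, and all byte counts are made up for illustration. It also assumes that a per-monitor value simply replaces the global one when set, which is only one possible design and not something the operator is known to do.

```python
from typing import Optional

MiB = 1024 * 1024


def effective_limit(monitor_limit: Optional[int], global_limit: int) -> int:
    """Use the monitor's own limit if it has one, else the global limit."""
    return monitor_limit if monitor_limit is not None else global_limit


# Hypothetical monitors: name -> (per-monitor limit or None, observed body size).
monitors = {
    "kube-apiserver": (64 * MiB, 30 * MiB),
    "small-service": (None, 2 * MiB),
    "noisy-service": (None, 25 * MiB),
}

global_limit = 20 * MiB

# True means the scrape body fits within the limit that applies to that monitor.
results = {
    name: body <= effective_limit(limit, global_limit)
    for name, (limit, body) in monitors.items()
}
```

In this sketch the large kube-apiserver scrape passes because of its own higher limit, while the other monitors stay bound by the 20 MiB global value.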
I'm sorry if I have missed something; I'm new to this and may not be familiar with the usage.
@zhxiaohe thanks for the detailed answer. I think that it makes sense to add a body size limit to service and pod monitors. Have you tried setting sample limits too? If not, would it help? My feeling is that sample limits are easier to reason about than the body size limit.
@simonpasquier @slashpai Thank you very much! I have tried setting sample limits, and the estimated number of samples roughly tracks the body size in MBytes.
I'm willing to work on this issue
Merging is blocked 😞. ServiceMonitor BodySizeLimit takes priority over enforcedBodySizeLimit. Can I also try merging some code?
Excuse me, I will update it tonight.