prometheus_exporter Add health check endpoint for the collector

Hi,

We are running the metrics collector as a centralized Kubernetes pod that receives metrics from all the application pods. As the number of metrics grows, the collector pod (metrics server) stops functioning properly and we get ruby_collector_working 0. We noticed the pod was being CPU throttled and increased its resources, but would it be possible to add a health check endpoint so that Kubernetes can detect this automatically and restart the pod through a liveness probe?

I saw there was a closed issue for the same feature (https://github.com/discourse/prometheus_exporter/issues/69), but I wanted to raise it again as it seems to be useful functionality.

Thank you!

Dec 03 '20 14:12 kubibektas

What collectors are you running?

Dec 03 '20 20:12 SamSaffron

Hi Sam, thanks for the response.

We are just running the server as bin/prometheus_exporter and have Sidekiq instrumentation on the client pods. Our main use case, though, is reporting custom metrics related to our application (such as the number of orders). Our problem is that we report too many metrics and run the server as a single pod. At some point the server gets throttled due to the high number of metrics. In such cases we just want to restart the server and continue reporting metrics. This cannot be done automatically right now because there is no endpoint for a Kubernetes liveness probe to use.

Dec 07 '20 16:12 kubibektas

We would also like to have this feature 👍

Jan 24 '21 18:01 h0jeZvgoxFepBQ2C

I am open to a PR that adds a trivial health check at /status, so it can return an OK status 200 page.

Jan 25 '21 23:01 SamSaffron

I am open to a PR that adds a trivial health check at /status, so it can return an OK status 200 page.

Fixed in https://github.com/discourse/prometheus_exporter/commit/27a768932b81bb6308be761468dbb16c6c55ab0c PR: https://github.com/discourse/prometheus_exporter/pull/226

Oct 18 '22 20:10 n-rodriguez
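
For readers wondering what such a check amounts to, here is a minimal sketch of the logic behind an HTTP liveness probe: send a GET and treat any status from 200 to 399 as healthy, which is how Kubernetes httpGet probes judge success. The /status path is the one proposed earlier in this thread and is an assumption here; the path added by the linked commit may differ, so check it before wiring up a real probe. StubCollector, is_alive, and run_demo are hypothetical names used only for illustration and are not part of prometheus_exporter.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class StubCollector(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the collector's web server (not the real one)."""

    def do_GET(self):
        # Assumed health path, taken from the proposal in the thread.
        if self.path == "/status":
            body = b"OK"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # suppress per-request logging in the demo


def is_alive(url, timeout=1.0):
    """Return True when a GET to url answers with a status in [200, 400)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status all count as unhealthy.
        return False


def run_demo():
    """Start the stub on a free local port and probe a good and a bad path."""
    server = HTTPServer(("127.0.0.1", 0), StubCollector)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    healthy = is_alive(f"http://127.0.0.1:{port}/status")
    missing = is_alive(f"http://127.0.0.1:{port}/nope")
    server.shutdown()
    server.server_close()
    return healthy, missing


if __name__ == "__main__":
    print(run_demo())
```

Note that a check this simple only shows that the web server still answers requests. A collector that is CPU throttled but still responding would pass it, so a probe timeout and failure threshold still need to be chosen with that in mind.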