nginx-prometheus-exporter
HTTP server randomly closes, offers vague reason, refuses to elaborate further
Describe the bug
I'm running nginx-prometheus-exporter as a container sitting next to my nginx container. The container with NPE randomly dies for no apparent reason. It just logs:
{"time": actual time,"level":"INFO","source":"exporter.go:217","msg":"shutting down"}
{"time": actual time,"level":"INFO","source":"exporter.go:208","msg":"HTTP server closed","error":"http: server closed"}
Even though NPE was started with --log.level=debug, nothing in the logs elaborates on why the HTTP server shut down.
To reproduce
Steps to reproduce the behavior:
- Deploy NPE with --log.level=debug (a minimal sketch of the invocation is shown after this list).
- Wait 4 minutes, or 4 hours, or 12 hours.
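For reference, a minimal sketch of the exporter invocation (the scrape URI and port here are placeholders, not my exact arguments):

# Hypothetical sidecar command line; only --log.level=debug matters for the
# reproduction, the scrape URI just needs to point at nginx's stub_status.
nginx-prometheus-exporter \
  --nginx.scrape-uri=http://127.0.0.1:8000/stub_status \
  --log.level=debug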
Expected behavior
NPE should explain why it shut down so that I can actually fix it.
Your environment
- Version of the Prometheus exporter - 1.4.1
- Version of Docker/Kubernetes - not relevant
- [if applicable] Kubernetes platform (e.g. Minikube or GCP): Mirantis
- Using NGINX or NGINX Plus: NGINX
Hi @grepwood! Welcome to the project! 🎉
Thanks for opening this issue! Be sure to check out our Contributing Guidelines and the Issue Lifecycle while you wait for someone on the team to take a look at this.
I've been facing the exact same issue for a couple of weeks now. I'm also running it in Kubernetes with 3 pods, in a production environment. Here is the data I managed to collect.
This is what I observed by logging into the Nginx container's shell.
Even though the Nginx /stub_status endpoint responds normally with the metrics,
I'm not able to get a reply from the Nginx Prometheus Exporter container. The request just hangs indefinitely.
(I interrupted the execution after more than 1 min.)
/ $ date && time curl -v 127.0.0.1:8000/stub_status && echo && echo && echo && date && time curl -v 127.0.0.1:9113/metrics; date
Thu May 8 21:03:24 UTC 2025
* Trying 127.0.0.1:8000...
* Connected to 127.0.0.1 (127.0.0.1) port 8000
* using HTTP/1.x
> GET /stub_status HTTP/1.1
> Host: 127.0.0.1:8000
> User-Agent: curl/8.12.1
> Accept: */*
>
* Request completely sent off
< HTTP/1.1 200 OK
< Server: nginx/1.26.3
< Date: Thu, 08 May 2025 21:03:24 GMT
< Content-Type: text/plain
< Content-Length: 107
< Connection: keep-alive
<
Active connections: 36
server accepts handled requests
313 313 14525
Reading: 0 Writing: 6 Waiting: 30
* Connection #0 to host 127.0.0.1 left intact
real 0m 0.00s
user 0m 0.00s
sys 0m 0.00s
Thu May 8 21:03:24 UTC 2025
* Trying 127.0.0.1:9113...
* Connected to 127.0.0.1 (127.0.0.1) port 9113
* using HTTP/1.x
> GET /metrics HTTP/1.1
> Host: 127.0.0.1:9113
> User-Agent: curl/8.12.1
> Accept: */*
>
* Request completely sent off
^C Command terminated by signal 2
real 1m 24.84s
user 0m 0.00s
sys 0m 0.00s
Thu May 8 21:04:49 UTC 2025
Logs from the containers
1746735622753 time=2025-05-08T20:20:22.753Z level=INFO source=exporter.go:217 msg="shutting down"
1746735622753 time=2025-05-08T20:20:22.753Z level=INFO source=exporter.go:208 msg="HTTP server closed" error="http: Server closed"
1746735622872 time=2025-05-08T20:20:22.872Z level=INFO source=exporter.go:123 msg=nginx-prometheus-exporter version="(version=1.4.2, branch=HEAD, revision=ced6fda825f88077debfacab8d82536ce502bb17)"
1746735622872 time=2025-05-08T20:20:22.872Z level=INFO source=exporter.go:124 msg="build context" build_context="(go=go1.24.2, platform=linux/amd64, user=goreleaser, date=2025-04-28T15:24:56Z, tags=unknown)"
1746735622875 time=2025-05-08T20:20:22.875Z level=INFO source=tls_config.go:347 msg="Listening on" address=[::]:9113
1746735622875 time=2025-05-08T20:20:22.875Z level=INFO source=tls_config.go:350 msg="TLS is disabled." http2=false address=[::]:9113
1746735828041 time=2025-05-08T20:23:48.041Z level=INFO source=exporter.go:217 msg="shutting down"
1746735828041 time=2025-05-08T20:23:48.041Z level=INFO source=exporter.go:208 msg="HTTP server closed" error="http: Server closed"
1746735828155 time=2025-05-08T20:23:48.155Z level=INFO source=exporter.go:123 msg=nginx-prometheus-exporter version="(version=1.4.2, branch=HEAD, revision=ced6fda825f88077debfacab8d82536ce502bb17)"
1746735828155 time=2025-05-08T20:23:48.155Z level=INFO source=exporter.go:124 msg="build context" build_context="(go=go1.24.2, platform=linux/amd64, user=goreleaser, date=2025-04-28T15:24:56Z, tags=unknown)"
1746735828158 time=2025-05-08T20:23:48.158Z level=INFO source=tls_config.go:347 msg="Listening on" address=[::]:9113
1746735828158 time=2025-05-08T20:23:48.158Z level=INFO source=tls_config.go:350 msg="TLS is disabled." http2=false address=[::]:9113
1746735848029 time=2025-05-08T20:24:08.029Z level=INFO source=exporter.go:217 msg="shutting down"
1746735848029 time=2025-05-08T20:24:08.029Z level=INFO source=exporter.go:208 msg="HTTP server closed" error="http: Server closed"
1746735848194 time=2025-05-08T20:24:08.194Z level=INFO source=exporter.go:123 msg=nginx-prometheus-exporter version="(version=1.4.2, branch=HEAD, revision=ced6fda825f88077debfacab8d82536ce502bb17)"
1746735848194 time=2025-05-08T20:24:08.194Z level=INFO source=exporter.go:124 msg="build context" build_context="(go=go1.24.2, platform=linux/amd64, user=goreleaser, date=2025-04-28T15:24:56Z, tags=unknown)"
1746735848198 time=2025-05-08T20:24:08.197Z level=INFO source=tls_config.go:347 msg="Listening on" address=[::]:9113
1746735848198 time=2025-05-08T20:24:08.197Z level=INFO source=tls_config.go:350 msg="TLS is disabled." http2=false address=[::]:9113
Events
Events:
Warning Unhealthy 20m kubelet Readiness probe failed: Get "http://240.48.0.181:9113/metrics": EOF
Normal Pulled 20m (x2 over 22m) kubelet Container image "nginx/nginx-prometheus-exporter:1.4.2" already present on machine
Normal Started 20m (x2 over 22m) kubelet Started container nginx-prometheus-exporter
Normal Killing 20m kubelet Container nginx-prometheus-exporter failed liveness probe, will be restarted
Normal Created 20m (x2 over 22m) kubelet Created container: nginx-prometheus-exporter
Warning Unhealthy 77s (x7 over 20m) kubelet Liveness probe failed: Get "http://240.48.0.181:9113/metrics": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Warning Unhealthy 67s (x8 over 20m) kubelet Readiness probe failed: Get "http://240.48.0.181:9113/metrics": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Events:
Normal Started 19m (x2 over 22m) kubelet Started container nginx-prometheus-exporter
Normal Created 19m (x2 over 22m) kubelet Created container: nginx-prometheus-exporter
Normal Pulled 19m (x2 over 22m) kubelet Container image "nginx/nginx-prometheus-exporter:1.4.2" already present on machine
Normal Killing 19m kubelet Container nginx-prometheus-exporter failed liveness probe, will be restarted
Warning Unhealthy 19m kubelet Readiness probe failed: Get "http://240.48.0.73:9113/metrics": EOF
Warning Unhealthy 8s (x10 over 19m) kubelet Readiness probe failed: Get "http://240.48.0.73:9113/metrics": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Warning Unhealthy 8s (x10 over 19m) kubelet Liveness probe failed: Get "http://240.48.0.73:9113/metrics": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Events:
Normal Started 20m (x2 over 22m) kubelet Started container nginx-prometheus-exporter
Normal Created 20m (x2 over 22m) kubelet Created container: nginx-prometheus-exporter
Normal Pulled 20m (x2 over 22m) kubelet Container image "nginx/nginx-prometheus-exporter:1.4.2" already present on machine
Normal Killing 20m kubelet Container nginx-prometheus-exporter failed liveness probe, will be restarted
Warning Unhealthy 20m kubelet Readiness probe failed: Get "http://240.48.0.60:9113/metrics": EOF
Warning Unhealthy 83s (x7 over 20m) kubelet Readiness probe failed: Get "http://240.48.0.60:9113/metrics": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Warning Unhealthy 78s (x8 over 20m) kubelet Liveness probe failed: Get "http://240.48.0.60:9113/metrics": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Configuration
args:
- --nginx.scrape-uri=http://127.0.0.1:8000/stub_status
- --log.level=debug
image: nginx/nginx-prometheus-exporter:1.4.2
imagePullPolicy: IfNotPresent
livenessProbe:
  failureThreshold: 3
  httpGet:
    path: /metrics
    port: 9113
    scheme: HTTP
  initialDelaySeconds: 10
  periodSeconds: 5
  successThreshold: 1
  timeoutSeconds: 3
name: nginx-prometheus-exporter
ports:
- containerPort: 9113
  name: nginx-metrics
  protocol: TCP
readinessProbe:
  failureThreshold: 3
  httpGet:
    path: /metrics
    port: 9113
    scheme: HTTP
  initialDelaySeconds: 10
  periodSeconds: 5
  successThreshold: 1
  timeoutSeconds: 3
resources:
  limits:
    memory: 64Mi
  requests:
    cpu: 100m
    memory: 64Mi
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
  name: kube-api-access-4pnp9
  readOnly: true
In the meantime, we lose the metrics in Grafana.
Hey! Has anyone managed to take a look at this? Or found a solution? 👀
Hi @diogokiss!
I apologise for not getting back to you earlier. Have you found a solution to this since?
If not, it looks like the liveness and readiness probes fail when Kubernetes tries to check on the Prometheus exporter, and the pods get recycled. It also looks like the nginx binary is in the same pod (scrape url is 127.0.0.1).
Would you be able to give us more information about the pod and the ports? Are nginx and the Prometheus exporter in the same pod, and is the exporter used as a sidecar?
Can you manually ping the /metrics endpoint on port 9113?
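For example, something along these lines from inside the pod (the pod and container names below are placeholders, not taken from your setup):

# Hypothetical check from inside the pod: the nginx container shares the
# pod's network namespace, so it can reach the exporter on 127.0.0.1:9113.
kubectl exec <nginx-pod-name> -c nginx -- \
  curl -sv --max-time 5 http://127.0.0.1:9113/metrics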
You also wrote
Even though the Nginx /stub_status endpoint responds normally with the metrics,
Would you be able to tell us how you checked? Was it from outside the pod / kubernetes, or from within the pod / kubernetes?
If you have any other info to add, let us know! Thank you for your patience!
Hi @javorszky,
While I'm not the person you asked, I'm hitting this under exactly the conditions you describe:
It also looks like the nginx binary is in the same pod (scrape url is 127.0.0.1).
Yes. Same pod, different container.
Can you manually ping the /metrics endpoint on port 9113? Would you be able to tell us how you checked? Was it from outside the pod / kubernetes, or from within the pod / kubernetes?
Within the same pod, from any container in that pod. Outside of that, not at all, because I deliberately configured nginx not to let any client other than 127.0.0.1 access /stub_status.