prometheus-fastapi-instrumentator

Strange drops in total requests

Open sterres opened this issue 3 years ago • 7 comments

Hi,

I'm getting strange drops in the http_requests_total metric for the "/metrics" endpoint. I was expecting a monotonic increase, since each scrape should increase the "/metrics" counter by one.

But it looks like this: [image]

Any idea what I'm doing wrong?

Thanks and BR Simon

sterres avatar Jul 07 '21 14:07 sterres

It seems to be related to multiprocess workers of gunicorn server (I used the Docker image: https://github.com/tiangolo/uvicorn-gunicorn-fastapi-docker).

It works fine when the environment variable MAX_WORKERS="1" is set for the FastAPI container.

Some instructions on how to solve it can be found here: https://github.com/prometheus/client_python#multiprocess-mode-eg-gunicorn

But I don't know how the solution can be implemented in this tool. If someone has managed to get it working, I would appreciate some help :)
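
The drops can be reproduced without any server. The sketch below is a simplified model, not the library's code: it assumes each worker process keeps its own independent counter and that every scrape is answered by a single worker, which reports only its own count. The function names are made up for illustration.

```python
import random
from typing import Iterable, List


def simulate_scrapes(worker_picks: Iterable[int], num_workers: int) -> List[int]:
    """Return the "/metrics" counter value that each scrape would observe.

    worker_picks lists which worker answers each scrape (an assumption
    standing in for gunicorn's request distribution).
    """
    counters = [0] * num_workers  # one independent counter per worker process
    observed = []
    for worker in worker_picks:
        counters[worker] += 1               # the scrape itself is a request
        observed.append(counters[worker])   # only this worker's count is visible
    return observed


def is_monotonic(values: List[int]) -> bool:
    """True if the series never decreases."""
    return all(a <= b for a, b in zip(values, values[1:]))


if __name__ == "__main__":
    rng = random.Random(0)
    picks = [rng.randrange(4) for _ in range(20)]
    series = simulate_scrapes(picks, num_workers=4)
    print(series, "monotonic:", is_monotonic(series))
```

With a single worker the series only ever grows, which matches the MAX_WORKERS="1" observation. As soon as a scrape lands on a worker that has served fewer requests than the previous one, the reported value drops.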

sterres avatar Jul 08 '21 09:07 sterres

Even though @sterres basically mentions all resources to solve this issue, it took me quite some time to do so myself and I want to share how I managed to get the fastapi instrumentator running on the gunicorn server with the Docker image from https://github.com/tiangolo/uvicorn-gunicorn-fastapi-docker.

To get reasonable data from gunicorn with more than one worker and an individual metrics port, do the following:

  1. Provide a multiprocess registry in gunicorn.conf (add the following to the default config):
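from prometheus_client import CollectorRegistry, multiprocess, start_http_server

# METRICS_PORT is a placeholder: define it as the port the metrics server should listen on.

def when_ready(server):
    # Called by gunicorn once the server has started: serve the metrics
    # aggregated over all workers on a separate port.
    registry = CollectorRegistry()
    multiprocess.MultiProcessCollector(registry)
    start_http_server(METRICS_PORT, registry=registry)

def child_exit(server, worker):
    # Tell prometheus_client that this worker has exited.
    multiprocess.mark_process_dead(worker.pid)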

If you do not need a separate port for the metrics, remove the start_http_server call and modify the code so that the instrumentator publishes the data of the multiprocess collector in the main app.

  2. Add an environment variable, e.g. PROMETHEUS_MULTIPROC_DIR=/tmp_multiproc
  3. Make sure the directory it points to exists and is empty. With this specific container, use the prestart.sh script:
#! /usr/bin/env bash
if [ -d /tmp_multiproc ]; then rm -Rf /tmp_multiproc; fi
mkdir /tmp_multiproc

This script removes the directory if already in place and recreates it. Deleting is necessary, as container restarts fail otherwise.
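
For setups without a prestart.sh hook, the same reset can be written in Python. This is a minimal sketch, assuming it runs once before any worker starts (for example from gunicorn's on_starting server hook). The helper name reset_multiproc_dir is hypothetical and not part of any library.

```python
import shutil
import tempfile
from pathlib import Path


def reset_multiproc_dir(path: str) -> Path:
    """Delete the directory if it exists, then recreate it empty."""
    directory = Path(path)
    if directory.exists():
        shutil.rmtree(directory)    # drop metric files left over from a previous run
    directory.mkdir(parents=True)   # recreate an empty directory
    return directory


if __name__ == "__main__":
    # Demo on a throwaway location; in the container this would be /tmp_multiproc.
    with tempfile.TemporaryDirectory() as base:
        target = Path(base) / "tmp_multiproc"
        target.mkdir()
        (target / "stale.db").write_text("old data")
        reset_multiproc_dir(str(target))
        print(sorted(p.name for p in target.iterdir()))
```

It must not run inside the workers themselves, since that would wipe the files of workers that are already writing metrics.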

saschnet avatar Aug 17 '21 12:08 saschnet

@saschnet FWIW it looks like this project supports multiprocess collection by simply setting the "prometheus_multiproc_dir" environment variable.

https://github.com/trallnag/prometheus-fastapi-instrumentator/blame/master/prometheus_fastapi_instrumentator/instrumentation.py#L257-L267
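
For readers who cannot follow the link, the general pattern looks roughly like the sketch below. It is an illustration based on the prometheus_client multiprocess documentation, not a copy of the instrumentator's code, and the function names build_registry and render_metrics are made up here. It assumes prometheus_client is installed and checks both spellings of the variable that appear in this thread.

```python
import os

from prometheus_client import (
    REGISTRY,
    CollectorRegistry,
    generate_latest,
    multiprocess,
)


def build_registry():
    """Pick the registry to serve, depending on the environment."""
    if (
        "PROMETHEUS_MULTIPROC_DIR" in os.environ
        or "prometheus_multiproc_dir" in os.environ
    ):
        # Multiprocess mode: aggregate the metric files of all workers.
        registry = CollectorRegistry()
        multiprocess.MultiProcessCollector(registry)
        return registry
    # Single-process mode: the default in-memory registry is enough.
    return REGISTRY


def render_metrics() -> bytes:
    """Return the payload a /metrics handler would send."""
    return generate_latest(build_registry())
```

In multiprocess mode the collector raises a ValueError if the directory named by the variable does not exist, so the directory has to be created before the app starts.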

neilferreira avatar Dec 08 '21 06:12 neilferreira

Hello, I am having the same issue. I am running my Python app using gunicorn and the metrics are really very strange. I have followed your solution (except that I commented out the "start_http_server" line) but it did not work. Any idea, please? Thanks

nazzour avatar Dec 27 '21 20:12 nazzour

If you visit your /metrics page, does it look like this?

# HELP foo_http_requests_total Multiprocess metric
# TYPE foo_http_requests_total counter

Importantly, does the HELP line say "Multiprocess metric"? That indicates multiprocess mode is in use.

If not, can you confirm that you're setting the prometheus_multiproc_dir environment variable and that the directory exists on your server/computer? If you have the means to do so, you can drop some debug statements into this chunk of code to determine what is going on: https://github.com/trallnag/prometheus-fastapi-instrumentator/blame/master/prometheus_fastapi_instrumentator/instrumentation.py#L257
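
A quick way to run that check from inside the container is a small standalone script. The sketch below is only an assumption of what such a check could look like. The function name check_multiproc_dir is hypothetical, and it looks at both spellings of the variable mentioned in this thread.

```python
import os
from typing import Mapping, Optional

# Both spellings that appear in this thread.
ENV_NAMES = ("PROMETHEUS_MULTIPROC_DIR", "prometheus_multiproc_dir")


def check_multiproc_dir(environ: Optional[Mapping[str, str]] = None) -> str:
    """Describe whether the multiprocess directory is configured and usable."""
    env = os.environ if environ is None else environ
    for name in ENV_NAMES:
        path = env.get(name)
        if not path:
            continue
        if not os.path.isdir(path):
            return f"{name}={path} is set, but the directory does not exist"
        if not os.access(path, os.W_OK):
            return f"{name}={path} exists, but is not writable"
        return f"ok: {name}={path}"
    return "multiprocess directory variable is not set"


if __name__ == "__main__":
    print(check_multiproc_dir())
```

Run it with the same user and environment as the gunicorn workers, otherwise the result says little about what the app actually sees.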

neilferreira avatar Dec 29 '21 13:12 neilferreira

"modify the code so that the instrumentator publishes the data of the multiprocess collector in the main app."

@saschnet could you elaborate more on this?

IWillPull avatar May 17 '22 13:05 IWillPull

I only published the endpoint to a different port as explained so far. But I think simply exposing the endpoint as described in the documentation should be sufficient: https://github.com/trallnag/prometheus-fastapi-instrumentator#exposing-endpoint

Have you tried that yet?

saschnet avatar Jun 27 '22 18:06 saschnet

Hello, sorry to come back to this issue, but I'm trying to follow your instructions to set up a different port for the metrics and it does not seem to work. Could you please help me? I'm using the same Docker image.

Thanks Paz

Pazzeo avatar Jan 16 '23 14:01 Pazzeo

Fixed in #42 / #217

trallnag avatar Feb 22 '23 22:02 trallnag