
Optimize gunicorn settings when running with Docker

Open TimMcCauley opened this issue 7 years ago • 19 comments

Sporadically the gunicorn workers time out - this may be due to the worker class settings: http://docs.gunicorn.org/en/stable/settings.html

TimMcCauley avatar Apr 14 '18 08:04 TimMcCauley

https://pythonspeed.com/articles/gunicorn-in-docker/

TimMcCauley avatar May 22 '19 13:05 TimMcCauley

@zephylac have you any experience with gunicorn settings? Sometimes requests are timing out on our live servers using the following settings:

workers = 2
worker_class = 'gevent'
worker_connections = 1000
timeout = 30
keepalive = 2

I am now trying the following settings instead, which are recommended in the post above.

worker_class = 'gthread'
threads = 4
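For reference, a minimal sketch of a complete `gunicorn_config.py` combining these settings. The `2 * cores + 1` worker formula is the one suggested in the gunicorn docs, and `worker_tmp_dir = "/dev/shm"` is the Docker heartbeat fix from the pythonspeed article linked above; treat the exact numbers as starting points, not a recommendation:

```python
# gunicorn_config.py -- sketch combining the settings discussed above.
import multiprocessing

# Gunicorn docs suggest (2 x cores) + 1 as a starting point.
workers = multiprocessing.cpu_count() * 2 + 1
worker_class = "gthread"
threads = 4
timeout = 30
keepalive = 2

# /dev/shm is a tmpfs; keeping gunicorn's worker heartbeat file there
# avoids spurious worker timeouts caused by slow container disks.
worker_tmp_dir = "/dev/shm"
```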

TimMcCauley avatar May 22 '19 13:05 TimMcCauley

I don't have any experience with gunicorn but I can have a look into it and try to find some info.

I'm currently spamming my instance with requests but I haven't experienced any timeouts (for now).

zephylac avatar May 22 '19 13:05 zephylac

I've looked into it a little bit. In the article you mentioned they were also talking about --worker-tmp-dir, which might cause problems for workers.

I've already seen some info about the threads option. Opinions seemed to converge on threads = workers. It seems that the [solution](https://www.brianstorti.com/the-role-of-a-reverse-proxy-to-protect-your-application-against-slow-clients/) some found was to put NGINX in front of gunicorn.

On my side I've tried to make my workers time out (without changing the current gunicorn parameters). Both under extreme load and at rest, my workers don't seem to time out.
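For anyone who wants to reproduce this kind of load test, here is a rough sketch. The URL, coordinates, and payload shape are placeholders (assumed from the general shape of openpoiservice requests); point them at your own instance and a real bounding box before using it:

```python
# Sketch of a crude concurrent load test to try to trigger worker timeouts.
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

URL = "http://localhost:5000/pois"  # hypothetical local openpoiservice

def make_payload(lon, lat, radius_m=2000):
    # Minimal openpoiservice-style request body (structure assumed;
    # adjust to your deployment / API version).
    return {
        "request": "pois",
        "geometry": {
            "geojson": {"type": "Point", "coordinates": [lon, lat]},
            "buffer": radius_m,
        },
    }

def fire(payload):
    req = urllib.request.Request(
        URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    try:
        # Client timeout slightly above gunicorn's 30 s worker timeout,
        # so server-side worker kills surface as errors here.
        with urllib.request.urlopen(req, timeout=35) as resp:
            return resp.status
    except Exception as exc:
        return repr(exc)

if __name__ == "__main__":
    payloads = [make_payload(8.34, 48.23) for _ in range(200)]
    with ThreadPoolExecutor(max_workers=50) as pool:
        for status in pool.map(fire, payloads):
            print(status)
```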

zephylac avatar May 22 '19 14:05 zephylac

Thanks for looking this up @zephylac - if you are running your batch requests, could you also run them against api.openrouteservice.org at the same time? I can send you a token allowing a higher quota - if you agree - which email could I send the token to?

TimMcCauley avatar May 22 '19 15:05 TimMcCauley

I've sent you an email!

zephylac avatar May 22 '19 15:05 zephylac

What architecture are you running your service on? Are you using Docker? Are you running on a VM or on dedicated hardware?

zephylac avatar May 23 '19 12:05 zephylac

We are running this on a VM in our openstack environment with 32GB RAM and 8 cores. The postgis database is running on a different, smaller VM, unfortunately with very slow disks (which will soon be upgraded to SSDs). The containers running on this VM are:

ubuntu@ors-microservices:~|⇒  sudo docker ps
CONTAINER ID        IMAGE                                      COMMAND                  CREATED             STATUS              PORTS                      NAMES
68404976f9d6        openelevationservice_gunicorn_flask_2      "/oes_venv/bin/gun..."   8 weeks ago         Up 2 days           0.0.0.0:5021->5000/tcp     openelevationservice_gunicorn_flask_2_1
6959766a7ee9        openelevationservice_gunicorn_flask        "/oes_venv/bin/gun..."   8 weeks ago         Up 2 days           0.0.0.0:5020->5000/tcp     openelevationservice_gunicorn_flask_1
ec736d4cd30c        openpoiservice_gunicorn_flask_05122018_2   "/ops_venv/bin/gun..."   5 months ago        Up 24 hours         0.0.0.0:5006->5000/tcp     openpoiservice_gunicorn_flask_05122018_2_1
c62417a4f60e        openpoiservice_gunicorn_flask_05122018     "/ops_venv/bin/gun..."   5 months ago        Up 24 hours         0.0.0.0:5005->5000/tcp     openpoiservice_gunicorn_flask_05122018_1

TimMcCauley avatar May 23 '19 13:05 TimMcCauley

Do the workers time out even when idle? Or just under load?

I've looked at my logs; none of my workers have timed out during 1 week of intense load.

zephylac avatar May 23 '19 13:05 zephylac

Some requests will simply time out but I haven't found a pattern for this yet.
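One way to hunt for a pattern is to bucket the `WORKER TIMEOUT` lines from the gunicorn error log by hour. A sketch (the log path is a placeholder for your deployment; gunicorn logs `[CRITICAL] WORKER TIMEOUT (pid:N)` when it kills a hung worker):

```python
# Sketch: histogram of gunicorn worker timeouts per hour.
import collections
import re

def timeout_histogram(lines):
    # Matches lines like:
    # "[2019-05-22 14:03:01 +0000] [12] [CRITICAL] WORKER TIMEOUT (pid:34)"
    hist = collections.Counter()
    for line in lines:
        if "WORKER TIMEOUT" in line:
            m = re.match(r"\[(\d{4}-\d{2}-\d{2}) (\d{2})", line)
            if m:
                hist[f"{m.group(1)} {m.group(2)}:00"] += 1
    return hist

if __name__ == "__main__":
    with open("/var/log/gunicorn/error.log") as fh:  # placeholder path
        for hour, count in timeout_histogram(fh).most_common():
            print(hour, count)
```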

TimMcCauley avatar May 23 '19 20:05 TimMcCauley

Maybe PostgreSQL 12 & PostGIS 3 will fix part of this issue by properly supporting parallelization.

zephylac avatar Jun 06 '19 15:06 zephylac

Agreed. Did you test the live API with the token I sent you by any chance @zephylac ?

TimMcCauley avatar Jun 07 '19 20:06 TimMcCauley

Yup I tried but it seems it has expired.

zephylac avatar Jun 09 '19 08:06 zephylac

Ah shit, sorry - it's now extended forever ;-) and won't expire anymore (same token as in the email).

TimMcCauley avatar Jun 09 '19 17:06 TimMcCauley

Hi @TimMcCauley, obviously it has been a while, but as I am facing the same issue you described (random timeouts with larger batches of POI requests using docker) I am wondering, if you have found a solution?

boind12 avatar Nov 10 '20 18:11 boind12

Maybe this article helps? https://pythonspeed.com/articles/gunicorn-in-docker/

lingster avatar Nov 14 '20 11:11 lingster

Hi @lingster, this link was mentioned earlier by Tim. I was unable to solve the problem using it.

boind12 avatar Nov 15 '20 15:11 boind12

Sorry for joining the party so late.

@boind12 could you run ANALYZE in the ops schema once and check again? What kind of requests are you running and are you able to do the same directly in SQL and see how it behaves (you can print the sql query and fill the placeholders manually)? How much memory are you giving Docker and have you played around with pgtune settings? In a nutshell: it's most likely a postgres issue.

TimMcCauley avatar Nov 15 '20 18:11 TimMcCauley

Hi @TimMcCauley, thanks for your support! I am using the following setup:

  • Host: 16GB, 2vCPU with 50GB SSD (Google Cloud e2-highmem)
  • The host is running:
    • 1x Openrouteservice: https://github.com/GIScience/openrouteservice
    • 1x Openpoiservice
    • 1x postgis: https://hub.docker.com/r/kartoza/postgis/

I am running large batch requests for POIs with >50 km² areas, so I assume some of them take longer than openpoiservice's 30 s gunicorn timeout. By increasing the gunicorn timeout to 60 s I was able to solve the issue. However, I have now migrated the postgis database from the VM to a dedicated Google PostgreSQL instance. Maybe this helps further.
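The timeout change is a one-liner in the gunicorn config; a sketch (60 s is what worked here, so tune it to your slowest expected query rather than copying the value):

```python
# gunicorn_config.py fragment -- give long-running POI queries more
# headroom than the 30 s default before the worker is killed.
timeout = 60
graceful_timeout = 60  # also give workers time to finish on restart
```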

boind12 avatar Nov 17 '20 07:11 boind12