scalene how to work with gunicorn?

Feb 19 '20 02:02 koolay

Can you describe the issue in more detail?

Feb 19 '20 02:02 emeryberger

I want to trace my gunicorn web app.
How does scalene work together with gunicorn?

Feb 19 '20 02:02 koolay

I have never used gunicorn but it's quite likely that scalene doesn't work with it yet. What happens when you try it?

Feb 19 '20 05:02 emeryberger

It's not obvious to me what needs to be done to make this work, and yours is the first and only request so far about gunicorn. Here are some pointers on how to profile with gunicorn (notably, with cProfile, which does not do line-level profiling or memory profiling). If these don't do the trick for you, let me know. Test code is also always welcome.

https://medium.com/@maxmaxmaxmax/measuring-performance-of-python-based-apps-using-gunicorn-and-cprofile-ee4027b2be41
https://github.com/what-studio/profiling/issues/31

Feb 23 '20 01:02 emeryberger

Just popping in to say that I too would be very interested in doing memory profiling of a gunicorn web app. Particularly with a high performance ASGI based framework such as FastAPI. But those details are probably not pertinent 😄 .

May 02 '20 20:05 Kilo59

If you can provide me with an example app I can run, that would help immeasurably.

May 02 '20 20:05 emeryberger

@emeryberger Here's the app I'm actually interested in profiling; it's probably more complicated than you need for a simple example. https://github.com/ExpDev07/coronavirus-tracker-api I had been using Locust to load test against it and was trying to figure out the best way to combine that with a profiler. If you're interested, I can send you the locustfile I had set up too.

I'll try to find you a simpler toy app (or build one) to play around with.

May 02 '20 22:05 Kilo59

Whatever you can send me that lets me run it out of the box is great, and yes, the simpler, the better - thanks!

May 03 '20 01:05 emeryberger

Hi Emery,

I set up a sample gunicorn/flask app here (set up for cProfile):

https://github.com/calpaterson/example-gunicorn-app-profiling

This is probably one of the more common setups for a web application. You tend to have a (more substantial) web application with loads of request endpoints and you set it up like so, then you start it and use it a bit, and then when you shut it down you get the profile data saved to a pstats file. That file can then be visualised in snakeviz or similar.

The tricky thing about gunicorn/uwsgi/uvicorn and friends is that they are the entrypoint to your program: they call you via the WSGI protocol. This means you need to use their hooks to set up your profiling; it's not enough to try to do something like python -m my_profile gunicorn.

Looks like Scalene.start() and Scalene.stop() would be the analogous functions to use for Scalene. I can sort of see the path to using them but maybe they need a bit of adjustment to ensure they can find whatever file is under profile - cProfile doesn't care about that and just profiles "everything".

Hope that helps

May 17 '20 21:05 calpaterson

I will subscribe to this thread, because I also have a gunicorn app, and resource usage profiling is often needed.

Apr 07 '22 12:04 frankiedrake

Using this code, I just successfully got a profile with Scalene using the following command: python3 -m scalene --reduced-profile --profile-all --profile-only "/app.py" `which gunicorn` app:app

Jul 30 '22 01:07 emeryberger

When does that example write output? When cancelling? or during the gunicorn process?

May 08 '23 22:05 nambrosini-codes

@emeryberger I followed your advice and I'm able to get scalene to run. But how do I get it to output to a file?

scalene --profile-all --outfile /application/profiler.html --html --reduced-profile --profile-only premagic `which gunicorn` --preload premagic:app -b 0.0.0.0:5000

This is what I'm trying. I'm running this in a docker container. I'm expecting it to push info to the file, or at least to do so when the program exits.

@nambrosini-codes Were you able to figure out how to get the output in a file?

Sep 02 '23 11:09 mevinbabuc

An update; although it appears to run, it does not create an output and none of the requests are processed from the web server. Is there a way to run scalene like memory_profiler ?

Sep 03 '23 13:09 mevinbabuc

You have a few choices:

If the web app ends normally, then it should produce output.
If you are going to run it continuously, you can run it in the background and then send it a --off signal - this will both suspend profiling and cause it to output a profile. You can send an --on signal to resume profiling.

From scalene --help:

When running Scalene in the background, you can suspend/resume profiling
for the process ID that Scalene reports. For example:

 % python3 -m scalene  yourprogram.py &
 Scalene now profiling process 12345
   to suspend profiling: python3 -m scalene.profile --off --pid 12345
   to resume profiling:  python3 -m scalene.profile --on  --pid 12345
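
For scripted use (for example, dumping a profile after a load test), the suspend/resume commands from the help text can be wrapped in a small helper. This is only a sketch: it assumes Scalene is installed for the same interpreter, the function names are made up for illustration, and the PID must be the one Scalene reports at startup.

```python
import subprocess
import sys


def scalene_toggle_command(pid, enable):
    """Build the argv shown in `scalene --help` for resuming or suspending."""
    flag = "--on" if enable else "--off"
    return [sys.executable, "-m", "scalene.profile", flag, "--pid", str(pid)]


def toggle_profiling(pid, enable):
    """Run the command and return its exit code (needs Scalene installed)."""
    return subprocess.run(scalene_toggle_command(pid, enable), check=False).returncode


if __name__ == "__main__":
    # 12345 is the placeholder PID from the help text, not a real process.
    print(" ".join(scalene_toggle_command(12345, enable=False)))
```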

You can run it with the --profile-interval option to have it output a profile every N seconds.

  --profile-interval PROFILE_INTERVAL
                        output profiles every so many seconds (default: inf)

Here is a command line I just used to generate JSON files which can be loaded into the GUI (http://plasma-umass.org/scalene-gui/; I should probably add a command-line option to Scalene to run this locally):

python3.11 -m scalene --json --outfile profile.json --profile-interval 5 --profile-all --profile-exclude site-packages,framework --- -m gunicorn app:app

Please give one or more of these a try and let me know how it works out!

Sep 04 '23 00:09 emeryberger

@mevinbabuc

An update; although it appears to run, it does not create an output and none of the requests are processed from the web server. Is there a way to run scalene like memory_profiler ?

I wonder if you solved this problem somehow? I've also encountered the problem of the Gunicorn app not responding when launched through Scalene.

Oct 10 '23 21:10 AndreiPashkin

Hope it's ok for me to jump in - I've just come across scalene and was keen to use it on our gunicorn app for analysing memory usage by line, but haven't got it working yet...

Launching

In terms of how to launch it, I get this error when trying to run gunicorn via scalene as above with a second -m after ---: python -m scalene --json --outfile profile.json --- -m gunicorn -h

Scalene failed to initialize.
Traceback (most recent call last):
  File "site-packages/scalene/scalene_profiler.py", line 207, in _get_module_details
  File "site-packages/scalene/scalene_profiler.py", line 218, in _get_module_details
ImportError: 'gunicorn.__main__' is a namespace package and cannot be executed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "site-packages/scalene/scalene_profiler.py", line 2063, in run_profiler
  File "site-packages/scalene/scalene_profiler.py", line 211, in _get_module_details
ImportError: 'gunicorn.__main__' is a namespace package and cannot be executed; 'gunicorn' is a package and cannot be directly executed

(within a venv where python -m gunicorn -h works and shows me the gunicorn help output)

Whereas I get much further if I do this instead: python -m scalene --json --outfile profile.json --- /full/path/to/venv/bin/gunicorn -h

Output profile has no memory data

However sadly the profile.json contains no memory data for some reason:

a profile.json is produced
"samples" is either empty or has a small number of entries all with 0 as the second number
"files" only has python3.11/threading.py and python3.11/site-packages/gunicorn/arbiter.py both with some CPU usage recorded on a few lines, but all the memory numbers are zero and always "memory_samples": []
I'm wondering whether scalene is only seeing things in the parent gunicorn process and not the child workers? though it's still odd that there is no memory data for the parent either
this is all on amazonlinux running in Docker
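
A quick way to repeat the check described above across several profile.json files is a small script that looks for any non-empty "memory_samples" list or any "samples" entry with a nonzero second value. This is a sketch only: it assumes nothing about the JSON layout beyond the key names quoted above and simply searches the whole document for them; the function names are hypothetical.

```python
import json


def collect(node, key, found=None):
    """Recursively gather every value stored under `key` anywhere in the JSON."""
    if found is None:
        found = []
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                found.append(v)
            collect(v, key, found)
    elif isinstance(node, list):
        for item in node:
            collect(item, key, found)
    return found


def has_memory_data(profile):
    """True if any "memory_samples" list is non-empty, or any "samples"
    entry has a nonzero second element."""
    if any(collect(profile, "memory_samples")):
        return True
    for samples in collect(profile, "samples"):
        if not isinstance(samples, list):
            continue
        for entry in samples:
            if isinstance(entry, (list, tuple)) and len(entry) > 1 and entry[1]:
                return True
    return False


# Usage (the file name matches the --outfile value used above):
#   with open("profile.json") as f:
#       print(has_memory_data(json.load(f)))
```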

I've tried with --profile-all and --malloc-threshold 10 to try and increase the chances of data being captured from my request handlers, and ensured that some non-trivial requests were handled by gunicorn (before doing python -m scalene.profile --off --pid NN to make it dump a profile.json) but still no luck.

(Edit: incidentally it's not clear to me which PID to direct the --off at in the process tree. I've tried each level separately, typically resulting in either no profile.json or the container exiting, whereas strangely if I do it to all matching processes like this pgrep -f "\-m scalene " | xargs -L1 -i sh -c 'echo; echo Poking scalene in PID {}; python -m scalene.profile --off --pid {}' then I get a 2 MB profile.json, though with no memory data as described above... 🤷‍♂️)

Unexpected extra process in tree?

Interestingly without scalene the process tree looks like this in htop:

python -m gunicorn my.wsgi:application --bind 0.0.0.0:8002 --worker-class gthread --threads 2 --workers 1 ...
└── python -m gunicorn my.wsgi:application --bind 0.0.0.0:8002 --worker-class gthread --threads 2 --workers 1 ...

but when run with scalene as above it looks like this instead:

python -m scalene --json --outfile profile.json --profile-all ... --- /path/to/bin/gunicorn my.wsgi:application --bind 0.0.0.0:8002 ...
└── python -m scalene --json --outfile profile.json --profile-all ... --- /path/to/bin/gunicorn my.wsgi:application --bind 0.0.0.0:8002 ...
    └── python -m scalene --json --outfile profile.json --profile-all ... --- /path/to/bin/gunicorn my.wsgi:application --bind 0.0.0.0:8002 ...

so I'm not sure why there's an extra process in the mix and whether this might be part of the issue.

I don't want to muddy the waters here and am happy to open a separate issue if that would help - equally I'm all ears for anything I can try to get this working properly!

Oct 12 '23 12:10 sparrowt
