scalene
how to work with gunicorn?
Can you describe the issue in more detail?
I want to trace my gunicorn web app.
How does scalene work together with gunicorn?
I have never used gunicorn but it's quite likely that scalene doesn't work with it yet. What happens when you try it?
It's not obvious to me what needs to be done to make this work, and yours is the first and only request so far about gunicorn. Here are some pointers on how to profile with gunicorn (notably, with cProfile, which does not do line-level profiling or memory profiling). If these don't do the trick for you, let me know. Test code is also always welcome.
https://medium.com/@maxmaxmaxmax/measuring-performance-of-python-based-apps-using-gunicorn-and-cprofile-ee4027b2be41 https://github.com/what-studio/profiling/issues/31
Just popping in to say that I too would be very interested in doing memory profiling of a gunicorn web app.
Particularly with a high performance ASGI based framework such as FastAPI. But those details are probably not pertinent 😄 .
If you can provide me with an example app I can run, that would help immeasurably.
@emeryberger
Here's the app I'm actually interested in profiling, it's probably more complicated than you need to work as a simple example.
https://github.com/ExpDev07/coronavirus-tracker-api
I had been using Locust to load test against it, and was trying to figure out the best way to combine that with a profiler.
If you're interested I can send you the locustfile I had set up too.
I'll try to find you a simpler toy app (or build one) to play around with.
Whatever you can send me that lets me run it out of the box is great, and yes, the simpler, the better - thanks!
Hi Emery,
I set up a sample gunicorn/flask app here (set up for cProfile):
https://github.com/calpaterson/example-gunicorn-app-profiling
This is probably one of the more common setups for a web application. You tend to have a (more substantial) web application with loads of request endpoints and you set it up like so, then you start it and use it a bit, and then when you shut it down you get the profile data saved to a pstats file. That file can then be visualised in snakeviz or similar.
The tricky thing about gunicorn/uwsgi/uvicorn and friends is that they are the entrypoint to your program - they call you via the WSGI protocol. This means you need to use their hooks to set up your profiling; it's not enough to try something like python -m my_profile gunicorn.
Looks like Scalene.start() and Scalene.stop() would be the analogous functions to use for Scalene. I can sort of see the path to using them but maybe they need a bit of adjustment to ensure they can find whatever file is under profile - cProfile doesn't care about that and just profiles "everything".
Hope that helps
I will subscribe to this thread, because I also have a gunicorn app and often need resource usage profiling.
Using this code, I just successfully got a profile with Scalene using the following command:
python3 -m scalene --reduced-profile --profile-all --profile-only "/app.py" `which gunicorn` app:app
When does that example write output? When cancelling? or during the gunicorn process?
@emeryberger I followed your advice and I'm able to get scalene to run. But how do I get it to output to a file?
scalene --profile-all --outfile /application/profiler.html --html --reduced-profile --profile-only premagic `which gunicorn` --preload premagic:app -b 0.0.0.0:5000
This is what I'm trying. I'm running this in a docker container. I'm expecting it to push info to the file, or at least to do so when the program exits.
@nambrosini-codes Were you able to figure out how to get the output in a file?
An update; although it appears to run, it does not create an output and none of the requests are processed from the web server. Is there a way to run scalene like memory_profiler ?
You have a few choices:
- If the web app ends normally, then it should produce output.
- If you are going to run it continuously, you can run it in the background and then send it a --off signal - this will both suspend profiling and cause it to output a profile. You can send an --on signal to resume profiling.
From scalene --help:
When running Scalene in the background, you can suspend/resume profiling
for the process ID that Scalene reports. For example:
% python3 -m scalene yourprogram.py &
Scalene now profiling process 12345
to suspend profiling: python3 -m scalene.profile --off --pid 12345
to resume profiling: python3 -m scalene.profile --on --pid 12345
- You can run it with the --profile-interval option to have it output a profile every N seconds.
--profile-interval PROFILE_INTERVAL
output profiles every so many seconds (default: inf)
Here is a command line I just used to generate JSON files which can be loaded into the GUI (http://plasma-umass.org/scalene-gui/; I should probably add a command-line option to Scalene to run this locally):
python3.11 -m scalene --json --outfile profile.json --profile-interval 5 --profile-all --profile-exclude site-packages,framework --- -m gunicorn app:app
Please give one or more of these a try and let me know how it works out!
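As a rough sketch, the periodic-output command line above could be assembled and launched from Python as follows. The function name build_scalene_command and its parameters are hypothetical. It passes the path of the gunicorn launcher script (as in the `which gunicorn` commands earlier in this thread) rather than -m gunicorn, and it uses only the Scalene flags already shown above.

```python
import shutil
import sys


def build_scalene_command(app_spec, outfile="profile.json",
                          interval_seconds=5, gunicorn_path=None):
    """Return an argv list that launches gunicorn under Scalene.

    Hypothetical helper: only flags quoted in this thread are used.
    """
    # Resolve the gunicorn launcher script on PATH, like `which gunicorn`.
    gunicorn = gunicorn_path or shutil.which("gunicorn")
    if gunicorn is None:
        raise FileNotFoundError("gunicorn executable not found on PATH")
    return [
        sys.executable, "-m", "scalene",
        "--json", "--outfile", outfile,
        "--profile-interval", str(interval_seconds),  # write a profile every N seconds
        "--profile-all",
        "---", gunicorn, app_spec,
    ]
```

The resulting list can be handed to subprocess.Popen(build_scalene_command("app:app")) so that the server keeps running in the background while profiles are written.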
@mevinbabuc
An update; although it appears to run, it does not create an output and none of the requests are processed from the web server. Is there a way to run scalene like memory_profiler ?
I wonder if you solved this problem somehow? I've also encountered the problem of the Gunicorn app not responding if launched through Scalene.
Hope it's ok for me to jump in - I've just come across scalene and was keen to use it on our gunicorn app for analysing memory usage by line, but haven't got it working yet...
Launching
In terms of how to launch it, I get this error when trying to run gunicorn via scalene as above with a second -m after ---:
python -m scalene --json --outfile profile.json --- -m gunicorn -h
Scalene failed to initialize.
Traceback (most recent call last):
File "site-packages/scalene/scalene_profiler.py", line 207, in _get_module_details
File "site-packages/scalene/scalene_profiler.py", line 218, in _get_module_details
ImportError: 'gunicorn.__main__' is a namespace package and cannot be executed
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "site-packages/scalene/scalene_profiler.py", line 2063, in run_profiler
File "site-packages/scalene/scalene_profiler.py", line 211, in _get_module_details
ImportError: 'gunicorn.__main__' is a namespace package and cannot be executed; 'gunicorn' is a package and cannot be directly executed
(within a venv where python -m gunicorn -h works and shows me the gunicorn help output)
Whereas I get much further if I do this instead:
python -m scalene --json --outfile profile.json --- /full/path/to/venv/bin/gunicorn -h
Output profile has no memory data
However sadly the profile.json contains no memory data for some reason:
- a profile.json is produced
"samples"is either empty or has a small number of entries all with0as the second number"files"only haspython3.11/threading.pyandpython3.11/site-packages/gunicorn/arbiter.pyboth with some CPU usage recorded on a few lines, but all the memory numbers are zero and always"memory_samples": []- I'm wondering whether scalene is only seeing things in the parent gunicorn process and not the child workers? though it's still odd that there is no memory data for the parent either
- this is all on amazonlinux running in Docker
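A check like the one described in the list above can be scripted. The sketch below is an assumption-laden helper, not part of Scalene: it walks the whole JSON document and counts entries under any "memory_samples" key wherever it appears, so it does not depend on the exact nesting of the profile format. It does assume that "files" is a mapping keyed by file path.

```python
import json


def count_memory_samples(node):
    """Recursively count list entries stored under any "memory_samples" key."""
    if isinstance(node, dict):
        total = 0
        for key, value in node.items():
            if key == "memory_samples" and isinstance(value, list):
                total += len(value)
            else:
                total += count_memory_samples(value)
        return total
    if isinstance(node, list):
        return sum(count_memory_samples(item) for item in node)
    return 0


def summarize_profile(path):
    """Report which files appear in a profile and how many memory samples it has."""
    with open(path) as f:
        profile = json.load(f)
    return {
        # Assumes "files" maps file paths to per-file data.
        "files": sorted(profile.get("files", {})),
        "memory_sample_count": count_memory_samples(profile),
    }
```

A memory_sample_count of 0 from summarize_profile("profile.json") corresponds to the "no memory data" symptom reported here.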
I've tried with --profile-all and --malloc-threshold 10 to try and increase the chances of data being captured from my request handlers, and ensured that some non-trivial requests were handled by gunicorn (before doing python -m scalene.profile --off --pid NN to make it dump a profile.json) but still no luck.
(Edit: incidentally it's not clear to me which PID to direct the --off at in the process tree - I've tried each level separately, typically resulting in either no profile.json or the container exiting, whereas strangely if I do it to all matching processes like this pgrep -f "\-m scalene " | xargs -L1 -i sh -c 'echo; echo Poking scalene in PID {}; python -m scalene.profile --off --pid {}' then I get a 2 MB profile.json, though with no memory data as described above... 🤷‍♂️)
Unexpected extra process in tree?
Interestingly without scalene the process tree looks like this in htop:
python -m gunicorn my.wsgi:application --bind 0.0.0.0:8002 --worker-class gthread --threads 2 --workers 1 ...
└── python -m gunicorn my.wsgi:application --bind 0.0.0.0:8002 --worker-class gthread --threads 2 --workers 1 ...
but when run with scalene as above it looks like this instead:
python -m scalene --json --outfile profile.json --profile-all ... --- /path/to/bin/gunicorn my.wsgi:application --bind 0.0.0.0:8002 ...
└── python -m scalene --json --outfile profile.json --profile-all ... --- /path/to/bin/gunicorn my.wsgi:application --bind 0.0.0.0:8002 ...
└── python -m scalene --json --outfile profile.json --profile-all ... --- /path/to/bin/gunicorn my.wsgi:application --bind 0.0.0.0:8002 ...
so I'm not sure why there's an extra process in the mix and whether this might be part of the issue.
I don't want to muddy the waters here and am happy to open a separate issue if that would help - equally I'm all ears for anything I can try to get this working properly!