uwsgi
Using `max-worker-lifetime` kills workers without waiting for pending requests to finish
It seems that if one uses `max-worker-lifetime`, the master process will trigger a kill towards the worker process, which will immediately exit without waiting for the pending request to finish.
In order to reproduce this, configure the following:
master = true
workers = 2
threads = 0
enable-threads = false
# Python related configuration
# NOTE: See the note below about this small limit.
max-worker-lifetime = 60
reload-mercy = 30
worker-reload-mercy = 30
reload-on-exception = true
exit-on-reload = true
reaper = true
die-on-term = true
Then start, in a loop, requests that take, say, 10% of that `max-worker-lifetime` (a simple Python app that just does `time.sleep(6)` would do). Thus almost one in 10 requests will be reset due to the master killing the worker process.
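A minimal sketch of such an app, assuming the 60-second lifetime from the configuration above. The `delay` parameter and the `DELAY_SECONDS` name are illustrative additions so that the sleep can be shortened when trying it out; uWSGI itself only passes the two standard WSGI arguments.

```python
import time

# Assumption: 10% of max-worker-lifetime = 60, as described above.
DELAY_SECONDS = 6


def application(environ, start_response, delay=DELAY_SECONDS):
    # Hold the request open long enough that a worker recycle is
    # likely to land while it is still in flight.
    time.sleep(delay)
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"done\n"]
```

With sequential requests of about 6 seconds each, roughly ten fit into one 60-second lifetime, so about one of them is in flight each time the worker is recycled.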
Regarding the `max-worker-lifetime = 60` limit: in production I used `3600` (which is quite enough); however, in order to reliably trigger this issue, I set the limit to a small value here.
Had a similar issue with still running requests being killed when using max_requests. It only seems to take the last spawn into account, while there can still be running requests.
@timdrijvers Have you found a workaround?
(In my case I just ignored the issue, and restart the whole application once every couple of weeks, as I don't have any "heavy" memory leaks.) :(
I am experiencing this as well.
Same here, see https://stackoverflow.com/questions/58731398/uwsgi-worker-respawning-although-the-request-is-not-yet-finished
I just experienced this also. 😞
I'm also hitting this issue, and I think it is the cause of the production issue seen in #2480.
For me, what happens is:
- uWSGI starts the worker loop here https://github.com/unbit/uwsgi/blob/066b7fdf1bfa12f95652b8bd8cbd3532c8096b91/core/uwsgi.c#L3714
- For all additional threads (id > 0) it will start the simple_loop_run function: https://github.com/unbit/uwsgi/blob/066b7fdf1bfa12f95652b8bd8cbd3532c8096b91/core/loop.c#L62
- All threads (additional and main thread) will loop on https://github.com/unbit/uwsgi/blob/066b7fdf1bfa12f95652b8bd8cbd3532c8096b91/core/loop.c#L138
- When the lifetime is reached, the master requests the worker to stop processing by changing manage_next_request (https://github.com/unbit/uwsgi/blob/066b7fdf1bfa12f95652b8bd8cbd3532c8096b91/core/master_checks.c#L229)
- So all worker threads will exit the loop:
  - any additional thread will just terminate without doing anything more (end of the simple_loop_run function)
  - BUT the main thread has more functions in its stack, and will reach https://github.com/unbit/uwsgi/blob/066b7fdf1bfa12f95652b8bd8cbd3532c8096b91/core/uwsgi.c#L3722
- So the main thread, when exiting, will exit() the process, which does not wait for the additional threads to finish processing.
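The sequence above can be illustrated with a small Python simulation. This is only a sketch of the race, not uWSGI code: `simulate_worker`, `request_time` and the in-memory `log` are hypothetical names, and the returned snapshot stands in for the state at the moment the main thread would call exit().

```python
import threading
import time


def simulate_worker(wait_for_threads, request_time=0.2):
    log = []
    started = threading.Event()

    def extra_thread():
        # An additional thread picks up a request...
        log.append("start")
        started.set()
        time.sleep(request_time)  # ...and is still busy handling it
        log.append("stop")

    thread = threading.Thread(target=extra_thread, daemon=True)
    thread.start()
    started.wait()

    # The main thread leaves its own loop here (lifetime reached).
    if wait_for_threads:
        thread.join()  # let the in-flight request complete first

    # Snapshot of what has been logged when the process would exit.
    return list(log)
```

Without the join, the snapshot holds a "start" with no matching "stop", which mirrors the interrupted requests described above. With the join, both entries are present.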
I can reproduce the HTTP errors with the following:
cat > wsgi.py << EOF
import time
import uuid

def application(env, start_response):
    req_id = uuid.uuid4()
    with open("/tmp/req.log", "a") as fd:
        fd.write(f"start {req_id}\n")
    time.sleep(1)
    with open("/tmp/req.log", "a") as fd:
        fd.write(f"stop {req_id}\n")
    start_response('200 OK', [('Content-Type','text/html')])
    return [b"Hello World\n"]
EOF
uwsgi --module wsgi:application --http 127.0.0.1:8080 --master --workers 1 --threads 2 --max-worker-lifetime 20
Have a client do requests in a loop (I'm using `sh -ec` to stop on the first error):
time sh -ec 'while true; do curl http://localhost:8080/;done'
I usually get an error on the 1st or 2nd restart of the worker (i.e. within 20 to 40 seconds). In addition, if you look at req.log, you will see that on error some requests did not finish (a `start $UUID` is present without a `stop $UUID`).
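To check req.log without reading it by eye, something like the following could be used. It is a sketch that assumes the two-word `start <id>` / `stop <id>` line format written by the wsgi.py above; the function name `unfinished_requests` is made up for this example.

```python
def unfinished_requests(lines):
    """Return the request IDs that have a 'start' line but no 'stop' line."""
    open_ids = []
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        event, req_id = parts
        if event == "start":
            open_ids.append(req_id)
        elif event == "stop" and req_id in open_ids:
            open_ids.remove(req_id)
    return open_ids
```

For example, `unfinished_requests(open("/tmp/req.log"))` would list the requests that were cut off by a worker restart, or still in flight when the log was read.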
None of this behavior happens when adding a `wait_for_threads();` just before the `end_me(0);`.
I believe this was fixed in #2626 which is now commit 06a22597bd419860904fae6f446d8e3b714f5afa
The fix would have shipped in v2.0.25, though the version's release notes are less explicit about the change than the commits https://github.com/unbit/uwsgi/compare/2.0.24...2.0.25