taskiq
Worker hangs sometimes
Sometimes the taskiq worker gets into a state where it hangs and doesn't react to signals such as SIGTERM.
We need to fix this, since it's really annoying. Maybe we should handle child processes differently.
I refactored the process manager, and there have been no hangs since then.
But I do believe the process manager can be improved so that it reacts to events instantly.
One of the proposed solutions would be:
- Create a separate thread that checks whether processes are dead;
- Remove the sleep from the main process-watcher loop;
- Update the main loop so it blocks until a new event arrives from the queue;
- Remove process checking from the main loop.
Or we can come up with a better solution.
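The steps proposed above could be sketched roughly as follows. This is only an illustration, not taskiq's actual process manager: `Event`, `liveness_checker`, `watch` and `on_death` are hypothetical names, and the workers are assumed to be objects with an `is_alive()` method, as `multiprocessing.Process` has.

```python
import queue
import threading
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Event:
    """Hypothetical event type consumed by the main loop."""

    kind: str  # "died" or "shutdown"
    worker_id: int = -1


def liveness_checker(
    workers: Sequence,
    events: queue.SimpleQueue,
    stop: threading.Event,
    interval: float = 0.05,
) -> None:
    """Runs in its own thread and reports each dead worker once."""
    reported = set()
    # stop.wait() returns True as soon as stop is set, which ends the loop.
    while not stop.wait(interval):
        for worker_id, proc in enumerate(workers):
            if worker_id not in reported and not proc.is_alive():
                reported.add(worker_id)
                events.put(Event("died", worker_id))


def watch(
    workers: Sequence,
    events: queue.SimpleQueue,
    on_death: Callable[[int], None],
) -> None:
    """Main loop: no sleep and no process checks, only a blocking get()."""
    stop = threading.Event()
    checker = threading.Thread(
        target=liveness_checker, args=(workers, events, stop), daemon=True
    )
    checker.start()
    try:
        while True:
            event = events.get()  # blocks until the next event arrives
            if event.kind == "shutdown":
                break
            if event.kind == "died":
                on_death(event.worker_id)
    finally:
        stop.set()
        checker.join()
```

The polling interval does not disappear here, it just moves into the checker thread, so the main loop can react to queue events (for example a `shutdown` event put by a signal handler) without waiting out a sleep. Restarting a dead worker is left to `on_death`; a real implementation would also need to reset the "already reported" state for a restarted worker.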
Do we know what causes this to happen? I've been experiencing it fairly often and I usually have to kill -9 the processes.
Actually, it's a mystery to me. Most probably it happens because the workers start before the signal interceptors are set up: I have only experienced it when I sent "^C" before the startup sequence was complete.
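If that guess is right, one way to rule out the race would be to register the signal handlers before any worker is started, so that a signal arriving during startup is queued rather than lost. The sketch below assumes exactly that; `install_signal_handlers`, `run`, `start_workers` and `stop_workers` are hypothetical names, not taskiq functions.

```python
import queue
import signal
from typing import Callable, List


def install_signal_handlers(signals_received: queue.SimpleQueue) -> None:
    """Register handlers that only record the signal number."""

    def handler(signum, frame):
        # SimpleQueue.put is documented as reentrant in CPython,
        # which makes it a reasonable choice inside a signal handler.
        signals_received.put(signum)

    for sig in (signal.SIGINT, signal.SIGTERM):
        signal.signal(sig, handler)


def run(
    start_workers: Callable[[], List],
    stop_workers: Callable[[List], None],
) -> int:
    signals_received = queue.SimpleQueue()
    # Handlers go in first, so a ^C during startup is not missed.
    install_signal_handlers(signals_received)
    workers = start_workers()
    # Block until a signal has been recorded, possibly during startup.
    signum = signals_received.get()
    stop_workers(workers)
    return signum
```

`signal.signal` has to be called from the main thread, so `run` is assumed to be the entry point of the main process. Child processes may also need their own handling, which this sketch does not cover.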