All workers terminate when one worker crashes
To start with, thanks to the authors of this amazing project! We are experimenting with it as a possible replacement for Celery.
Some of our tasks may segfault or be killed by the OS for consuming too much memory. Dramatiq's current behaviour is to shut down all workers when a single worker gets "lost".
This is done over here:
https://github.com/Bogdanp/dramatiq/blob/master/dramatiq/cli.py#L579
Why are all workers terminated? My (naive) expectation would be: only restart the worker that crashed.
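To make the expectation concrete, here is a minimal sketch of a parent process that replaces only the child that died and leaves the others running. It is not how dramatiq's CLI is implemented; `worker` and `supervise` are hypothetical names, the crash is simulated with `SIGKILL` (roughly what an OOM kill looks like), and the `fork` start method is assumed, so it is POSIX-only.

```python
import multiprocessing as mp
import os
import signal
import time


def worker(crash):
    # Hypothetical worker body: die abruptly, or idle until terminated.
    if crash:
        os.kill(os.getpid(), signal.SIGKILL)  # simulate an OOM kill
    while True:
        time.sleep(0.1)


def supervise(num_workers=3, crash_id=0, timeout=5.0):
    """Start workers, wait for one to die, and restart only that one.

    Returns (restarted, untouched): a list of (worker index, exit code)
    pairs, and whether every other worker kept its original PID.
    """
    ctx = mp.get_context("fork")  # assumption: POSIX with fork available
    procs = {}
    for i in range(num_workers):
        procs[i] = ctx.Process(target=worker, args=(i == crash_id,), daemon=True)
        procs[i].start()
    original_pids = {i: p.pid for i, p in procs.items()}

    restarted = []
    deadline = time.monotonic() + timeout
    while not restarted and time.monotonic() < deadline:
        for i, proc in list(procs.items()):
            if not proc.is_alive():
                proc.join()
                # A negative exit code means the child was killed by a signal.
                restarted.append((i, proc.exitcode))
                procs[i] = ctx.Process(target=worker, args=(False,), daemon=True)
                procs[i].start()
        time.sleep(0.05)

    restarted_ids = {i for i, _ in restarted}
    untouched = all(
        procs[i].is_alive() and procs[i].pid == original_pids[i]
        for i in procs
        if i not in restarted_ids
    )

    # Clean up so the sketch does not leave processes behind.
    for proc in procs.values():
        proc.terminate()
        proc.join()
    return restarted, untouched


if __name__ == "__main__":
    print(supervise(num_workers=3, crash_id=1))
```

A real supervisor would also have to decide what happens to messages the dead worker was holding, which this sketch ignores.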
Checklist
- [ ] Does your title concisely summarize the problem?
- [ ] Did you include a minimal, reproducible example?
- [ ] What OS are you using?
- [ ] What version of Dramatiq are you using?
- [ ] What did you do?
- [ ] What did you expect would happen?
- [ ] What happened?
What OS are you using?
Ubuntu 22.04
What version of Dramatiq are you using?
1.13.0
What did you do?
import ctypes
import dramatiq

@dramatiq.actor
def crash():
    ctypes.string_at(0)  # read from address 0: segfaults the worker process

crash.send()
What did you expect would happen?
Only the worker that crashed would restart. Ideally, the task itself would not be rescheduled, but that is probably not feasible?
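One possible workaround in the meantime, sketched under assumptions rather than taken from dramatiq: run the crash-prone call in a short-lived child process, so that a segfault kills only that child and reaches the task as an ordinary exception. `run_isolated` and `ChildCrashed` are hypothetical names, the `fork` start method is assumed (POSIX-only), and forking from a multi-threaded worker has its own caveats.

```python
import ctypes
import multiprocessing as mp


class ChildCrashed(RuntimeError):
    """Raised when the isolated child times out or exits abnormally."""


def crash():
    ctypes.string_at(0)  # read from address 0: segfaults the calling process


def run_isolated(func, *args, timeout=None):
    """Run func(*args) in a forked child; raise ChildCrashed if it dies."""
    ctx = mp.get_context("fork")  # assumption: POSIX with fork available
    child = ctx.Process(target=func, args=args)
    child.start()
    child.join(timeout)
    if child.is_alive():
        # The child outlived the timeout: kill it and report the failure.
        child.kill()
        child.join()
        raise ChildCrashed("child timed out and was killed")
    if child.exitcode != 0:
        # A negative exit code means the child was killed by a signal.
        raise ChildCrashed(f"child exited with code {child.exitcode}")


if __name__ == "__main__":
    try:
        run_isolated(crash)
    except ChildCrashed as exc:
        print(f"task failed without taking down the caller: {exc}")
```

Inside an actor, the body would call `run_isolated(...)` instead of the risky function directly. Whether the resulting exception leads to a retry would then depend on the actor's retry settings. The sketch does not pass return values back from the child.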
What happened?
All worker processes shut down. In my case, supervisor restarts dramatiq.