django-tasks
Support autoreload of `db_worker`
I was testing this package and created a Django admin action to trigger the enqueue of a task. After getting everything working, I changed the code to modify the task result. However, after queuing the task, I still got the old result.
My guess is that when running `db_worker`, you need to restart the command after changing the code. Is this expected behavior? I was surprised because I'm used to Django's `runserver` restarting on code changes. Could it be a good idea to document this in the README?
You would have to rewrite the `db_worker` command to use Django's autoreload. It's not default Python behavior.
This is indeed expected behaviour, although not "intended" necessarily. Only Django's runserver command reloads when the code does. Other long-running management commands don't.
I agree it'd be a nice feature to have, but the behaviour could be quite complex. For example, what happens if a task is running?
> You would have to rewrite the `db_worker` command to use Django's autoreload. It's not default Python behavior.
I mainly use Python with Django, which led to my proposal for adding this to the documentation. Given that autoreload is not default Python behavior, I agree that my suggestion doesn't make sense.
> This is indeed expected behaviour, although not "intended" necessarily. Only Django's `runserver` command reloads when the code does. Other long-running management commands don't.
Thinking about this: Django's `runserver` is meant to be used only in development, but `db_worker` is for both development and production environments, isn't it? Maybe autoreload doesn't make much sense for production.
> I agree it'd be a nice feature to have, but the behaviour could be quite complex. For example, what happens if a task is running?
I guess these problems are similar to the ones stated in the issue https://github.com/RealOrangeOne/django-tasks/issues/46.
A feature such as this needs to have a big caveat: only to be used in development.
The resulting overhead would be large in a production environment (i.e. do you really need to install another dependency such as pywatchman in prod?), and autoreload functionality would not provide much value there, imo.
> For example, what happens if a task is running?
If we assume this is only used for local development, we can consider the stakes low. In that case I would suggest a pessimistic approach: the task is terminated along with the worker, and a noack of the task results in the task being reprocessed when the worker spins up again.
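That pessimistic approach can be sketched in plain Python, independent of how django-tasks actually stores results (the queue and "crash" simulation here are purely illustrative): a task is only acked (removed from the queue) after it finishes, so a worker killed mid-task leaves the task queued for reprocessing on restart.

```python
from collections import deque

queue = deque(["task-a", "task-b"])

def run_once(queue, crash_on=None):
    """Pessimistic worker: a task is only acked (removed) after it finishes.

    If the worker dies mid-task (simulated via crash_on), the task stays at
    the head of the queue and is reprocessed when the worker restarts.
    """
    processed = []
    while queue:
        task = queue[0]          # peek, don't pop: no ack yet
        if task == crash_on:
            return processed     # simulate the worker being killed mid-task
        processed.append(task)
        queue.popleft()          # ack only after successful completion
    return processed

print(run_once(queue, crash_on="task-b"))  # ['task-a']; 'task-b' stays queued
print(run_once(queue))                     # ['task-b'] reprocessed on restart
```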
Hooking into Django's existing reload system used for runserver and restarting db_worker ought not to be too complex, so long as it's handled properly.
I agree making this development only (or at least DEBUG only) does solve a lot of problems.
Maybe a command-line option is enough to fix this issue?
`python manage.py db_worker --autoreload` - the opposite of runserver's `--noreload`.
For development, I don't see a reason why this needs to be opt-in functionality. Might as well make it the default.
Are you suggesting that when `DEBUG = True`, autoreload is activated, and the only way to deactivate it in local/dev is to pass a `--noreload` flag?
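The flag semantics being discussed could look like this (a sketch with plain `argparse`; the flag name is borrowed from `runserver`'s convention, and nothing here exists in `db_worker` today):

```python
import argparse

DEBUG = True  # stand-in for settings.DEBUG

parser = argparse.ArgumentParser(prog="db_worker")
# Reload by default in development; opt out with --noreload,
# mirroring runserver's flag.
parser.add_argument(
    "--noreload",
    action="store_false",
    dest="use_reloader",
    default=DEBUG,
    help="Disable autoreload even when DEBUG is True.",
)

print(parser.parse_args([]).use_reloader)              # True by default
print(parser.parse_args(["--noreload"]).use_reloader)  # False
```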
It'd need some thought, as to how currently-running tasks are handled. I suspect the solution is to reload after the current task (so it's a graceful reload).
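A graceful reload could mean only honouring the reload between tasks, never mid-task. A minimal sketch of that loop in plain Python (the `threading.Event` stands in for whatever signal the file watcher would actually raise; the task functions are illustrative only):

```python
import threading

reload_requested = threading.Event()  # would be set by the file watcher

def run_tasks(tasks):
    """Process tasks one at a time; only honour a reload between tasks."""
    completed = []
    for task in tasks:
        task()                        # a running task is never interrupted
        completed.append(task.__name__)
        if reload_requested.is_set():
            break                     # graceful: stop so the worker can restart
    return completed

def t1(): pass
def t2(): reload_requested.set()      # simulate a code change mid-run
def t3(): pass

print(run_tasks([t1, t2, t3]))        # ['t1', 't2']: t2 finishes, t3 waits
```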
Here's an example command to make this work while there's no implementation:
```python
import shlex
import subprocess

from django.core.management.base import BaseCommand
from django.utils import autoreload


def restart_worker():
    db_worker_cmd = "python manage.py db_worker"

    # Kill any worker left over from the previous reload cycle,
    # then start a fresh one.
    cmd = f'pkill -f "{db_worker_cmd}"'
    subprocess.call(shlex.split(cmd))
    subprocess.call(shlex.split(db_worker_cmd))


class Command(BaseCommand):
    def handle(self, *args, **options):
        print("Starting db worker with autoreload...")
        autoreload.run_with_reloader(restart_worker)
```
I am running this command to start `runserver` (the dev server) and `db_worker` in parallel:
```python
import os
import signal

from django.core.management.commands.runserver import Command as RunserverCommand
from django.db import connections

from django_tasks.backends.database.management.commands.db_worker import Worker


class Command(RunserverCommand):
    help = "Run Django's dev server and the db_worker in parallel with autoreload."

    def handle(self, *args, **options):
        pid = os.fork()
        print("PID: ", pid)
        if pid == 0:
            # Child process: drop the DB connections inherited from the
            # parent, then run the worker.
            connections.close_all()
            self._run_worker()
        else:
            # Parent process: run the dev server, and stop the worker
            # when the server exits.
            try:
                super().handle(*args, **options)
            finally:
                os.kill(pid, signal.SIGTERM)
                os.waitpid(pid, 0)

    def _run_worker(self):
        try:
            print("WORKER STARTING")
            worker = Worker(
                queue_names=["default"],
                interval=1,
                batch=False,
                backend_name="default",
                startup_delay=True,
            )
            worker.start()
        finally:
            print("WORKER FINISHED")
```
This command also gives me free autoreload for both the dev server and `db_worker`.