heroku-buildpack-python icon indicating copy to clipboard operation
heroku-buildpack-python copied to clipboard

Set REMAP_SIGTERM=SIGQUIT to prevent Celery task loss on dyno restarts

Open edmorley opened this issue 5 years ago • 1 comments

When a Heroku dyno is restarted (eg daily restart or deploy), all processes on the dyno are sent a SIGTERM signal prior to being shutdown, to allow them time to gracefully exit: https://devcenter.heroku.com/articles/dynos#graceful-shutdown-with-sigterm

However when using Celery, this SIGTERM signal results in Celery's worker processes being shutdown before the master process has had a chance to send unfinished tasks back to the broker (since SIGTERM is the signal used internally by the master process to instruct the workers to shutdown). This causes tasks to be lost.

As such, a new feature was added to Celery 4, which allows customisation of the signal used internally by Celery (via the environment variable REMAP_SIGTERM), so that it no longer clashes with the SIGTERM sent by Heroku.

See: https://github.com/celery/celery/issues/2839 https://devcenter.heroku.com/articles/celery-heroku#using-remap_sigterm

If users miss the mention in the devcenter article and don't set the environment variable, they will lose tasks upon deployments and daily dyno restarts. At my previous employer we regularly experienced lost tasks, which were eventually tracked down to this issue (at that point we were stuck on Celery 3 which didn't support REMAP_SIGTERM).

As such it seems like this buildpack should set REMAP_SIGTERM=SIGQUIT automatically for the user when using Celery (if it's not already set in their environment).

In fact, the environment variable doesn't appear to be used by anything other than Celery (and its internal library, billiard), so it may be safe to enable globally without a Celery conditional (similar to how FORWARDED_ALLOW_IPS is configured regardless of the presence of gunicorn).

Tracked internally at W-8133993.

edmorley avatar Apr 10 '19 09:04 edmorley