hasjob
Evaluate move from RQ to Celery
Redis Queue has served us well as a background job framework, but we keep feeling like we need something more:
- Time delayed jobs (run this job after ten minutes)
- Aggregate jobs (run this job, but if we already have something like this queued, just give it one more parameter)
- Multi-lingual jobs (this is better done in JS, so instead of running JS inside a Python wrapper, why not have a native JS worker?)
- Multi-machine jobs (can we offload this task to another machine with lower load?)
- Brokers other than Redis (can Redis do a cluster? Are we stuck with a single broker machine?)
- Better failure reporting (like the email/SMS log mechanism we use in coaster.logging)
While RQ has been fantastic so far, we can't help but wonder:
- Should we move to Celery so that we have a more robust and future-proof framework, or
- Are we over-engineering this and is RQ good enough for the foreseeable future?
This issue is for the discussion on which way to go.
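To make the "aggregate jobs" item concrete, here is a minimal in-memory sketch of the behaviour we have in mind. It uses only the standard library and is not an RQ or Celery API; `AggregatingQueue` and its method names are hypothetical, chosen for illustration.

```python
from collections import OrderedDict


class AggregatingQueue:
    """Hypothetical queue that merges jobs sharing the same key."""

    def __init__(self):
        # Maps job key -> parameters collected so far, oldest key first.
        self._pending = OrderedDict()

    def enqueue(self, key, param):
        # If a job with this key is already waiting, give it one more
        # parameter instead of queuing a second job.
        self._pending.setdefault(key, []).append(param)

    def dequeue(self):
        # Return the oldest pending job with all of its parameters,
        # or None when nothing is waiting.
        if not self._pending:
            return None
        return self._pending.popitem(last=False)


queue = AggregatingQueue()
queue.enqueue("send_digest", "alice@example.com")
queue.enqueue("reindex", 42)
queue.enqueue("send_digest", "bob@example.com")  # merged into the first job
```

A real implementation would need the merge to be atomic in the broker, which is the part neither framework appears to give us out of the box.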
RQ does not work on Windows as it requires fork(). This does not concern us as we do not support Windows for either development or deployment.
RQ was inspired by Celery but with the explicit goal of being simpler, and has over time acquired good ideas from Celery.
This Stack Overflow thread on Celery vs RQ suggests Celery has a bit of a learning curve but is straightforward after that.
A collection of notes on task queue frameworks in Python, including web services. Does not pass judgement but gives you material to read.
Celery supports scheduled jobs and can be used as a cron replacement. This video documents how.
This Quora thread says RQ has a cleaner approach to (a) handling failed tasks and (b) setting task priority (by using multiple queues).
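The multiple-queue approach to priority can be modelled without RQ: a worker is given its queues in priority order and always takes the next job from the first non-empty one. The sketch below is a standard-library illustration of that idea, not RQ's implementation; the queue names and the `next_job` helper are assumptions made for the example.

```python
from collections import deque


def next_job(queues):
    """Return (queue_name, job) from the first non-empty queue, or None.

    `queues` is a list of (name, deque) pairs, highest priority first.
    """
    for name, jobs in queues:
        if jobs:
            return name, jobs.popleft()
    return None


high, default, low = deque(), deque(), deque()
queues = [("high", high), ("default", default), ("low", low)]

low.append("cleanup_temp_files")
high.append("send_password_reset")
default.append("update_search_index")
```

Calling `next_job(queues)` repeatedly drains `high` before touching `default` or `low`, regardless of the order in which jobs were added.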
These people moved from Celery to RQ because understanding how it worked was critical to understanding how their code worked overall. This one is the biggest fear holding us back from moving to Celery.
Resolved: we're staying with RQ, but considering other RPC/async mechanisms for the task RQ doesn't do yet: collecting results from multiple priority background jobs.
WRT 1 (time delayed jobs), you can use rq-scheduler for scheduled, periodic, or repeated tasks at the cost of running a scheduler process.
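For anyone wondering why a separate scheduler process is needed: something has to hold delayed jobs until they are due and then hand them to the normal queue. A rough standard-library sketch of that loop follows. It is not rq-scheduler's API; the `DelayedJobs` class and its methods are hypothetical, and times are plain numbers (seconds) to keep the example deterministic.

```python
import heapq
import itertools


class DelayedJobs:
    """Hypothetical holding area for jobs that should run later."""

    def __init__(self):
        self._heap = []
        # Tie-breaker so jobs with equal run times come out in insertion order.
        self._counter = itertools.count()

    def schedule(self, now, delay, job):
        # Store the job keyed by the time at which it becomes due.
        heapq.heappush(self._heap, (now + delay, next(self._counter), job))

    def pop_due(self, now):
        # Return every job whose run time has passed, earliest first.
        due = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap)[2])
        return due


delayed = DelayedJobs()
delayed.schedule(now=0, delay=600, job="send_reminder")  # ten minutes
delayed.schedule(now=0, delay=60, job="refresh_cache")   # one minute
```

The scheduler process would call `pop_due` on a timer and enqueue whatever it returns onto a regular queue for the workers.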
Fantastic! Giving this a spin.