Architecture question: Multiple rqscheduler processes?

Open jmmills opened this issue 10 years ago • 6 comments

I was wondering whether rq-scheduler has any built-in ability to deal with multiple scheduler processes.

With RQ workers distributed across multiple hosts and a clustered Redis, one gets a fairly fault-tolerant system: if a whole machine goes down, jobs still get processed.

If, however, some of these jobs are scheduled jobs and the host that was running the scheduler goes down, scheduled jobs would not run until another scheduler was started (the jobs are at least recovered).

But what about running multiple schedulers? That would make for a highly available system.

jmmills avatar Feb 17 '15 19:02 jmmills

There is a similar issue open here: https://github.com/ui/rq-scheduler/pull/62

The way I solved this was by making my own rqscheduler script (I already need to subclass rq_scheduler.Scheduler for a few things) that tries to register itself in a loop, like this:

import time

def main():
    # FooScheduler is my subclassed rq_scheduler.Scheduler object
    sched = FooScheduler(connection=get_redis_client(), interval=SCHEDULER_INTERVAL_SECONDS)
    while True:
        try:
            # Blocks for as long as this process is the active scheduler.
            sched.run()
            break
        except ValueError as exc:
            # str(exc) works on both Python 2 and 3; exc.message is Python 2 only.
            if str(exc) == "There's already an active RQ scheduler":
                sched.log.debug(
                    "An RQ scheduler instance is already running. Retrying in %d seconds.",
                    SCHEDULER_INTERVAL_SECONDS,
                )
                time.sleep(SCHEDULER_INTERVAL_SECONDS)
            else:
                raise

if __name__ == "__main__":
    main()

Then I run my rqscheduler script instead of the built-in one. This way I can do rolling restarts of my rq-scheduler processes: the new one automatically waits until the old one dies.

It would be nice to see this behavior built into rq-scheduler, as I explained in this comment: https://github.com/ui/rq-scheduler/pull/62#issuecomment-64453316

lost-theory avatar Feb 17 '15 20:02 lost-theory

Yes, let's build this into rq-scheduler. See my comment here: https://github.com/ui/rq-scheduler/pull/62#issuecomment-74796551

selwin avatar Feb 20 '15 01:02 selwin

@darkpixel has an interesting suggestion: allow multiple schedulers to run, but require each scheduler to acquire a lock when scheduling jobs. I think this is a good solution for people who want to run multiple scheduler processes for reliability purposes.

If someone can make a pull request for this, I would be happy to accept it :)

selwin avatar Aug 19 '15 23:08 selwin

If a Redis lock is used, how do we ensure that a crashed scheduler doesn't leave a stale lock behind? Maybe a simple keep-alive via Redis pub/sub?

jmmills avatar Aug 20 '15 04:08 jmmills

We can set an expiry on the lock key, e.g. "redis.expire(key, 30)", so that if the scheduler crashes, the lock will still be expired by Redis :)
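
For illustration, here is a minimal sketch of what a lock with an expiry might look like using redis-py. This is not rq-scheduler's implementation; the key name `rq:scheduler:lock`, the helper names, and the `enqueue_due_jobs` callback are all hypothetical. It assumes `client` is a `redis.Redis` instance and uses `SET` with `nx=True, ex=...` so that creating the lock and setting its expiry happen in one command.

```python
import uuid

LOCK_KEY = "rq:scheduler:lock"  # hypothetical key name, not taken from rq-scheduler
LOCK_TTL_SECONDS = 30           # the 30-second expiry suggested in the thread


def try_acquire_lock(client, token, ttl=LOCK_TTL_SECONDS):
    """Return True if this process now holds the scheduling lock."""
    # SET key value NX EX ttl: create the key only if it is absent and
    # attach the expiry in the same atomic command.
    return bool(client.set(LOCK_KEY, token, nx=True, ex=ttl))


def release_lock(client, token):
    """Delete the lock only if this process still owns it."""
    held = client.get(LOCK_KEY)
    if isinstance(held, bytes):  # redis-py returns bytes unless decode_responses=True
        held = held.decode()
    if held != token:
        return False
    # Note: GET followed by DELETE is not atomic; a Lua script would close that gap.
    client.delete(LOCK_KEY)
    return True


def run_scheduling_pass(client, enqueue_due_jobs):
    """Run one scheduling pass if the lock is free; return True if it ran."""
    token = uuid.uuid4().hex  # unique per attempt, identifies the lock owner
    if not try_acquire_lock(client, token):
        return False  # another scheduler is currently scheduling
    try:
        enqueue_due_jobs()
    finally:
        release_lock(client, token)
    return True
```

If the process dies between acquiring and releasing, nothing deletes the key, but Redis removes it once the TTL elapses, so another scheduler can take over after at most `LOCK_TTL_SECONDS`.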

selwin avatar Aug 20 '15 04:08 selwin

Ah, that works. A deadman switch.
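
A dead man's switch of this kind only works if the live holder keeps pushing the expiry forward. The sketch below shows one way that renewal might look with redis-py; the function names, the `should_stop` callback, and the renew-every-third-of-the-TTL interval are assumptions for illustration, not anything rq-scheduler defines.

```python
import time


def renew_lock(client, key, token, ttl):
    """Extend the lock's expiry if this process still owns it."""
    held = client.get(key)
    if isinstance(held, bytes):  # redis-py returns bytes unless decode_responses=True
        held = held.decode()
    if held != token:
        return False  # the lock expired or another scheduler took it over
    # EXPIRE resets the countdown; GET then EXPIRE is not atomic, so a
    # production version would likely do this check in a Lua script.
    return bool(client.expire(key, ttl))


def keep_alive(client, key, token, ttl, should_stop, sleep=time.sleep):
    """Renew the lock until should_stop() is true; return False if it was lost."""
    while not should_stop():
        if not renew_lock(client, key, token, ttl):
            return False
        # Renew well before the TTL runs out so one slow iteration is tolerated.
        sleep(ttl / 3.0)
    return True
```

When the holder crashes, the renewals stop and Redis drops the key after `ttl` seconds, which is what lets a standby scheduler acquire it.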

jmmills avatar Aug 20 '15 04:08 jmmills