kue-scheduler

deadlock unique every job

Open jvdgrift opened this issue 7 years ago • 2 comments

I'm currently facing a deadlock issue in our production environment: a single unique job is scheduled to start every day at 03:30 AM. The schedule is triggered, but all 6 of our workers report a lock error (deadlock) and the job is not run on any of the node servers.

03:30:01.143 Lock error: LockError: Exceeded 0 attempts to lock the resource
03:30:01.132 Lock error: LockError: Exceeded 0 attempts to lock the resource
03:30:01.132 Lock error: LockError: Exceeded 0 attempts to lock the resource
03:30:01.130 Lock error: LockError: Exceeded 0 attempts to lock the resource
03:30:01.089 Lock error: LockError: Exceeded 0 attempts to lock the resource
03:30:01.087 Lock error: LockError: Exceeded 0 attempts to lock the resource

I also switched to the kue-scheduler master (with the retry count of 3) but got the same result.

We use ioredis and Redis with sentinels in production.

Running the code on localhost with 6 node instances and 1 Redis (no sentinels) starts 1 unique job (no deadlock).

Any ideas what could be wrong?

The job was once non-unique and we needed that fixed, but now we have this deadlock issue. Could it be that the information stored in Redis is mixed up? What needs cleaning up?

Side note: when I tested on localhost with 6 node processes and the job took very little time (less than the time needed to acquire the lock; it just logged a statement and returned), the unique job was executed 4 times instead of once. I suppose the lock was released while the other workers were still trying to acquire it, so they also obtained the lock and executed the job.
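The effect described in the side note can be illustrated with a small, self-contained simulation. This is only a sketch under assumptions: `countRuns` and its parameters are hypothetical names, the lock is modelled as a simple "held until" timestamp, and none of this is kue-scheduler's or the lock library's actual code.

```javascript
// Hypothetical model: each worker makes one lock attempt at a given time (ms).
// A worker that finds the lock free acquires it, runs the job, and releases
// the lock as soon as the job finishes. A worker that finds it held gives up.
function countRuns(attemptTimes, jobDuration) {
  let lockedUntil = -Infinity; // time at which the current holder releases
  let runs = 0;
  const ordered = [...attemptTimes].sort((a, b) => a - b);
  for (const t of ordered) {
    if (t >= lockedUntil) {
      // Lock is free at time t: this worker acquires it and runs the job.
      runs += 1;
      lockedUntil = t + jobDuration;
    }
    // Otherwise the lock is still held and this worker skips the job.
  }
  return runs;
}
```

With attempts at `[0, 1, 5, 6, 10, 11]` and a job lasting 2 ms, `countRuns` reports 3 runs, because the lock is already free again when the later workers try. With a job lasting 100 ms it reports 1 run, since every later attempt finds the lock held.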

jvdgrift, Nov 23 '16 15:11

OK, the unique value of the job in question was pointing to a non-existent job. Removing the unique entry solved it. I think we removed the completed job in kue-ui, and that probably doesn't clean up the unique entry kept by kue-scheduler. Perhaps the code could check for this and, if the unique entry points to a non-existent job, just create a new job?
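A minimal sketch of the suggested check, using in-memory maps as stand-ins for the data kept in Redis. All names here (`uniqueIndex`, `jobs`, `createJob`, `ensureUniqueJob`) are hypothetical and are not part of the kue or kue-scheduler API; the real storage layout may differ.

```javascript
// Hypothetical stand-ins for the stored data (assumption, not the real layout).
const uniqueIndex = new Map(); // unique name -> job id
const jobs = new Map();        // job id -> job record
let nextId = 1;

// Create a job and record it under its unique name.
function createJob(uniqueName, data) {
  const job = { id: nextId++, data };
  jobs.set(job.id, job);
  uniqueIndex.set(uniqueName, job.id);
  return job;
}

// Return the existing job for a unique name, or create a new one when the
// unique entry is missing or stale (pointing to a job that no longer exists).
function ensureUniqueJob(uniqueName, data) {
  const existingId = uniqueIndex.get(uniqueName);
  if (existingId !== undefined && jobs.has(existingId)) {
    return jobs.get(existingId); // entry is valid: reuse the job
  }
  uniqueIndex.delete(uniqueName); // drop the stale entry, if any
  return createJob(uniqueName, data);
}
```

In this model, deleting a job from `jobs` without touching `uniqueIndex` mimics removing the completed job through a UI; the next call to `ensureUniqueJob` then creates a fresh job instead of failing on the dangling reference.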

jvdgrift, Nov 24 '16 12:11

@jvdgrift I would appreciate a pull request that adds this check.

lykmapipo, Nov 24 '16 12:11