Inconsistencies in job's state

Open finalclass opened this issue 7 years ago • 2 comments

kue version: 0.11.6

I'm experiencing a weird phenomenon with some of our jobs. I have jobs in the {q}:jobs:active ZSET that have their state set to failed. I've tried to figure out how this is possible, but I couldn't. My first suspicion was that the process was restarted externally in the middle of the job.state() function, but MULTI is used there, so that shouldn't cause any inconsistencies.
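
To make the inconsistency concrete, here is a minimal sketch of the check I have in mind. It assumes the members of a state ZSET and each job's stored state have already been read from Redis; `findInconsistentJobs` and the sample data are hypothetical and not part of kue.

```javascript
// Hypothetical helper, not part of kue. Inputs are assumed to have been
// read from Redis beforehand: the ids found in one state ZSET and a map
// from job id to the state stored on the job itself.
function findInconsistentJobs(zsetName, memberIds, stateById) {
  // A job is inconsistent when it sits in the ZSET for one state
  // while its own stored state says something else.
  return memberIds.filter((id) => stateById.get(id) !== zsetName);
}

// Made-up sample data: job '2' is in the active set but marked failed.
const activeMembers = ['1', '2', '3'];
const stateById = new Map([
  ['1', 'active'],
  ['2', 'failed'],
  ['3', 'active'],
]);

const inconsistent = findInconsistentJobs('active', activeMembers, stateById);
console.log(inconsistent);
```

The same check applies to any other state set by passing a different `zsetName`.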

There is the queue.checkActiveJobTtl() mechanism, which runs every second. In our case, some events leave us with a lot of these inconsistent jobs, and they get processed every second, which puts an unnecessary load on our servers.

The simplest solution would be to add:

job._state = 'active';

here: https://github.com/Automattic/kue/blob/master/lib/kue.js#L245. However, on one server I've noticed the same kind of inconsistency with jobs in the "inactive" box (these are in the inactive ZSET but their state is set to "failed").

finalclass, Oct 12 '18 15:10

I finally know what the problem is.

It's the refreshTtl function that puts these jobs back into the active list: https://github.com/Automattic/kue/blob/master/lib/queue/job.js#L346

This refreshTtl function is called when progress is set. The thing is that we don't always wait for the progress callback to be called.

So from time to time, a job finishes, but the progress update (and thus refreshTtl) runs later and adds the job back to the active ZSET.
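
The ordering problem can be reproduced with a small in-memory model. This is only a sketch under my own assumptions: the sets, `setState`, and both `refreshTtl*` functions are hypothetical stand-ins and do not reproduce kue's actual code.

```javascript
// Hypothetical in-memory model of the race; none of this is kue's code.
const zsets = { active: new Set(), failed: new Set() };
const jobState = new Map();

// Models an atomic state change (like a MULTI block): the stored state
// and the set membership are updated together.
function setState(id, next) {
  const prev = jobState.get(id);
  if (prev) zsets[prev].delete(id);
  jobState.set(id, next);
  zsets[next].add(id);
}

// Models a TTL refresh that re-adds the id to the active set without
// looking at the state currently stored for the job.
function refreshTtlUnguarded(id) {
  zsets.active.add(id);
}

// Guarded variant: only refresh while the stored state is still active.
function refreshTtlGuarded(id) {
  if (jobState.get(id) === 'active') zsets.active.add(id);
}

// Replays the sequence described above: the job finishes first and the
// late progress update triggers the refresh afterwards.
function run(refresh) {
  zsets.active.clear();
  zsets.failed.clear();
  jobState.clear();
  setState('42', 'active');
  setState('42', 'failed');
  refresh('42');
  return { state: jobState.get('42'), inActive: zsets.active.has('42') };
}

const unguarded = run(refreshTtlUnguarded);
const guarded = run(refreshTtlGuarded);
console.log(unguarded); // state is 'failed', yet the job is in the active set
console.log(guarded);   // state is 'failed' and the job is not in the active set
```

In this model the guard is enough because everything runs in one thread. Against a real Redis instance, the state check and the ZSET write would presumably have to happen atomically, otherwise the same race remains.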

finalclass, Oct 15 '18 16:10

Unfortunately, refreshTtl and Job.prototype.progress do not accept callbacks, so it's impossible to fix this on our side.

finalclass, Oct 16 '18 09:10