Lost job
Within the same queue:
1. Job with id "X" is enqueued
2. Job with id "X" is dequeued by worker №1
3. Job with id "X" is enqueued again
4. Worker №1's handler returns without error
5. Worker №1 flushes its ACK
6. The job enqueued at step 3 is lost
Test in #67 demonstrates the issue.
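The race can be simulated with a toy in-memory queue that deduplicates by job id. All names here are hypothetical stand-ins, not this library's actual code; the point is only that an ACK which deletes by id also wipes a job re-enqueued under that id while it was in flight:

```python
class ToyQueue:
    """Hypothetical dedup-by-id queue mirroring the behaviour above."""

    def __init__(self):
        self.pending = {}    # id -> payload, visible to workers
        self.in_flight = {}  # id -> payload, dequeued but not acked

    def enqueue(self, job_id, payload):
        # Enqueuing with an existing id overwrites the pending entry.
        self.pending[job_id] = payload

    def dequeue(self):
        job_id, payload = next(iter(self.pending.items()))
        del self.pending[job_id]
        self.in_flight[job_id] = payload
        return job_id, payload

    def ack(self, job_id):
        self.in_flight.pop(job_id, None)
        # ACK deletes by id, so it also removes a job that was
        # re-enqueued under this id while the worker was busy.
        self.pending.pop(job_id, None)

q = ToyQueue()
q.enqueue("X", 1)        # step 1
jid, _ = q.dequeue()     # step 2: worker №1 takes job X
q.enqueue("X", 2)        # step 3: fresh job X enqueued
q.ack(jid)               # steps 4-5: handler succeeds, worker ACKs
print(len(q.pending))    # step 6: prints 0 -- the step-3 job is gone
```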
What I'd expect here is that the ACK script should check whether job X has been updated in the meantime and, if so, make it visible again instead of deleting it.
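One way to sketch that ACK behaviour is a per-id version counter: enqueue bumps the version, dequeue records the version it saw, and ACK only deletes when the version is unchanged. This is a minimal simulation with hypothetical names, not the library's actual ACK script:

```python
class VersionedQueue:
    """Toy queue whose ACK detects updates made while a job was in flight."""

    def __init__(self):
        self.pending = {}    # id -> (version, payload)
        self.in_flight = {}  # id -> version seen at dequeue
        self.versions = {}   # id -> latest version

    def enqueue(self, job_id, payload):
        v = self.versions.get(job_id, 0) + 1
        self.versions[job_id] = v
        self.pending[job_id] = (v, payload)

    def dequeue(self):
        job_id, (v, payload) = next(iter(self.pending.items()))
        del self.pending[job_id]
        self.in_flight[job_id] = v
        return job_id, payload

    def ack(self, job_id):
        seen = self.in_flight.pop(job_id)
        if self.versions.get(job_id) != seen:
            # Job X was updated while in flight: leave the newer
            # pending entry visible instead of deleting it.
            return
        self.pending.pop(job_id, None)

q = VersionedQueue()
q.enqueue("X", 1)
jid, _ = q.dequeue()
q.enqueue("X", 2)        # update while job X is in flight
q.ack(jid)
print("X" in q.pending)  # prints True -- the update survives the ACK
```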
I am not sure why you would have a job with the same id at step 3. The semantics of reusing an id is to update the existing job.
For my use case, jobs are repeatedly enqueued with the same id, only the latest one is relevant, and it's critical that the latest enqueued job for a given id gets processed. I can work around the issue by not deduplicating at all (enqueuing with a UUID), but then we end up with no-op jobs, because we're only interested in the latest one for a given id.
Using a UUID is the better choice. I don't think you should use the job id like this, because jobs can be executed out of order after retrying. There is a job dedup middleware you should take a look at.
> The semantic of having the same id is to update the existing job.
- You can update an existing job with id X
- You can create a new job with id X if there is no such job in the queue (even if there was one before)
- You can update a job with id X while a job with the same id is dequeued, but the updated job will be silently discarded
That seems a bit inconsistent. At the very least it should return an error in the last case, so I know my update won't take effect. A more consistent outcome, though, would be to let the job run again after ACK, guaranteeing that the latest update is processed, just like in the first two points above. Then we could provide a middleware that discards an update for job X while it's in progress, which should be easier to implement than the other way around.
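That discard-while-in-progress middleware could be sketched as an enqueue wrapper over some queue that exposes its in-flight ids. Everything here is hypothetical (the class, the wrapper, the return value), just to show the shape of the idea:

```python
class ToyQueue:
    """Minimal hypothetical queue exposing its in-flight job ids."""

    def __init__(self):
        self.pending = {}
        self.in_flight = set()

    def enqueue(self, job_id, payload):
        self.pending[job_id] = payload

def discard_while_in_progress(q):
    # Hypothetical middleware: wraps enqueue so an update for a job
    # that is currently being processed is dropped, returning False
    # so the caller at least knows the update did not take effect.
    inner = q.enqueue
    def enqueue(job_id, payload):
        if job_id in q.in_flight:
            return False
        inner(job_id, payload)
        return True
    q.enqueue = enqueue
    return q

q = discard_while_in_progress(ToyQueue())
print(q.enqueue("X", 1))   # prints True -- accepted
q.in_flight.add("X")       # pretend a worker dequeued job X
print(q.enqueue("X", 2))   # prints False -- update discarded
```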
There is a FindJob method on the queue that can help you find out whether a job is still pending or being processed. If a job is found, you can return an error on enqueue. I don't think such expensive coordination should live at the core of this library. It can be done with job payload design or a custom middleware, and I want to keep the core small.
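The FindJob approach amounts to a check-before-enqueue guard. A minimal sketch, where `find_job` is a hypothetical Python stand-in for the FindJob lookup mentioned above (note this check-then-enqueue is not atomic, which is part of why the maintainer calls it expensive coordination):

```python
class GuardedQueue:
    """Toy queue that rejects an enqueue while the id is still live."""

    def __init__(self):
        self.pending = {}
        self.in_flight = set()

    def find_job(self, job_id):
        # Stand-in for FindJob: True while a job with this id is
        # pending or being processed.
        return job_id in self.pending or job_id in self.in_flight

    def enqueue(self, job_id, payload):
        if self.find_job(job_id):
            raise ValueError(f"job {job_id} is still pending or in flight")
        self.pending[job_id] = payload

q = GuardedQueue()
q.enqueue("X", 1)
try:
    q.enqueue("X", 2)  # second enqueue with the same id is rejected
except ValueError as e:
    print(e)           # prints: job X is still pending or in flight
```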
When a job is enqueued, it can be pending or being processed (including the subsequent ACK after it is done) by design. The user is no longer the owner of that job, so you should not update it without proper coordination. On top of that, when job X is acked, it is done and marked for deletion; updating a to-be-deleted job makes no sense. Instead of building complex coordination, the better way is simply to create a new job.