
Lost job

josephbuchma opened this issue on Nov 12 '21 · 6 comments

Within the same queue:

  1. Job with id "X" is enqueued
  2. Job with id "X" is dequeued by worker №1
  3. Job with id "X" is enqueued
  4. Handler of worker №1 returns without error
  5. Worker №1 flushes ACK
  6. Job enqueued at step 3 is lost

The test in #67 demonstrates the issue.
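
Expressed as a runnable toy model (this is not the library's implementation; the map-based queue and the `enqueue`/`dequeue`/`ack` names below are purely illustrative), the interleaving looks like this:

```go
// Toy model of a dedup-by-id queue, illustrating the race described above.
// NOT the library's code; data structures and names are illustrative only.
package main

import "fmt"

type queue struct {
	jobs map[string]string // job id -> payload
}

func (q *queue) enqueue(id, payload string) { q.jobs[id] = payload }

func (q *queue) dequeue(id string) string { return q.jobs[id] }

// ack deletes the job by id, regardless of whether it was re-enqueued
// (updated) after the worker dequeued it.
func (q *queue) ack(id string) { delete(q.jobs, id) }

func main() {
	q := &queue{jobs: map[string]string{}}

	q.enqueue("X", "v1")         // 1. job X is enqueued
	p := q.dequeue("X")          // 2. worker 1 dequeues X ("v1")
	q.enqueue("X", "v2")         // 3. X is enqueued again with a new payload
	fmt.Println("processed:", p) // 4. handler returns without error
	q.ack("X")                   // 5. worker 1 flushes ACK: X is deleted
	// 6. the job enqueued at step 3 ("v2") is lost:
	fmt.Println("remaining jobs:", q.jobs) // map[]
}
```

Running it prints an empty map at the end: the payload enqueued at step 3 is gone without ever being processed.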

josephbuchma · Nov 12 '21

What I'd expect here is that the ACK script should check whether job X has been updated in the meantime, and if so, make it "visible" again instead of deleting it.
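
A minimal sketch of that idea, again as a toy model rather than the real ACK script: track a per-job version, and have ACK delete the job only if it still carries the version the worker dequeued; otherwise leave the updated job in place.

```go
// Hypothetical "check before delete" ACK, using the same toy model as above.
// Not the library's ACK script; names and structures are illustrative only.
package main

import "fmt"

type job struct {
	payload string
	version int
}

type queue struct{ jobs map[string]*job }

// enqueue creates the job, or bumps its version if the id already exists.
func (q *queue) enqueue(id, payload string) {
	if j, ok := q.jobs[id]; ok {
		j.payload, j.version = payload, j.version+1
		return
	}
	q.jobs[id] = &job{payload: payload}
}

// dequeue returns the payload and the version the worker saw.
func (q *queue) dequeue(id string) (string, int) {
	j := q.jobs[id]
	return j.payload, j.version
}

// ack deletes the job only if it still has the version the worker dequeued;
// otherwise the updated job stays in the queue ("visible" again).
func (q *queue) ack(id string, seen int) {
	if j, ok := q.jobs[id]; ok && j.version == seen {
		delete(q.jobs, id)
	}
}

func main() {
	q := &queue{jobs: map[string]*job{}}
	q.enqueue("X", "v1")
	p, v := q.dequeue("X")
	q.enqueue("X", "v2") // updated while in flight
	fmt.Println("processed:", p)
	q.ack("X", v)                                  // no-op: version changed
	fmt.Println("remaining:", q.jobs["X"].payload) // "v2" survives
}
```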

josephbuchma · Nov 12 '21

I am not sure why you would have a job with the same id in step 3. The semantics of having the same id is to update the existing job.

taylorchu · Nov 12 '21

For my use case, we repeatedly enqueue jobs with the same id, but only the latest one is relevant, and it's critical that the latest enqueued job for a given id gets processed. In my case the issue can be worked around by not using deduplication at all (enqueuing with a uuid), but then we end up with no-op jobs, because we only care about the latest one for a given id.

josephbuchma · Nov 12 '21

Using a uuid is a better choice. I don't think you should use the job id like this, because jobs can be executed out of order after retrying. There is a job dedup middleware that you should take a look at.

taylorchu · Nov 12 '21

"The semantics of having the same id is to update the existing job."

  • You can update an existing job with id X
  • You can create a new job with id X if there is no such job in the queue (even if there was one before)
  • You can update a job with id X while a job with the same id is currently dequeued, but the updated job will be silently discarded

That seems a bit inconsistent. It should at least return an error in the last case, so I know that my update won't take effect. But I feel a more consistent outcome would be to let the job run again after the ACK, guaranteeing that the latest update is processed, just like in the first two points above. Then we could provide a middleware that discards an update to job X while it is in progress, which should be easier to implement than the other way around.
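
Roughly the kind of enqueue-side guard I mean (a hypothetical sketch; I'm not assuming anything about the actual middleware interface): it notices when a job with the same id is in flight and either drops the update, or, as sketched here, returns an error so the caller at least knows.

```go
// Hypothetical enqueue guard: refuse (or drop) an update while a job with
// the same id is being processed, instead of letting a later ACK silently
// discard it. Names and interfaces are illustrative only.
package main

import (
	"errors"
	"fmt"
	"sync"
)

var ErrInProgress = errors.New("job with this id is currently being processed")

type guardedQueue struct {
	mu         sync.Mutex
	inProgress map[string]bool                // ids currently held by workers
	enqueue    func(id, payload string) error // underlying enqueue
}

// Enqueue rejects updates for ids that are currently in flight. A variant
// could silently drop the update and return nil instead.
func (g *guardedQueue) Enqueue(id, payload string) error {
	g.mu.Lock()
	defer g.mu.Unlock()
	if g.inProgress[id] {
		return ErrInProgress
	}
	return g.enqueue(id, payload)
}

func main() {
	g := &guardedQueue{
		inProgress: map[string]bool{"X": true},
		enqueue: func(id, payload string) error {
			fmt.Println("enqueued", id, payload)
			return nil
		},
	}
	fmt.Println(g.Enqueue("X", "v2")) // rejected: X is in flight
	fmt.Println(g.Enqueue("Y", "v1")) // enqueued normally
}
```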

josephbuchma · Nov 13 '21

There is a FindJob method on the queue that can help you check whether a job is still pending or being processed. If a job is found, you can return an error on enqueue. I don't think such expensive coordination should live in the core of this library; it can be done with job payload design or a custom middleware, and I want to keep the core small.

When a job is enqueued, by design it can be pending or being processed (including the subsequent ACK after it is done). The owner of that job is no longer the user, so you should not update it without proper coordination. On top of that, when job X is acked, it is done and marked for deletion; updating a to-be-deleted job makes no sense. Instead of building complex coordination, the better way is to just create a new job.
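
For example, the payload-design route could look roughly like this (a hypothetical, library-agnostic sketch; none of these names are part of this library): enqueue each job under a fresh uuid, carry the logical id and a sequence number in the payload, and let the handler skip anything older than the newest sequence it has seen for that logical id.

```go
// Hypothetical payload-design sketch: unique job ids, with staleness decided
// by the handler rather than by the queue. Illustrative only.
package main

import (
	"fmt"
	"sync"
)

type payload struct {
	LogicalID string // the id the application cares about
	Seq       int64  // monotonically increasing per logical id
	Data      string
}

// latest tracks the highest sequence processed per logical id.
// In production this would live in Redis or similar shared storage,
// not in process memory.
var (
	mu     sync.Mutex
	latest = map[string]int64{}
)

// handle processes a job only if it is not older than the newest one seen
// for the same logical id; stale jobs become cheap no-ops.
func handle(p payload) error {
	mu.Lock()
	defer mu.Unlock()
	if p.Seq < latest[p.LogicalID] {
		return nil // stale update, skip
	}
	latest[p.LogicalID] = p.Seq
	fmt.Println("processing", p.LogicalID, "seq", p.Seq, p.Data)
	return nil
}

func main() {
	// Jobs are enqueued with unique ids, so none of them can overwrite or
	// delete each other; only the handler decides what is stale.
	handle(payload{LogicalID: "X", Seq: 2, Data: "v2"})
	handle(payload{LogicalID: "X", Seq: 1, Data: "v1"}) // skipped as stale
}
```

The design choice here is that the queue never deduplicates; staleness is an application-level concern, which keeps the coordination out of the library core.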

taylorchu · Nov 13 '21