asynq
[QUESTION] Could multiple worker instances cause task duplication in the queue due to the recovery mechanism?
Hi @hibiken. I have a follow-up question regarding running multiple worker instances that serve the same queue. Now that we have a recovery mechanism, could several recovery processes pick up and re-enqueue the same lease-expired task multiple times (so the queue ends up with multiple entries for the same task ID)?
@WSUFan thanks for pointing this out.
I believe you're right, this needs to be fixed. Approaches I can think of:
- take a lock on these Redis keys (`asynq:{<qname>}:lease`) when running the `listLeaseExpired` command (a lock with a TTL, to avoid the unfortunate situation where the lock is never released; see the sketch after this list)
- within `listLeaseExpired`, we could move these expired task IDs to a "staging" area so that they are not in `asynq:{<qname>}:lease` anymore
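If approach 1 is chosen, one common pattern is a `SET NX` lock with a TTL plus a token-checked release. Here is a minimal sketch of that pattern, assuming go-redis v9 and a hypothetical lock key name; this is not asynq's actual implementation:

```go
package main

import (
	"context"
	"time"

	"github.com/redis/go-redis/v9"
)

// releaseScript deletes the lock only if it still holds our token, so a
// worker whose lock already expired cannot delete another worker's lock.
var releaseScript = redis.NewScript(`
if redis.call("GET", KEYS[1]) == ARGV[1] then
    return redis.call("DEL", KEYS[1])
end
return 0
`)

// withLeaseRecoveryLock runs fn only if this worker wins a short-lived lock.
// The lock key and TTL here are illustrative choices, not asynq's.
func withLeaseRecoveryLock(ctx context.Context, rdb *redis.Client, qname, token string, fn func() error) error {
	lockKey := "asynq:{" + qname + "}:lease:recovery_lock" // hypothetical key
	// SET NX with a TTL, so the lock is released even if this worker crashes.
	acquired, err := rdb.SetNX(ctx, lockKey, token, 30*time.Second).Result()
	if err != nil {
		return err
	}
	if !acquired {
		return nil // another worker is already recovering expired leases
	}
	// Best-effort release; the TTL covers the case where this fails.
	defer releaseScript.Run(ctx, rdb, []string{lockKey}, token)
	return fn() // e.g. call listLeaseExpired and retry the returned tasks here
}
```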
Any feedback is appreciated.
yeah, maybe a Redis lock is needed when operating on these keys. I would choose approach 1, haha
While there is a race condition, is there an actual risk of task duplication?
Take 2 workers W1 and W2 and an expired task T1, and consider this timeline of sequential events:
- W1 calls `ListLeaseExpired` and gets T1
- W2 calls `ListLeaseExpired` and gets T1
- W1 calls `retry`. It removes T1 from `asynq:{<qname>}:active` and `asynq:{<qname>}:lease` and moves it to `asynq:{<qname>}:retry`
- W2 calls `retry`, but it fails because calling `LREM` on `asynq:{<qname>}:active` returns 0 elements (see the sketch below)
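The reason W2's `retry` becomes a no-op is that the move is gated on the `LREM` count. A minimal sketch of that guard as a go-redis Lua script; the key layout and argument names are assumptions based on this thread, not asynq's actual script:

```go
package main

import "github.com/redis/go-redis/v9"

// retryIfStillActive moves a task to the retry set only if LREM actually
// removed it from the active list; if another worker already did the move,
// LREM returns 0 and the script is a no-op.
//
// Assumed layout (not necessarily asynq's exact schema):
//   KEYS[1] = asynq:{<qname>}:active (list of task IDs)
//   KEYS[2] = asynq:{<qname>}:lease  (sorted set of task IDs)
//   KEYS[3] = asynq:{<qname>}:retry  (sorted set of task IDs)
//   ARGV[1] = task ID
//   ARGV[2] = retry-at timestamp used as the score in the retry set
var retryIfStillActive = redis.NewScript(`
if redis.call("LREM", KEYS[1], 0, ARGV[1]) == 0 then
    return 0
end
redis.call("ZREM", KEYS[2], ARGV[1])
redis.call("ZADD", KEYS[3], ARGV[2], ARGV[1])
return 1
`)
```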
Now of course both W1 and W2 having the same expired tasks is not ideal, and there is a race condition where, in the last step above, W2 might `LREM` task T1 after it has been re-enqueued again.
So I think your proposal of a Redis lock around `listLeaseExpired` is a good solution, so that each worker deals with a disjoint set of expired tasks.
I believe another solution would be to have the client side manage this. We could execute the `listLeaseExpired()` function on the client side and retrieve only the tasks initiated by that client. In this scenario, there would be only one retry (one re-enqueue of a given task) from this client, assuming each task has a unique ID.
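A rough sketch of that idea, with every name here hypothetical (asynq does not expose such an API): each client keeps the set of task IDs it enqueued and only re-enqueues lease-expired tasks from that set.

```go
package main

import "context"

// clientRecoverer tracks the task IDs enqueued by this client so that only
// the originating client re-enqueues a given lease-expired task.
type clientRecoverer struct {
	ownedTaskIDs map[string]bool
}

// recoverExpired filters the lease-expired IDs down to the ones this client
// owns before retrying them. listLeaseExpired and retry stand in for the
// server-side operations discussed above.
func (c *clientRecoverer) recoverExpired(
	ctx context.Context,
	listLeaseExpired func(context.Context) ([]string, error),
	retry func(context.Context, string) error,
) error {
	ids, err := listLeaseExpired(ctx)
	if err != nil {
		return err
	}
	for _, id := range ids {
		if !c.ownedTaskIDs[id] {
			continue // another client's task; leave it for that client to recover
		}
		if err := retry(ctx, id); err != nil {
			return err
		}
	}
	return nil
}
```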