Missing lock for job failed errors on repeatable jobs

Open loris opened this issue 4 years ago • 12 comments

Hello, once in a while (here it's 13 times in 7 days) we see the error "Missing lock for job repeat:...:... failed". FYI, the following repeat jobs are from different queues and have different handlers, but they are all very lightweight in terms of workload (running a fast mongo query every minute to check for any work to enqueue in bull). However, the errors tend to happen more often when our servers are doing lots of other work (i.e., handling millions of other bull jobs). Not sure if we should tweak lockDuration or some other property here.

image

Also note that I checked the other issues regarding "missing lock" errors, and this one looks different (it does not happen when exiting the process, and it only concerns repeat jobs).

Any idea?

loris avatar Jul 09 '21 09:07 loris

@loris I'm facing a similar issue; my queue is very simple, with only 1 worker. Did your job get executed by the worker, with the error appearing afterwards, as in my case?

https://github.com/taskforcesh/bullmq/issues/596#issuecomment-873129004

allandiego avatar Jul 11 '21 10:07 allandiego

I had the same issue and realized it was caused by the keyPrefix option in the Redis connection. Just remove it.

amirAlamian avatar Aug 03 '21 13:08 amirAlamian

I don't use keyPrefix (only the default value of the queue's prefix option, which is "bull"). Moreover, the errors in my case only happen occasionally: they are thrown for cron jobs that run every minute or so, but only every few days. However, we have been using bull for a very long time (probably since bull v2), and these cron jobs might have been created with values that are now incompatible. Any hint, @manast?

loris avatar Aug 04 '21 08:08 loris

Hi, I can confirm that removing keyPrefix from the IORedis connection works, but I have several applications pointing at the same Redis cluster and I need to specify the keyPrefix.

webfrank avatar Aug 06 '21 13:08 webfrank

@webfrank you do not need to use the Redis prefix; just pick a different prefix in BullMQ (the default is "bull").
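As a rough sketch of this suggestion, assuming two applications (hypothetically named "billing" and "reports") share one Redis server: each application gets its own BullMQ prefix, and the Redis connection itself carries no keyPrefix. The helper bullOptionsFor and the application names are illustrative and not part of BullMQ; only `connection` and `prefix` are BullMQ option names.

```typescript
// Plain connection settings: note there is no keyPrefix here.
interface RedisConnection {
  host: string;
  port: number;
}

// The subset of Queue/Worker options this sketch cares about.
interface BullOptions {
  connection: RedisConnection;
  prefix: string;
}

// Hypothetical helper: one BullMQ prefix per application, shared connection settings.
function bullOptionsFor(appName: string, connection: RedisConnection): BullOptions {
  if (appName.length === 0) {
    throw new Error("appName must not be empty");
  }
  return { connection, prefix: appName };
}

const connection: RedisConnection = { host: "localhost", port: 6379 };
const billing = bullOptionsFor("billing", connection);
const reports = bullOptionsFor("reports", connection);

// With BullMQ installed, the same options object goes to both sides:
//   import { Queue, Worker } from "bullmq";
//   const queue = new Queue("cron", billing);
//   const worker = new Worker("cron", async (job) => { /* ... */ }, billing);
```

The Queue and the Worker for a given queue need the same prefix, otherwise they look at different keys and the worker never sees the jobs.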

manast avatar Aug 06 '21 15:08 manast

@loris I cannot say whether the errors are due to an older version of the cron jobs, although I do not think so. It is quite suspicious, though, that it happens with cron jobs but not with standard jobs.

manast avatar Aug 06 '21 15:08 manast

> @webfrank you do not need to use the redis prefix just pick a different prefix in BullMQ (default is "bull")

Hi, I created a different client without keyPrefix and changed the prefix in Queue and Worker, but I would prefer to use only one Redis instance.

webfrank avatar Aug 06 '21 15:08 webfrank

@webfrank the problem is that the prefix needs to be applied internally in the Lua scripts too. So, design-wise, it is better to let BullMQ handle all the prefixing; otherwise the code to handle both the ioredis auto prefix and the BullMQ prefix would be too complicated and error-prone. I think the best option now would actually be to raise an exception in BullMQ if ioredis has the prefix option enabled.
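Until something like that exists in the library, a similar check could be done in application code before the client is handed to BullMQ. This is only a sketch: assertNoKeyPrefix is a hypothetical helper, not a BullMQ function, and it assumes the client exposes its settings on an `options` object with an optional `keyPrefix` field, as ioredis clients do.

```typescript
// Only the part of an ioredis client this check needs (assumed shape).
interface RedisLike {
  options: { keyPrefix?: string };
}

// Hypothetical guard: fail fast if the connection would prefix keys on its own.
function assertNoKeyPrefix(client: RedisLike): void {
  const keyPrefix = client.options.keyPrefix;
  // An undefined or empty keyPrefix means no automatic prefixing.
  if (keyPrefix) {
    throw new Error(
      `ioredis keyPrefix "${keyPrefix}" is not compatible with BullMQ; ` +
        `use the BullMQ "prefix" option instead`
    );
  }
}

// Usage with stand-in objects; a real ioredis instance would be passed the same way.
assertNoKeyPrefix({ options: {} }); // passes silently
try {
  assertNoKeyPrefix({ options: { keyPrefix: "app1:" } });
} catch (err) {
  console.error((err as Error).message);
}
```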

manast avatar Aug 07 '21 09:08 manast

> Hi, I can confirm removing keyPrefix from IORedis connection it works but I have several applications pointing at the same redis cluster and I need to specify the keyPrefix.

You can create different connections for each part of your application: one with keyPrefix and another without.

amirAlamian avatar Aug 07 '21 13:08 amirAlamian

For anyone looking for context, here are the connection docs: https://docs.bullmq.io/guide/connections. The connection can be either a { host, port, db, ... } object or an IORedis object. It is the IORedis object that allows the keyPrefix option, which BullMQ warns against:

When using ioredis connections, be careful not to use the "keyPrefix" option in ioredis as this option is not compatible with BullMQ, which provides its own key prefixing mechanism.

ianchanning avatar Oct 22 '24 16:10 ianchanning

Both of these are relevant here too:

  • https://github.com/taskforcesh/bullmq/issues/489
  • https://github.com/OptimalBits/bull/issues/1591

ianchanning avatar Oct 23 '24 12:10 ianchanning

@loris did you find a solution to this issue? I'm also running into the same issue, with just 1 worker and default concurrency.

Also note: I'm on the Python version of bullmq, v2.11.0.

akabeera avatar Jan 07 '25 10:01 akabeera