[Feature] Do not allow removing a delayed job that is part of a repeatable meta job.
Is your feature request related to a problem? Please describe. If a delayed job that represents the next iteration of a repeatable job gets deleted by mistake, the repeatable job stops repeating, and the only fix is to add the repeatable job again.
Describe the solution you'd like An exception or error should be raised if we try to remove a delayed job using the method for removing jobs, or any other method that could indirectly remove a delayed job. The only way to remove a delayed job that is part of a repeatable job should be by removing the repeatable job altogether.
Describe alternatives you've considered None
Additional context A user reported that their repeatable jobs stopped working and that the delayed job associated with the repeatable job was missing. We should make sure this cannot happen through a user mistake in any case.
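For context, a minimal sketch of how the delayed iteration job can currently be removed by mistake (assuming BullMQ with a local Redis; the queue name, job name and calls are only illustrative):

```ts
import { Queue } from 'bullmq';

const queue = new Queue('reports', { connection: { host: 'localhost', port: 6379 } });

async function accidentallyBreakRepetition() {
  // The repeatable job; BullMQ schedules the next iteration as a delayed job.
  await queue.add('poll', {}, { repeat: { every: 60_000 } });

  // Any of the following currently deletes that delayed iteration job,
  // which silently stops the repetition:
  const [next] = await queue.getDelayed();
  await next?.remove();                  // remove the delayed job directly
  await queue.clean(0, 1000, 'delayed'); // clean all delayed jobs
  await queue.drain(true);               // drain the queue including delayed jobs
}
```

The request is that calls like these either throw or skip delayed jobs that belong to a repeatable job, so the only way to stop the repetition is to remove the repeatable job itself.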
I was able to reproduce the problem, and I’m sharing the steps I followed below as I think they might help with the solution:
- A repeated job was created.
- The job was successfully completed 3 times.
- For a reason we don't understand, after the 3rd job completed, a 'drained' event occurred and the repeatable job's entry in 'delayed' was deleted. The count of completed jobs varies; sometimes it completes 6-10 jobs, but in the end the repeated job is removed. (A repro sketch follows this list.)
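A sketch of the repro under those assumptions (local Redis, BullMQ; queue and job names are hypothetical):

```ts
import { Queue, Worker, QueueEvents } from 'bullmq';

const connection = { host: 'localhost', port: 6379 };
const queue = new Queue('sandbox', { connection });
const events = new QueueEvents('sandbox', { connection });

async function main() {
  // One repeatable job every 60 seconds, matching the options shared below.
  await queue.add('poll', {}, { repeat: { every: 60_000 } });

  let completed = 0;
  events.on('completed', async () => {
    completed += 1;
    const delayed = await queue.getDelayed();
    console.log(`completed: ${completed}, delayed jobs: ${delayed.length}`);
  });

  events.on('drained', async () => {
    // After the 3rd (sometimes 6th-10th) completion this fires and the
    // delayed iteration job is gone, so the repetition stops.
    const delayed = await queue.getDelayed();
    console.log(`drained, delayed jobs: ${delayed.length}`);
  });

  new Worker('sandbox', async () => { /* simulate work */ }, { connection });
}

main();
```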
Additionally, the situation on the Redis key side is as follows:
I wonder, what settings are you using for this repeatable job, i.e. what cron expression or 'every' interval?
Btw, the "drained" event is just triggered when there are no jobs left in the queue; it is not the reason for the next delayed job to disappear or to not be created. Furthermore, the next iteration's delayed job is created before the current job starts processing, so it should not matter what happens to the job being processed: the next delayed job should already be there. It would be interesting to see what repeat options you are using, to see if we can spot something there.
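If it helps, one quick way to verify that the next iteration is already scheduled while the current one runs (a sketch; the queue name is hypothetical):

```ts
import { Queue, Worker } from 'bullmq';

const connection = { host: 'localhost', port: 6379 };
const queue = new Queue('sandbox', { connection });

// While a repeat iteration is being processed, the next iteration should
// already be present in the delayed set.
new Worker('sandbox', async (job) => {
  const delayed = await queue.getDelayed();
  console.log(`processing ${job.id}, next delayed iteration present: ${delayed.length > 0}`);
}, { connection });
```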
Here's a sample repeat job. We use a different Redis instance for the sandbox. We were testing with just one job, so there wasn't anything in the delayed tab. Maybe that's why it was drained, but I still can't figure out why a repeated job would be drained when there's no delayed job.
```json
{
  "attempts": 0,
  "delay": 59999,
  "prevMillis": 1727846100000,
  "timestamp": 1727846095466,
  "repeat": {
    "offset": 55465,
    "key": "xx_66b4e50636054e50b75ba8f9",
    "every": 60000,
    "count": 2914
  },
  "removeOnFail": {
    "count": 100
  },
  "jobId": "repeat:xx_66b4e50636054e50b75ba8f9:1727846100000",
  "removeOnComplete": {
    "count": 100
  }
}
```
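For reference, this roughly corresponds to adding the job like this (a sketch; the real job name and data are not shown here, and fields such as prevMillis, offset and count are maintained by BullMQ itself):

```ts
import { Queue } from 'bullmq';

const queue = new Queue('sandbox', { connection: { host: 'localhost', port: 6379 } });

async function addRepeatable() {
  await queue.add(
    'xx',                                // job name (redacted in the dump above)
    {},                                  // job data
    {
      repeat: { every: 60_000 },         // "every": 60000
      removeOnComplete: { count: 100 },  // keep at most 100 completed jobs
      removeOnFail: { count: 100 },      // keep at most 100 failed jobs
    },
  );
}
```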
By any chance, can you reproduce this problem locally?
I also wonder if you are doing things like updating the repeatable job's options or data.
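Asking because, as far as I know, in the pre-scheduler API the repeatable entry is derived from its options, so changing options without removing the old entry (or removing its delayed job by hand) can leave things in an inconsistent state. The safe pattern is roughly this (a sketch, hypothetical names):

```ts
import { Queue } from 'bullmq';

const queue = new Queue('sandbox', { connection: { host: 'localhost', port: 6379 } });

async function changeInterval() {
  // Inspect the existing repeatable entries (each has a derived key).
  const repeatables = await queue.getRepeatableJobs();
  console.log(repeatables.map((r) => r.key));

  // Remove the old entry (and its pending delayed iteration) first,
  // then re-add the job with the new options.
  await queue.removeRepeatable('poll', { every: 60_000 });
  // or: await queue.removeRepeatableByKey(repeatables[0].key);
  await queue.add('poll', {}, { repeat: { every: 120_000 } });
}
```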
We have refactored and improved repeatable jobs into what we now call "Job Schedulers". They work the same as before, but the API is cleaner and more robust. We have also added guardrails so that you cannot easily remove, by mistake, a delayed job that belongs to a job scheduler. Here is the new documentation: https://docs.bullmq.io/guide/job-schedulers
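For anyone migrating, a minimal sketch of the new API (check the linked docs for the exact signatures; the scheduler id and job name here are hypothetical):

```ts
import { Queue } from 'bullmq';

const queue = new Queue('sandbox', { connection: { host: 'localhost', port: 6379 } });

async function migrate() {
  // Upsert is idempotent: calling it again with the same scheduler id updates
  // the schedule instead of creating a duplicate.
  await queue.upsertJobScheduler(
    'poll-every-minute',   // scheduler id
    { every: 60_000 },     // repeat options
    { name: 'poll', data: {}, opts: { removeOnComplete: { count: 100 } } },
  );

  // The next delayed iteration now belongs to the scheduler; to stop it,
  // remove the scheduler itself instead of the delayed job.
  await queue.removeJobScheduler('poll-every-minute');
}
```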
I recommend upgrading to these new methods, and let's see whether you still run into this issue afterwards.
This is now completed. If you still have an issue, let's open a new one, as the feature that prevents removing delayed jobs that are part of a job scheduler is now ready.