[Bug]: BullMq jobs taking too much space in redis
Version
v5.12.0
Platform
NodeJS
What happened?
I have an app in production, and right now we have almost 4.5 million delayed jobs and 45K repeating jobs. The issue is that they are taking almost 10 GB of memory, which seems strange. I am expecting around 20 million jobs, but with the space they are currently taking it is not possible to scale that much vertically. Can you let me know what the reason could be, or whether I am doing something wrong?
How to reproduce.
Add 4.5 million delayed jobs to a queue.
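Something along these lines reproduces the volume (queue name and payload are placeholders, not our actual code):

```ts
import { Queue } from 'bullmq';

const queue = new Queue('load-test', { connection: { host: 'localhost', port: 6379 } });

// Add delayed jobs in batches so we don't build one enormous pipeline.
async function addDelayedJobs(total: number, batchSize = 1000) {
  for (let offset = 0; offset < total; offset += batchSize) {
    const size = Math.min(batchSize, total - offset);
    const jobs = Array.from({ length: size }, (_, i) => ({
      name: 'notify',
      data: { userId: `user-${offset + i}` }, // deliberately tiny payload
      opts: { delay: 60 * 60 * 1000 },        // delayed by one hour
    }));
    await queue.addBulk(jobs);
  }
}

await addDelayedJobs(4_500_000);
```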
Relevant log output
No response
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
Same here since we upgraded from v5.10.3 to v5.12.10 (a huge increase in memory usage).
Hi folks, the only recent changes that involve new keys are the new repeatable structure: when you create new repeatable jobs, a new key is saved for each repeatable job that includes the repeat options. The second one is debouncing support: a new debounce key is added only when the debounce option is passed. Two questions:
- Are you adding new repeatable jobs since v5.10.0? It's better to use at least v5.10.4 (https://github.com/taskforcesh/bullmq/blob/master/docs/gitbook/changelog.md#5104-2024-07-26) to include the latest fixes regarding the new structure.
- Are you using the new debounce logic?
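For reference, this is roughly how both features are used when adding jobs; the queue and job names below are just examples, and the shape of the debounce option is what the docs describe for the 5.11/5.12 line:

```ts
import { Queue } from 'bullmq';

const queue = new Queue('TASK_NOTIFICATION_QUEUE', { connection: { host: 'localhost' } });

// Repeatable job: this is what creates an entry under bull:<queue>:repeat:*
await queue.add('daily-digest', { kind: 'digest' }, {
  repeat: { pattern: '0 9 * * *' }, // cron-style schedule
});

// Debounced job: only when this option is passed is an extra bull:<queue>:de:* key written
await queue.add('sync-user', { userId: 'abc' }, {
  debounce: { id: 'sync-user-abc', ttl: 60_000 }, // ttl in milliseconds
});
```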
> Hi folks, the only recent changes that involve new keys are the new repeatable structure: when you create new repeatable jobs, a new key is saved for each repeatable job that includes the repeat options. The second one is debouncing support: a new debounce key is added only when the debounce option is passed. Two questions:
> - Are you adding new repeatable jobs since v5.10.0? It's better to use at least v5.10.4 (https://github.com/taskforcesh/bullmq/blob/master/docs/gitbook/changelog.md#5104-2024-07-26) to include the latest fixes regarding the new structure.
> - Are you using the new debounce logic?
Thank you for your answer.
Unfortunately, we are not using repeatable jobs nor the debounce logic. We use the @nestjs/bullmq wrapper; we still have to check whether this is related to an upgrade or to something else.
Edited: We found the problem. We were sending too many requests with quite bloated payloads, which over time drastically increased the load on our Redis server. Reducing the payload and the lifetime of completed jobs helped in our case. Sorry for the misleading comments.
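In case it helps anyone else, this is roughly the kind of retention setting we mean, in plain BullMQ terms (queue name and numbers are only examples, not our exact configuration):

```ts
import { Queue } from 'bullmq';

const queue = new Queue('my-queue', {
  connection: { host: 'localhost' },
  defaultJobOptions: {
    // Keep at most 1000 completed jobs, and none older than one day (age is in seconds).
    removeOnComplete: { count: 1000, age: 24 * 3600 },
    // Keep failed jobs a bit longer for debugging, but still bounded.
    removeOnFail: { count: 5000, age: 7 * 24 * 3600 },
  },
});

// Keep payloads small: store ids and fetch the heavy data from your own database in the worker.
await queue.add('send-email', { userId: '42', templateId: 'welcome' });
```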
Hey @Haris-SP4RKy, could you also inspect your Redis keys and check whether there are any keys containing `:repeat:` or `:de:`?
> Hi folks, the only recent changes that involve new keys are the new repeatable structure: when you create new repeatable jobs, a new key is saved for each repeatable job that includes the repeat options. The second one is debouncing support: a new debounce key is added only when the debounce option is passed. Two questions:
> - Are you adding new repeatable jobs since v5.10.0? It's better to use at least v5.10.4 (https://github.com/taskforcesh/bullmq/blob/master/docs/gitbook/changelog.md#5104-2024-07-26) to include the latest fixes regarding the new structure.
> - Are you using the new debounce logic?
No, we are not using any debounce logic. We are using repeatable jobs and delayed jobs, but the volume of jobs is quite large.
> Hi folks, the only recent changes that involve new keys are the new repeatable structure: when you create new repeatable jobs, a new key is saved for each repeatable job that includes the repeat options. The second one is debouncing support: a new debounce key is added only when the debounce option is passed. Two questions:
> - Are you adding new repeatable jobs since v5.10.0? It's better to use at least v5.10.4 (https://github.com/taskforcesh/bullmq/blob/master/docs/gitbook/changelog.md#5104-2024-07-26) to include the latest fixes regarding the new structure.
> - Are you using the new debounce logic?

> Thank you for your answer.
> Unfortunately, we are not using repeatable jobs nor the debounce logic. We use the @nestjs/bullmq wrapper; we still have to check whether this is related to an upgrade or to something else.
> Edited: We found the problem. We were sending too many requests with quite bloated payloads, which over time drastically increased the load on our Redis server. Reducing the payload and the lifetime of completed jobs helped in our case. Sorry for the misleading comments.
We are also using @nestjs/bullmq, and we have also implemented logic to keep only 100 successful and 100 failed jobs for no more than 7 days, and the payload of each job is no more than two UUID v4 strings.
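Roughly, our setup looks like this (module and queue names are placeholders, not our exact code):

```ts
import { Module } from '@nestjs/common';
import { BullModule } from '@nestjs/bullmq';

@Module({
  imports: [
    BullModule.registerQueue({
      name: 'CLASS_NOTIFICATION_QUEUE',
      defaultJobOptions: {
        // Keep only 100 completed and 100 failed jobs, and nothing older than 7 days.
        removeOnComplete: { count: 100, age: 7 * 24 * 3600 },
        removeOnFail: { count: 100, age: 7 * 24 * 3600 },
      },
    }),
  ],
})
export class NotificationsModule {}
```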
@Haris-SP4RKy just to clean up the context here a bit: are you experiencing this large memory consumption after an upgrade of BullMQ, or is it unrelated to the BullMQ version used?
> Hey @Haris-SP4RKy, could you also inspect your Redis keys and check whether there are any keys containing `:repeat:` or `:de:`?
`bull:TASK_NOTIFICATION_QUEUE:repeat:34bbc9e9064ac54259859101ce9f0b23:1725382800000` is one example key; it contains `repeat`.
@Haris-SP4RKy you can also use the MEMORY USAGE command (https://redis.io/docs/latest/commands/memory-usage/) to inspect the memory consumption of a given key. This could help you find which key or keys are consuming more memory than expected.
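A rough sketch of how you could combine SCAN with MEMORY USAGE from Node, using ioredis (the pattern and the top-N cutoff are just examples):

```ts
import Redis from 'ioredis';

const redis = new Redis({ host: 'localhost', port: 6379 });

// Walk all bull:* keys and report the ones using the most memory.
async function findHeavyKeys(pattern = 'bull:*', top = 20) {
  const sizes: Array<[string, number]> = [];
  const stream = redis.scanStream({ match: pattern, count: 1000 });

  for await (const keys of stream) {
    for (const key of keys as string[]) {
      // MEMORY USAGE returns the number of bytes used by the key and its value.
      const bytes = (await redis.call('MEMORY', 'USAGE', key)) as number | null;
      if (bytes) sizes.push([key, bytes]);
    }
  }

  sizes.sort((a, b) => b[1] - a[1]);
  console.table(sizes.slice(0, top));
  await redis.quit();
}

await findHeavyKeys();
```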
> @Haris-SP4RKy just to clean up the context here a bit: are you experiencing this large memory consumption after an upgrade of BullMQ, or is it unrelated to the BullMQ version used?
We started directly on the latest version, so I don't know what the actual cause is or how it would look on a previous version.
@Haris-SP4RKy you should also check the keys that include the "id" as postfix, as those are the actual jobs.
They would look something like this:
"bull:test-39244dbe-5287-4d5b-9815-964ee48acc53:id"
You can also check the contents of the events key, for example this one: "bull:test-0dc5b09d-2727-4595-8e43-37e7dc366bcb:events"
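For example, something along these lines (ioredis, with the queue prefix as a placeholder) shows how large the events stream has grown and what a single job hash contains:

```ts
import Redis from 'ioredis';

const redis = new Redis();
const prefix = 'bull:test-0dc5b09d-2727-4595-8e43-37e7dc366bcb'; // replace with your queue prefix

// The events key is a Redis stream; a very long stream can use a lot of memory.
console.log('events entries:', await redis.xlen(`${prefix}:events`));
console.log('events bytes:', await redis.call('MEMORY', 'USAGE', `${prefix}:events`));

// Each job is stored as a hash keyed by its id.
console.log(await redis.hgetall(`${prefix}:1`));
```

If the events stream turns out to be the part that keeps growing, it is worth looking at the queue's `streams.events.maxLen` option to cap its length.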
@manast I only have keys that contain `repeat` in them, no other keys besides that. There are no keys that have `id` as a postfix.
bull:DAILY_NOTIFICATION_QUEUE:repeat:616f4e64f45c20b7485be92e8a06a9c3
bull:DAILY_NOTIFICATION_QUEUE:repeat:d4000a6b6778333f9c251f35877a9d9c:1725408000000
bull:CLASS_NOTIFICATION_QUEUE:USERS:NOTIFICATION:CLASS:00402233-d2ca-47a6-8d73-3717f73b1a78:82ccf8ea-72b3-478f-a372-f06d5f2deca7
All keys follow these structures. The last one is for a delayed job; the other two are from repeatable jobs.
We're having similar issues: all of a sudden Redis runs out of memory. We're thinking this may be a build-up of stalled jobs, since a quick server restart usually solves the issue (we don't need to go and manually force-delete all keys in Redis).
@magnusburton stalled or not, jobs always take the same amount of memory. For the different statuses we only store the job id, which by default is an increasing integer that does not consume a lot of memory.
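If you want to verify that yourself, a quick look with ioredis (queue name and job id are placeholders) shows that the state structures only hold ids, while the payload lives in one hash per job:

```ts
import Redis from 'ioredis';

const redis = new Redis();
const prefix = 'bull:my-queue'; // example prefix

// State structures only hold job ids (small integers by default)...
console.log(await redis.zrange(`${prefix}:delayed`, 0, 4)); // e.g. [ '1', '2', '3', '4', '5' ]
console.log(await redis.lrange(`${prefix}:wait`, 0, 4));

// ...while the actual payload and options live in one hash per job.
console.log(await redis.hgetall(`${prefix}:1`)); // name, data, opts, timestamp, ...
```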