
Possible Memory Leak - heapdumps show continuous growth in compiled code being retained by dummy job

Open garrettg123 opened this issue 10 months ago • 8 comments

Version

v5.39.1

Platform

NodeJS

What happened?

As seen in the heap dump comparison after ~100k dummy job completions, there is compiled code that looks like Redis commands being retained. Tested with concurrency = 1 and 5, rate limited to 1 or 100 jobs per second.

[heap dump comparison screenshot]

How to reproduce.

new Worker(
  myQueue,
  job => {
    // empty
  },
  {
    connection: new IORedis({
      host: env.REDIS_HOST || 'localhost',
      port: Number(env.REDIS_PORT) || 6379,
      maxRetriesPerRequest: null,
    }),
    concurrency: 5,
    limiter: {
      max: 100,
      duration: 1000,
    },
  }
)

Relevant log output

When tracking metrics, it appears that only about 1% of jobs coincide with a drained event:

  Worker stats: {
    workers: 1,
    active: 157430,
    progress: 0,
    completed: 157430,
    stalled: 0,
    failed: 0,
    errored: 0,
    drained: 1575,
  }


This was calculated using the handlers:

worker.on('completed', job => {
  WORKER_STATS.completed++
})

worker.on('failed', (job, error) => {
  WORKER_STATS.failed++
})

worker.on('error', error => {
  WORKER_STATS.errored++
})

worker.on('progress', (job, progress) => {
  WORKER_STATS.progress++
})

worker.on('stalled', job => {
  WORKER_STATS.stalled++
})

worker.on('active', job => {
  WORKER_STATS.active++
})

worker.on('drained', () => {
  WORKER_STATS.drained++
})

Code of Conduct

  • [x] I agree to follow this project's Code of Conduct

garrettg123 avatar Jan 31 '25 07:01 garrettg123

Could you provide the complete source code? The snippet above is not enough, as it does not add any jobs, and so on.

manast avatar Jan 31 '25 09:01 manast

Of course:

import { Queue, Worker } from 'bullmq'
import IORedis from 'ioredis'

const env = process.env

// producer
const myQueueName = 'my-queue'
const queue = new Queue(myQueueName, {
  connection: new IORedis({
    host: env.REDIS_HOST || 'localhost',
    port: Number(env.REDIS_PORT) || 6379,
    maxRetriesPerRequest: null,
  }),
  defaultJobOptions: {
    removeOnComplete: true,
  },
})

setInterval(() => queue.add(myQueueName, {}), 100)

// worker
new Worker(
  myQueueName,
  job => {
    // empty
  },
  {
    connection: new IORedis({
      host: env.REDIS_HOST || 'localhost',
      port: Number(env.REDIS_PORT) || 6379,
      maxRetriesPerRequest: null,
    }),
    concurrency: 5,
    limiter: {
      max: 100,
      duration: 1000,
    },
  }
)

garrettg123 avatar Jan 31 '25 17:01 garrettg123

Btw, jobs are not "drained"; the drained event is emitted when the queue becomes empty, i.e. when there are no more jobs to process.

manast avatar Feb 03 '25 14:02 manast
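
To illustrate the distinction, here is a minimal sketch (the queue name and connection settings are placeholders, not taken from the original report): completed fires once per finished job, while drained fires only when the worker finds the queue empty, so the two counters are not expected to match.

import { Worker } from 'bullmq'

const worker = new Worker(
  'my-queue',
  async job => {
    // empty processor, mirroring the repro above
  },
  { connection: { host: 'localhost', port: 6379, maxRetriesPerRequest: null } }
)

// Fires once for every job that finishes successfully.
worker.on('completed', job => {
  console.log(`completed job ${job.id}`)
})

// Fires only when the worker has emptied the queue, not once per job,
// so this count will normally be far lower than the completed count.
worker.on('drained', () => {
  console.log('queue is empty, waiting for new jobs')
})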

Furthermore, by definition this code is going to generate a leak:

{
    connection: new IORedis({
      host: env.REDIS_HOST || 'localhost',
      port: Number(env.REDIS_PORT) || 6379,
      maxRetriesPerRequest: null,
    }),

You are passing an instance of IORedis that you never close (BullMQ only closes connections that it created itself).

manast avatar Feb 03 '25 14:02 manast
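
As a reference, a minimal shutdown sketch under the assumption that the connection is created by the application, as in the snippet above (the SIGTERM handler is just an example): the worker is closed first and then the IORedis instance, because BullMQ will not quit a connection it did not create.

import IORedis from 'ioredis'
import { Worker } from 'bullmq'

// Connection created by the application, not by BullMQ.
const connection = new IORedis({
  host: process.env.REDIS_HOST || 'localhost',
  port: Number(process.env.REDIS_PORT) || 6379,
  maxRetriesPerRequest: null,
})

const worker = new Worker('my-queue', async job => {}, { connection })

// On shutdown, close the worker first and then the connection you own,
// since BullMQ only closes connections it created itself.
process.on('SIGTERM', async () => {
  await worker.close()
  await connection.quit()
})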

I am running this code, which creates around 1k jobs per second, and I have kept it running for some time, about 10 minutes, triggering the garbage collector from time to time; memory is quite stable at around 10-11 MB. So I am going to need more proof that there is indeed a memory leak in this code.

import { Queue, Worker } from "bullmq";

const queueName = "test-leaks";

// producer
const queue = new Queue(queueName, {
  connection: {
    host: "localhost",
    port: 6379,
    maxRetriesPerRequest: null,
  },
  defaultJobOptions: {
    removeOnComplete: true,
  },
});

setInterval(() => queue.add(queueName, {}), 1);

// worker
new Worker(
  queueName,
  (job) => {
    // empty
  },
  {
    connection: {
      host: "localhost",
      port: 6379,
      maxRetriesPerRequest: null,
    },
    concurrency: 5,
    limiter: {
      max: 100,
      duration: 1000,
    },
  }
);

manast avatar Feb 03 '25 14:02 manast

Here is some proof that there are no leaks: after running for 15 minutes I took a new heap snapshot, and the allocations produced between snapshots 1 and 2 (thousands of jobs were processed between those two snapshots) were almost none, just 29 KB, which is most likely some GC collection delay:

[heap snapshot comparison screenshot]

manast avatar Feb 03 '25 14:02 manast

Now, there could still be a leak, but it is too small to be debugged, unfortunately. This is the state of the NodeJS ecosystem: some leaks you just have to live with as long as they are small enough, since we do not have tools that can guarantee 100% that there are no leaks.

manast avatar Feb 03 '25 14:02 manast
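
For anyone wanting to gather stronger evidence, here is a minimal sketch of capturing comparable heap snapshots with Node's built-in v8 module (the interval and file names are arbitrary; start the process with --expose-gc if you want to force a collection before each snapshot, so diffs are not dominated by garbage that simply has not been collected yet):

import { writeHeapSnapshot } from 'v8'

let snapshotCount = 0

setInterval(() => {
  // Force a GC first if the process was started with --expose-gc,
  // so snapshots compare live objects rather than uncollected garbage.
  if (global.gc) global.gc()

  // Writes a .heapsnapshot file that can be loaded and diffed in the
  // Memory tab of Chrome DevTools.
  const file = writeHeapSnapshot(`worker-${++snapshotCount}.heapsnapshot`)
  console.log(`wrote ${file}`)
}, 5 * 60 * 1000)

Comparing two snapshots taken thousands of jobs apart, as done above, is what shows whether the retained compiled code keeps growing or stabilizes.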

Reopened in case the author can provide more evidence.

manast avatar Feb 03 '25 14:02 manast

hi @garrettg123, could you please upgrade to a version at or above the one in https://github.com/taskforcesh/bullmq/blob/master/docs/gitbook/changelog.md#5542-2025-06-17 and let us know

roggervalf avatar Aug 29 '25 03:08 roggervalf

feel free to reopen it if you have more insights

roggervalf avatar Oct 27 '25 05:10 roggervalf