Possible memory leak when using the experimental Execution Cancellation feature
Issue workflow progress
Progress of the issue based on the Contributor Workflow
- [ ] 1. The issue provides a reproduction available on GitHub, Stackblitz or CodeSandbox. Make sure to fork this template and run npm run generate in the terminal. Please make sure the GraphQL Tools package versions under package.json match yours.
- [ ] 2. A failing test has been provided
- [ ] 3. A local solution has been provided
- [ ] 4. A pull request is pending review
Describe the bug
While analyzing a possible memory leak in a graphql-yoga server with Node.js heap profiling and Chrome DevTools, I saw some suspicious results which could indicate a memory leak in @graphql-tools/executor.
The first screenshot shows a comparison of two heap snapshots of the service; the later one (Snapshot 4) had received about 10,000 more mutation requests than Snapshot 3. Memory is very stable except for some new allocations (see the red frame). The number of new allocations is a multiple (x2, x4 or x6) of the number of received requests.
If I drill down to the source of the allocations, it points to this place (@graphql-tools/executor/cjs/execution/execute.js):
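Roughly, the allocations point at the spot where an 'abort' listener is attached to the signal from the execution arguments. A paraphrased sketch of that shape (not the exact compiled source):
// Paraphrase of the region the heap deltas point at, not the exact
// library source: an 'abort' listener is attached to the signal passed
// in via the execution arguments.
if (signal) {
  // One closure per executed operation. If the listener is never removed
  // and the signal outlives the request, these closures pile up.
  signal.addEventListener('abort', () => {
    // ...mark the execution as aborted...
  });
}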
To Reproduce
Steps to reproduce the behavior:
If these really are leaking event listeners, every graphql-yoga server using @graphql-tools/executor should leak a little memory over time.
Expected behavior
No leaking listeners
Environment:
- OS: Arch Linux, Kernel 6.17
- @graphql-tools/...:
- "@graphql-tools/executor-http": "3.0.4",
- "@graphql-tools/merge": "9.1.1",
- "@graphql-tools/schema": "10.0.25",
- "@graphql-tools/stitch": "10.0.2",
- "@graphql-tools/utils": "10.9.1",
- "@graphql-tools/wrap": "11.0.1",
- "graphql": "16.11.0",
- "graphql-fields": "2.0.3",
- "graphql-scalars": "1.24.2",
- "graphql-tag": "2.12.6",
- "graphql-yoga": "5.16.0",
- NodeJS: 24.9.0
Additional context
Sorry that I didn't provide tests or a reproduction repo (yet). Tests may be very difficult to write, and if there really is a leaking-listener problem, one should be able to reproduce the issue (with the Node.js heap profiler and Chrome DevTools heap snapshot comparison) with every program using @graphql-tools/executor.
Maybe I am totally wrong here and the behavior I see in the snapshot comparison is normal.
First of all, thank you for creating the issue! It would be great if we could get some bare-minimum setup that reproduces the issue, because we have different test suites that check for potential memory leaks, both in this repo and in other projects that use Yoga:
Tools -> https://github.com/ardatan/graphql-tools/blob/master/.github/workflows/tests.yml#L115
Yoga -> https://github.com/graphql-hive/graphql-yoga/blob/main/.github/workflows/ci.yml#L128
Hive Gateway -> https://github.com/graphql-hive/gateway/blob/main/.github/workflows/memtest.yml
About that event listener:
As you can see in the code, the event listener is registered when a signal is set in the execution parameters;
Here -> https://github.com/ardatan/graphql-tools/blob/master/packages/executor/src/execution/execute.ts#L499
And GraphQL Yoga sets it when this plugin is used;
https://the-guild.dev/graphql/yoga-server/docs/features/execution-cancellation
Here -> https://github.com/graphql-hive/graphql-yoga/blob/main/packages/graphql-yoga/src/plugins/use-execution-cancellation.ts#L10
So it is disabled by default; that's why it'd be better if we could learn more about your setup.
In the meantime, maybe you can disable this plugin (if you use it) until we figure the issue out.
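For illustration, assuming an otherwise standard Yoga setup, disabling it is just a matter of removing the plugin from the list:
import { createSchema, createYoga } from 'graphql-yoga';

const yoga = createYoga({
  schema: createSchema({
    typeDefs: /* GraphQL */ `
      type Query {
        hello: String
      }
    `,
    resolvers: { Query: { hello: () => 'world' } },
  }),
  // plugins: [useExecutionCancellation()], // disabled while debugging the leak
  plugins: [],
});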
I also didn't see the version of @graphql-tools/executor you have in your codebase, so maybe you can check whether it is the latest by using a command like yarn why @graphql-tools/executor, or whichever variant of this your package manager provides.
cc @enisdenjo
Thanks for the quick reply. That is a lot of useful information. Maybe I can create a simple example server now that reproduces the error.
Yes, I use useExecutionCancellation(); I will retry the leak test tomorrow without it. It is experimental anyway, and I don't really remember why I activated it in the first place.
Here is the why output:
❯ pnpm why @graphql-tools/executor
Legend: production dependency, optional only, dev only
dependencies:
@graphql-tools/stitch 10.0.2
├─┬ @graphql-tools/batch-delegate 10.0.1
│ └─┬ @graphql-tools/delegate 11.0.1
│ └── @graphql-tools/executor 1.4.9
├─┬ @graphql-tools/delegate 11.0.1
│ └── @graphql-tools/executor 1.4.9
├── @graphql-tools/executor 1.4.9
└─┬ @graphql-tools/wrap 11.0.1
└─┬ @graphql-tools/delegate 11.0.1
└── @graphql-tools/executor 1.4.9
@graphql-tools/wrap 11.0.1
└─┬ @graphql-tools/delegate 11.0.1
└── @graphql-tools/executor 1.4.9
@graphql-yoga/render-graphiql 5.16.0
└─┬ graphql-yoga 5.16.0 peer
└── @graphql-tools/executor 1.4.9
graphql-yoga 5.16.0
└── @graphql-tools/executor 1.4.9
Ok, I did two more tests now:
First I stressed it with bombardier using a simple query like:
bombardier -c 3 -d 60s -l -k -H 'content-type: application/json' -m POST -b '{"query":"{findManyLogin {Id} }"}' https://127.0.0.1:4400/graphql/drill.0
This also reproduced the leak with a simple query (I had used a custom script with urql and mutations before).
Then I commented out useExecutionCancellation() in my code and re-ran the test. Bombardier made about 40000 requests in one minute.
This was the result:
Absolutely no + deltas anymore; the leak was gone.
Thank you for digging into this, we very much appreciate it! I will try to reproduce this leak on our side and will come back to you with my findings.
Ok, I tried a lot of things, I could not get it to leak. 🤔
I made a PR adding a memtest, with execution cancellation enabled, for both Hive Gateway (powered by Yoga) and Yoga itself. No leaks reported. The memtests work the same way you tested: they do a few runs of load testing, GC in between runs, take heap snapshots and then compare the deltas of each heap snapshot.
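In outline (a rough sketch with made-up names, not the actual harness code), each memtest does something like this:
import v8 from 'node:v8';

// Run load, force a GC, take a heap snapshot, repeat; the snapshots are
// then loaded into DevTools (or diffed programmatically) to compare deltas.
async function memtestSketch(runLoadtest: () => Promise<void>, runs = 3) {
  for (let i = 0; i < runs; i++) {
    await runLoadtest(); // one round of load testing
    (globalThis as any).gc?.(); // requires node --expose-gc
    v8.writeHeapSnapshot(`run-${i}.heapsnapshot`);
  }
}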
I did my local testing matching your Node version: 24.9.0.
You can run the memtests locally yourself (make sure you have k6 installed) with:
MEMTEST=clean,keepheapsnaps,short yarn test:mem run execution-cancellation -t yoga
If it passes, you can double-check by loading the e2e/execution-cancellation/*.heapsnapshots, in order, into your DevTools and comparing them as you would.
Maybe the Yoga setup for the test is too simple, or maybe there is something else happening.
If the tests pass for you locally too, we would kindly ask you to give more insights or create a reproduction so that we can test exactly that scenario. Thanks for helping us debug this!
Ok, I got this to run on my dev machine.
I will gradually expand yoga-server.ts with my setup now until I (hopefully) can reproduce the problem.
The first dependency from my service I added was Fastify, so I modified yoga-server.ts from the test (see the Fastify example):
import fastify, { FastifyReply, FastifyRequest } from 'fastify';
import {
createSchema,
createYoga,
useExecutionCancellation,
YogaInitialContext,
} from 'graphql-yoga';
export const schema = createSchema({
typeDefs: /* GraphQL */ `
type Query {
hello: String
}
`,
resolvers: {
Query: {
hello: () => 'world',
},
},
});
// This is the fastify instance you have created
const server = fastify({ logger: true });
// This will allow Fastify to forward multipart requests to GraphQL Yoga
server.addContentTypeParser('multipart/form-data', {}, (_req, _payload, done) => done(null))
// Create a Yoga instance with a GraphQL schema.
const yoga = createYoga<{
req: FastifyRequest;
reply: FastifyReply;
} | YogaInitialContext>({
schema,
plugins: [useExecutionCancellation()],
logging: {
debug: (...args) => args.forEach((arg) => server.log.debug(arg)),
info: (...args) => args.forEach((arg) => server.log.info(arg)),
warn: (...args) => args.forEach((arg) => server.log.warn(arg)),
error: (...args) => args.forEach((arg) => server.log.error(arg)),
},
});
server.route({
// Bind to the Yoga's endpoint to avoid rendering on any path
url: yoga.graphqlEndpoint,
method: ['GET', 'POST', 'OPTIONS'],
handler: (req, reply) =>
yoga.handleNodeRequestAndResponse(req, reply, {
req,
reply,
}),
});
// Start the server and you're done!
server.listen({ port: parseInt(process.env['PORT']!) }, (_error, serverUrl) => {
console.info(`Server is running on ${serverUrl}${yoga.graphqlEndpoint}`);
});
Unfortunately, when I ran
MEMTEST=clean,keepheapsnaps,short yarn test:mem run execution-cancellation -t Yoga
I got an error from the test framework:
time="2025-10-10T11:24:16+02:00" level=error msg="test aborted: Status is not 200, got status 415 and body:\n{\"statusCode\":415,\"code\":\"FST_ERR_CTP_INVALID_MEDIA_TYPE\",\"error\":\"Unsupported Media Type\",\"message\":\"Unsupported Media Type: application/x-www-form-urlencoded\"} at default (file:///home/gunter/tmp/gateway/internal/perf/src/graphql-loadtest-script.ts:27:24(70))"
Not sure what causes this; if I start yoga-server.ts directly, it behaves as expected.
The next dependencies I'll try to add (if Fastify doesn't trigger the memory leak) are:
- @graphql-tools/stitch
- @graphql-tools/wrap
- @graphql-tools/executor-http
You can modify internal/perf/src/graphql-loadtest-script.ts to match your server's requirements. It could be that http.post somehow sends the wrong header, assuming it is posting a form instead of JSON. Are you running the latest k6?
Did this to ensure a JSON content type; can you try with those changes?
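Roughly, the change forces an explicit JSON content type on k6's http.post, along these lines (URL and query are placeholders):
import http from 'k6/http';

export default function () {
  // Force a JSON content type so Fastify does not reject the request as
  // application/x-www-form-urlencoded.
  http.post(
    'http://localhost:4000/graphql', // placeholder, adjust to the server under test
    JSON.stringify({ query: '{ hello }' }),
    { headers: { 'Content-Type': 'application/json' } },
  );
}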
Just found this myself; yes, running it now :)
Results:
It doesn't seem to be the same memory leak, though (string, concatenated-string).
And this leak is still there when I remove the useExecutionCancellation plugin.
I'll continue adding more dependencies now.
(string) and (concatenated-string) are not really what we're looking for here. The former is just a standard string allocated in memory, and the latter is a special V8 representation used to speed up string concatenation. It's weird but not really something to worry about; it should not have anything to do with the issue at hand.
Keep us updated!
I didn't have much time, but today I made a little experiment: I started my server and made a few simple queries.
I put breakpoints in the execute function here:
During a query the code only ever passed through the line at (2); it never passed through (1). So if the signal never receives the abort event and the signal is the same every time, doesn't that mean that event listeners are leaking?
I will continue my experiments when I have more time :)
A little more debugging; after 6 requests:
Indeed, the same signal is used every time and the listeners are piling up.
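Node's getEventListeners helper from node:events also works for EventTarget, so it can confirm the pile-up directly; a small sketch, assuming a spot in the executor with access to the signal:
import { getEventListeners } from 'node:events';

// Log how many 'abort' listeners are currently attached to the signal;
// a count that grows with every request confirms the leak.
function logAbortListenerCount(signal: AbortSignal) {
  console.log('abort listeners:', getEventListeners(signal, 'abort').length);
}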
Btw, the stack trace down to this place is very long, but I saw batching (which I activated) and some proxying (I use stitching) on the way down; maybe I'll add this to the leak test.
Hey @gunters63, just circling back! Any chance you got those steps into the test, or can you share the leaking setup? I'd like to handle this as soon as possible. Thanks!
Yeah, hopefully I have some time later today
So I started the yoga-server.ts in the execution-cancellation directory manually with node --import tsx yoga-server.ts and made some requests. I noticed that here, too, it never called signal.removeEventListener.
So, by my theory, it should also leak event listeners here, and I wondered why it doesn't.
Then I saw the following in the stack trace:
So the memory leak test doesn't seem to use a real AbortController/AbortSignal, but some kind of test helper. Maybe this test helper does not leak when listeners are added but never removed?
I didn't find an easy way to run the memory leak test with a real AbortController/AbortSignal.
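For reference, driving the server with a real per-request signal outside the harness is just a plain fetch (URL and query are placeholders):
// Each request gets its own controller, so each request carries a fresh,
// real AbortSignal instead of the harness's test helper.
const ctrl = new AbortController();
const res = await fetch('http://localhost:4000/graphql', {
  method: 'POST',
  headers: { 'content-type': 'application/json' },
  body: JSON.stringify({ query: '{ hello }' }),
  signal: ctrl.signal,
});
console.log(await res.json());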
I am getting close :)
I put the following debugging code directly into execute.js (plus a let lastSignal = null at the beginning of the module).
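Reconstructed roughly from the output below (the exact placement, right before the addEventListener call, is an assumption):
// At module scope:
let lastSignal = null;

// Right before signal.addEventListener('abort', ...):
if (signal !== lastSignal) {
  console.log('different signal');
  lastSignal = signal;
} else {
  console.log('same signal');
}
console.log('adding listener');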
When I made requests to the yoga-server.ts in the execution-cancellation directory, I got this output:
❯ PORT=4000 node --inspect --import tsx yoga-server.ts
Debugger listening on ws://127.0.0.1:9229/23e8c889-a6a4-40aa-8b65-a618e99a4cb5
For help, see: https://nodejs.org/en/docs/inspector
Server is running on http://localhost:4000/graphql
different signal
adding listener
different signal
adding listener
different signal
adding listener
different signal
adding listener
When I made some requests against my own yoga server, I got this:
different signal
adding listener
same signal
adding listener
same signal
adding listener
same signal
adding listener
same signal
adding listener
Somehow, in my server the same signal is used every time, and that leaks the event listeners.
I'll try to play around and add more stuff to the test yoga-server.ts; maybe I can find the reason for the shared signal.
Btw, here is how I create the yoga server:
I added some of those options to the test yoga-server in the execution-cancellation test directory, but it still doesn't re-use the signal. It also doesn't happen when I add Fastify.
In my code I never use AbortSignal or AbortController, so I currently have no idea why in my server the same signal is used for every query.
It seems to have something to do with @graphql-tools/executor-http.
I have a 'global' route where I just serve a schema with Yoga. This route does use a different AbortController for every request:
different signal
adding listener
different signal
adding listener
different signal
adding listener
I have a few other routes where I stitch schemas for different instances of a Prisma server, using buildHTTPExecutor to forward the subschema requests.
If I use those routes:
adding listener
same signal
adding listener
same signal
adding listener
same signal
For this internal forwarding I use a call like this:
const globalExec = buildHTTPExecutor({
endpoint: urlPrefix + globalRouteOptions.url,
// eslint-disable-next-line @typescript-eslint/unbound-method
fetch: globalYoga.fetch, // use Yoga's fetch; this does not send an actual HTTP request!
});
I also saw this in buildHTTPExecutor:
Maybe this is where the single-instance AbortController comes from.
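If, and this is only a hypothesis rather than a claim about the actual executor-http source, the executor kept one controller for its whole lifetime and fell back to its signal whenever a request carries none of its own, every request would observe the same long-lived signal, matching the logs above:
// Hypothetical pattern, for illustration only; the names are made up.
// A controller created once, when the executor is built...
const executorLifetimeCtrl = new AbortController();

function pickSignal(perRequestSignal?: AbortSignal): AbortSignal {
  // ...and used as a fallback on every request without its own signal:
  // each execution then attaches an 'abort' listener to this one shared
  // signal, and nothing ever removes it.
  return perRequestSignal ?? executorLifetimeCtrl.signal;
}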
I'll try to replicate this in the test.
I tried to reproduce the problem in a minimal server, so I modified yoga-server.ts like this:
import { buildHTTPExecutor } from '@graphql-tools/executor-http';
import { stitchSchemas } from '@graphql-tools/stitch';
import fastify, { FastifyReply, FastifyRequest } from 'fastify';
import {
createSchema,
createYoga,
useExecutionCancellation,
YogaInitialContext,
} from 'graphql-yoga';
export const schema = createSchema({
typeDefs: /* GraphQL */ `
type Query {
hello: String
}
`,
resolvers: {
Query: {
hello: () => 'world',
},
},
});
// This is the fastify instance you have created
const server = fastify({ logger: true });
// This will allow Fastify to forward multipart requests to GraphQL Yoga
server.addContentTypeParser('multipart/form-data', {}, (_req, _payload, done) =>
done(null),
);
// Create a Yoga instance with a GraphQL schema.
const subYoga = createYoga<
| {
req: FastifyRequest;
reply: FastifyReply;
}
| YogaInitialContext
>({
schema,
parserAndValidationCache: false,
graphqlEndpoint: '/stitched'
});
server.route({
// Bind to the Yoga's endpoint to avoid rendering on any path
url: subYoga.graphqlEndpoint,
method: ['GET', 'POST', 'OPTIONS'],
handler: (req, reply) =>
subYoga.handleNodeRequestAndResponse(req, reply, {
req,
reply,
}),
});
const executor = buildHTTPExecutor({
endpoint: `https://127.0.0.1:${process.env['PORT']}/stitched`,
fetch: subYoga.fetch,
});
const stitchedSchema = stitchSchemas({
subschemas: [
{
schema,
executor,
},
],
});
const mainYoga = createYoga<
| {
req: FastifyRequest;
reply: FastifyReply;
}
| YogaInitialContext
>({
schema: stitchedSchema,
parserAndValidationCache: false,
plugins: [useExecutionCancellation()],
logging: {
debug: (...args) => args.forEach((arg) => server.log.debug(arg)),
info: (...args) => args.forEach((arg) => server.log.info(arg)),
warn: (...args) => args.forEach((arg) => server.log.warn(arg)),
error: (...args) => args.forEach((arg) => server.log.error(arg)),
},
});
server.route({
// Bind to the Yoga's endpoint to avoid rendering on any path
url: mainYoga.graphqlEndpoint,
method: ['GET', 'POST', 'OPTIONS'],
handler: (req, reply) =>
mainYoga.handleNodeRequestAndResponse(req, reply, {
req,
reply,
}),
});
// Start the server and you're done!
server.listen({ port: parseInt(process.env['PORT']!) }, (_error, serverUrl) => {
console.info(`Stitched server is running on ${serverUrl}${mainYoga.graphqlEndpoint}`);
});
This minimal server now has almost all the parts my real server has.
I still could not reproduce the problem, and the logging statements I added in @graphql-tools/executor-http told me that the AbortController was not shared:
different signal
adding listener
{"level":30,"time":1761308226175,"pid":159743,"hostname":"taubenus","reqId":"req-6","res":{"statusCode":200},"responseTime":1.806793000549078,"msg":"request completed"}
{"level":30,"time":1761308226348,"pid":159743,"hostname":"taubenus","reqId":"req-7","req":{"method":"POST","url":"/graphql","host":"localhost:4000","remoteAddress":"127.0.0.1","remotePort":35726},"msg":"incoming request"}
different signal
adding listener
{"level":30,"time":1761308226350,"pid":159743,"hostname":"taubenus","reqId":"req-7","res":{"statusCode":200},"responseTime":1.799499001353979,"msg":"request completed"}
{"level":30,"time":1761308226504,"pid":159743,"hostname":"taubenus","reqId":"req-8","req":{"method":"POST","url":"/graphql","host":"localhost:4000","remoteAddress":"127.0.0.1","remotePort":35726},"msg":"incoming request"}
different signal
adding listener
I have no idea why the AbortController is not shared in this minimal server, while in my real server (which triggers the memory leak) it is. I am giving up for now, but I still think there is a lingering problem here.
Oh boy, this one is weird! Thank you for working on this. I followed your progress, and we will pick it up as soon as we can. I'll try to implement what you shared and get a repro; your insights are helpful.