sentry-javascript
[node][7.102.0] Suspicion of Memory Leak
Is there an existing issue for this?
- [X] I have checked for existing issues https://github.com/getsentry/sentry-javascript/issues
- [X] I have reviewed the documentation https://docs.sentry.io/
- [X] I am using the latest SDK release https://github.com/getsentry/sentry-javascript/releases
How do you use Sentry?
Sentry Saas (sentry.io)
Which SDK are you using?
@sentry/node
SDK Version
7.102.0
Framework Version
express.js
Link to Sentry event
No response
SDK Setup
Sentry.init({
  dsn: process.env.SENTRY_DNS,
  release: 'n/a',
  environment: process.env.NODE_ENV,
  integrations: [
    new Sentry.Integrations.Http({ tracing: true }),
    new Sentry.Integrations.Express({
      app,
      methods: ['post', 'get', 'delete'],
    }),
  ],
  tracesSampleRate: 0,
});
Steps to Reproduce
- add app.use(Sentry.Handlers.requestHandler()); (a minimal sketch of such an app follows below)
- run request
- check profile
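For orientation, here is a minimal sketch of the kind of Express app these steps assume. The /logs route and port 5001 match the repro instructions further down; everything else is an assumption, not the actual repro code.

```js
// Minimal sketch only -- not the actual repro repository.
const express = require('express');
const Sentry = require('@sentry/node');

const app = express();

Sentry.init({
  dsn: process.env.SENTRY_DSN, // assumed env var name
  tracesSampleRate: 0,
});

// The middleware under suspicion: it wraps every request in its own Sentry scope.
app.use(Sentry.Handlers.requestHandler());

app.get('/logs', (req, res) => {
  res.json({ ok: true });
});

app.listen(5001);
```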
Expected Result
n/a
Actual Result
I would like to point out that the problem is not one-to-one; that is, not every request gets stuck in memory. Over dozens of requests, sometimes only a single object is retained, sometimes more. I didn't find any particular pattern.
You may notice that multiple hubs are created, and the data (in locals) is not removed from the response object (and the response object is not deleted either).
If I remove app.use(Sentry.Handlers.requestHandler());, there are no memory issues.
Hey @xr0master thanks for writing in!
Unfortunately this is going to be quite hard to figure out without a reproducible example. Any chance you could provide one?
Another question:
> You may notice that multiple hubs are created
Are you creating multiple hubs on purpose? Or is this just something you observed with the single Sentry.init call you mentioned in the issue?
Hey @Lms24
> Unfortunately this is going to be quite hard to figure out without a reproducible example. Any chance you could provide one?
It's definitely not easy... I'll try to do it.
> Are you creating multiple hubs on purpose? Or is this just something you observed with the single Sentry.init call you mentioned in the issue?
Second option. Just a regular integration with express.js. As far as I remember, only one hub should be created for the entire application, is that correct?
@Lms24 https://github.com/xr0master/sentry-memo-leak
npm run build
npm run dev
get request http://localhost:5001/logs
Since the application is small, catching this issue is even harder. It took dozens of requests at different intensities, plus several snapshots, before I managed to reproduce it. Good luck with this.
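One hedged way to drive that kind of load (this script is not part of the repro repo; run the server with node --inspect and take heap snapshots in Chrome DevTools between rounds):

```js
// Hypothetical load driver: fires batches of requests at /logs with varying
// intensity, mirroring the "dozens of requests with different intensities" above.
const http = require('node:http');

function hit(path) {
  return new Promise((resolve, reject) => {
    http
      .get({ host: 'localhost', port: 5001, path }, (res) => {
        res.resume();           // drain the body
        res.on('end', resolve); // resolve once the response is fully consumed
      })
      .on('error', reject);
  });
}

async function main() {
  for (let round = 0; round < 10; round++) {
    const batch = 20 + round * 20; // ramp up concurrency each round
    await Promise.all(Array.from({ length: batch }, () => hit('/logs')));
  }
}

main().catch(console.error);
```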
@xr0master thanks for the repro and for all the details! Debugging such leaks is very hard but we'll try to get to the bottom of this. One question though: Did this only happen in the most recent version? Have you by any chance tried an older SDK version where this didn't happen before?
We noticed a memory leak in our production environment and I'm pretty sure it might be the same issue. Just dumping some info in case any of this helps.
We were on @sentry/node 7.91.0 and upgraded to 7.104.0 during our routine package maintenance. We noticed the memory leak in production. I took 7 or so memory snapshots that I still have. When comparing the snapshots I noticed a constant increase in the Delta size, especially in strings and objects. Looking more into the string ones, they all looked something like this:
Always referring to hub.js -> transaction.js, like in @xr0master's screenshot.
Like @xr0master, we are also running Express with the same setup as in the config above. Node 18 on Elastic Beanstalk. We did have the node/tracing package installed, which I believe is deprecated. We did remove it, but I assume that wasn't what fixed it.
We downgraded about 2 hours ago to 7.91.0 and memory has been normalized since.
@Lms24 let me know if there is anything we can provide to assist.
Same here: memory leak after upgrading Sentry from 7.84.0 to 7.102.0. Reverted back to 7.84 and we're OK again. Node v18.17.0, also noticed this in Node v20.11.1
Echoing this, we have upgraded from 6.2.2 to 7.101.1 and are encountering severe memory leaks with similar heap snapshots as those posted above!
Hey thanks a lot for everyone's input so far! Based on your replies, I think we can narrow things down a little bit and we'll try to get this prioritized as a couple of people have echoed the issue.
What we know for now:
- Affected SDK: @sentry/node
- Seems to be connected to using the SDK in an Express app
- Suspected version range: >@sentry/[email protected]
- If anyone has used a more recent SDK version where the leak does not yet occur, please let us know!
As far as I can tell, the reproduction repository supplied by @xr0master doesn't reproduce a memory leak. I've load tested the endpoint and, regardless of whether I use v7.84.0, v7.91.0 or v7.107.0, those ~6 Hub instances in the screenshots are the only instances.
If anyone can send me some heap snapshots I'd be happy to take a look at those!
FWIW, we tried to downgrade to 7.91.0 and also encountered memory leaks, so my personal suspicion is that the issue was introduced before that. Again, we upgraded from 6.2.2 to 7.101.1 which would probably need bisecting to figure out where the leak was introduced.
> FWIW, we tried to downgrade to 7.91.0 and also encountered memory leaks, so my personal suspicion is that the issue was introduced before that. Again, we upgraded from 6.2.2 to 7.101.1 which would probably need bisecting to figure out where the leak was introduced.
That's interesting, because we downgraded to 7.91.0 and it resolved our issues. Only for our Node project, though; Next / SvelteKit are both running on 7.104.0 (no memory leak for those).
@Patrick-Ullrich would you be able to provide a reproduction example (or maybe even give us access to your project to take a look)? We were unsuccessful in reproducing with @xr0master's repo. We'd like to take this seriously because a memory leak would obviously be really bad.
@lforst I'm sure any senior engineer with experience can easily reproduce the issue.
@xr0master I guess we're not senior enough then 😄
@lforst It's okay. Just work a little bit harder, and one day, you will become one of them.
In any case, I doubt that anyone will give you access to production code or a snapshot from production since this would be a big blow to the project's security.
@xr0master
> work a little bit harder, and one day, you will become one of them
doing my best!
> In any case, I doubt that anyone will give you access to production code or a snapshot from production since this would be a big blow to the project's security.
It has happened before and helped tremendously. Also lets us rule out configuration issues. Having a working minimal reproduction and fixing this is in everybody's interest.
@lforst servus
Sorry for the late reply, I finally had some time to look more into this. Similar to you, I was unable to reproduce the memory leak using @xr0master's repo. Or at least memory never really spiked as much as I anticipated, no matter how much load I put on the server. I ended up taking our app and stripping things down until I finally found the culprit.
I dropped a reproduction here (thanks @xr0master, I just repurposed yours) with some instructions on how to monitor / reproduce it.
https://github.com/Patrick-Ullrich/sentry-mem-leak
I believe the culprit is setting tracingHandler before requestHandler, which the docs do point out not to do.
https://docs.sentry.io/platforms/node/guides/express/:

// The request handler must be the first middleware on the app
app.use(Sentry.Handlers.requestHandler());

// Creates a mem leak
app.use(Sentry.Handlers.tracingHandler());
app.use(Sentry.Handlers.requestHandler());

// Everything is fine
app.use(Sentry.Handlers.requestHandler());
app.use(Sentry.Handlers.tracingHandler());
Since we have tons of different middleware, it took a while to notice that they were reversed in our main app; it never seemed to cause an issue before.
@lforst let me know if this helps and/or if I can help in any other way.
I also went through some versions to see when it was introduced. It seems to start in 7.100.0. 7.99 and before works just fine.
Take what I write here with a grain of salt but to me it kinda makes sense that the incorrect order causes issues.
The requestHandler creates a new async context/execution context to avoid leaking data between requests. The tracingHandler, on the other hand, will create a root span/transaction and set it on the current scope. Now the unfortunate interaction here is that with the incorrect order:
- Request comes in
- tracingHandler starts a span/transaction and puts it onto the current hub's scope
- requestHandler creates a new async context, which in turn creates a new hub with a new scope
- The request happens within the async context and finishes
- The requestHandler async context is exited
- tracingHandler ends the span/transaction
My guess is that this works fine as long as the server load is low, but if multiple simultaneous requests are going on, the scope that's active when the tracingHandler is doing its thing might lose transactions which should actually be isolated via the async context. Now for some reason (probably because we're keeping references to them somewhere) the "lost" transactions aren't garbage collected.
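To make the suspected interaction concrete, here is an illustrative sketch (not Sentry's actual implementation) of what happens when the tracing middleware runs before the middleware that sets up per-request isolation, assuming the isolation is built on AsyncLocalStorage like the SDK's async context strategy:

```js
// Illustrative only -- fakeRequestHandler / fakeTracingHandler are stand-ins,
// not the real Sentry handlers.
const express = require('express');
const { AsyncLocalStorage } = require('node:async_hooks');

const storage = new AsyncLocalStorage();
const globalScope = { spans: [] }; // shared fallback "scope" that lives forever

function fakeTracingHandler(req, res, next) {
  // With the reversed order this runs outside any per-request context, so the
  // span lands on the shared global scope instead of an isolated one.
  const scope = storage.getStore() ?? globalScope;
  scope.spans.push({ url: req.url, startedAt: Date.now() });
  next();
}

function fakeRequestHandler(req, res, next) {
  // Creates a fresh per-request scope -- but the span above has already been
  // attached to globalScope, so it is never released.
  storage.run({ spans: [] }, next);
}

const app = express();
app.use(fakeTracingHandler); // wrong order: tracing before request isolation
app.use(fakeRequestHandler);
app.get('/', (req, res) => res.json({ leakedSpans: globalScope.spans.length }));
app.listen(3000);
```

With every request, globalScope.spans keeps growing, which is the same shape of accumulation the heap snapshots above show for hubs/transactions.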
To other folks affected by the leak in this thread: does switching the order (i.e. requestHandler before tracingHandler) fix the issue for you?
Also worth noting that with v8 of the SDK, these handlers won't be required anymore as performance instrumentation is handled by OpenTelemetry under the hood.
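For anyone planning that upgrade, here is a rough sketch of the v8-style setup, based on the v8 migration docs at the time of writing; treat the exact API and ordering requirements as subject to change and check the migration guide.

```js
// Rough v8 sketch: http/Express instrumentation is automatic (OpenTelemetry
// under the hood), so requestHandler/tracingHandler are gone.
const Sentry = require('@sentry/node');

// Init must run before express/http are required so auto-instrumentation can
// patch them (the docs recommend a dedicated instrument file loaded first).
Sentry.init({
  dsn: process.env.SENTRY_DSN,
  tracesSampleRate: 0,
});

const express = require('express');
const app = express();

app.get('/logs', (req, res) => res.json({ ok: true }));

// Only the error handler needs explicit registration, after all routes.
Sentry.setupExpressErrorHandler(app);

app.listen(5001);
```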
We don't use tracingHandler. I've tried to distill our use of Sentry in our main app file (CoffeeScript), see below:
beforeSend = (event, hint) ->
  if _.get(hint, "originalException.skipSentry")
    logger.debug "Not sending error to sentry: #{event.exception.values[0].value}"
    return null
  isNonErrorException = _.get(event, "exception.values[0].value", "").startsWith('Non-Error exception captured') or _.get(hint.originalException, "message", "").startsWith('Non-Error exception captured')
  if isNonErrorException
    return null
  return event

# See here: https://github.com/getsentry/sentry-javascript/issues/1661
filterIntegrations = (defaultIntegrations) ->
  return defaultIntegrations.filter((x) -> not x.name.match(/onuncaught|onunhandled/i))

Sentry = require "@sentry/node"
Sentry.init({sendTimeout: 5, beforeSend: beforeSend, integrations: filterIntegrations})

process.on("unhandledRejection", (reason, p) ->
  logger.error("Unhandled Rejection at: Promise #{p}, reason: #{reason}")
  if reason.stack? and (config.nodeEnv is "development")
    logger.error reason.stack
  Sentry.captureException(reason)
)

process.on("uncaughtException", (err, origin) ->
  logger.error "Uncaught Exception, origin: #{origin}, reason: #{err.message}"
  #if reason.stack?
  #  logger.error reason.stack
  Sentry.captureException(err)
)

# Setting up the Express app
app = express()
app.use(Sentry.Handlers.requestHandler())
app.use(express.json({limit: "25mb", type: ["application/json", "application/vnd.api+json"]}))
require("./routes")(app)
app.use Sentry.Handlers.errorHandler({shouldHandleError: (err) ->
  if err.isJoi
    return false
  if err.status and err.status > 499
    return true
  return false
})
I have a feeling that it's this PR, because from that one onwards we started storing scopes on spans. I haven't investigated yet, but we may have a bunch of hard references keeping a tree of stuff around. We should fix this somehow.
Fwiw, we don't use Sentry in our express config at all, only in our Vue application and also noticed this memory leak (after upgrading 7.84.0 to 7.102.0).
We simply execute Sentry.init with some config options: ignoreErrors, environment, attachStacktrace (true), release, debug (false, unless on dev/test), tracesSampleRate set to 0, sampleRate set to a very low figure (as we have lots of traffic), allowUrls (a regex with only our domain), and the Vue app itself (for the Sentry.Vue; for server side rendering / Sentry.Node we can't supply the app).
Testing is quite time-consuming, but if you have a suspect version I'm happy to help with a loadtest on our application with a before/after.
@daan-nu if you could provide heap snapshots or similar, that would be a huge help for us to debug further.
> for the Sentry.Vue, for server side rendering
I recommend you don't use @sentry/vue for SSR at all, fwiw. It relies on browser APIs, and even does stack trace parsing based on expected stack traces from a browser, so running it in a server environment is not recommended.
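In practice that means initializing per runtime, roughly like the hedged sketch below of a browser-only entry. File names, the env variable, and the Vite-style import.meta.env are assumptions, not the commenter's setup; the server entry would init @sentry/node separately.

```js
// client-entry.js (browser bundle only) -- illustrative, not the commenter's setup.
import { createApp } from 'vue';
import * as Sentry from '@sentry/vue';
import App from './App.vue';

const app = createApp(App);

Sentry.init({
  app,
  dsn: import.meta.env.VITE_SENTRY_DSN, // assumed env variable name
  environment: import.meta.env.MODE,
  sampleRate: 0.01,      // mirrors the "very low figure" mentioned above
  tracesSampleRate: 0,
});

app.mount('#app');

// The server entry would call require('@sentry/node').init({...}) instead,
// keeping @sentry/vue and its browser APIs out of the Node process.
```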
We are experiencing this issue as well on production with next.js.
Adding a screenshot that shows the memory increase from version 7.95.0 -> 7.110.1. I can't provide a reproduction, sorry.
Snapshot from the Grafana memory tracker.
@sarunast did you check the order of our handlers as described in this comment above?
@Lms24 since he mentioned that this is a Next.js app any express handler shenanigans are unlikely (though not impossible).
@sarunast I'd be interested in whether you have any more involved Sentry code somewhere in your app or whether you just use what we recommend as a default. 🤔
> @Lms24 since he mentioned that this is a Next.js app any express handler shenanigans are unlikely (though not impossible).
> @sarunast I'd be interested in whether you have any more involved Sentry code somewhere in your app or whether you just use what we recommend as a default. 🤔
As you said, it's Next.js, so we don't configure it for Express. We use the very basic recommended Next.js Sentry setup, so there is not much to add. There is one odd thing about the setup: we do not deploy via Vercel but via our own custom Dockerized deployments to AWS. Due to this custom setup, Sentry only works for the client and the backend reporting doesn't even work. Not sure if it helps; it would be great to fix the backend reporting, but I think it's not related to this issue?
> As you said, it's Next.js, so we don't configure it for Express.
Apologies, I completely missed this in your first comment.
Unfortunately, I think we're talking about multiple different memory leaks and scenarios by now. The only reproduction we received so far was for the Node SDK which no longer leaks memory if the order of registering the Sentry Node/Express handlers is corrected. Furthermore, users are mentioning different version ranges when the leak started happening.
If you're able to provide a minimal reproduction for the leak you experience in your NextJS, we'd greatly appreciate it!
Perhaps it is, @Lms24, but what caught my attention was this comment:
> I also went through some versions to see when it was introduced. It seems to start in 7.100.0. 7.99 and before works just fine.
Which is true for us.
Looking at https://github.com/getsentry/sentry-javascript/blob/develop/CHANGELOG.md#71000
The related changes are:
- fix(nextjs): Do not report redirects and notFound calls as errors in server actions (#10474)
- fix(nextjs): Fix navigation tracing on app router (#10502)
- fix(nextjs): Apply server action data to correct isolation scope (#10514)
- fix(node): Use normal require call to import Undici (#10388)
What Node version is everyone using? What SDK features is everyone using (errors, performance, profiling, etc.)? For Next.js in particular, are you using the app or pages router?
Because of the variety of different permutations of possible setups, sharing your package.json, next.config.js, etc. would be extremely helpful to allow us to narrow this down!
@AbhiPrasad I think it is this change, where we introduced strong references to scopes on spans: https://github.com/getsentry/sentry-javascript/pull/10492
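As a rough illustration (not the SDK's actual classes) of how a strong span↔scope reference pair can keep whole transactions reachable once anything long-lived roots one side:

```js
// Illustrative only: Scope/Span here are toy classes, not Sentry's.
class Scope {
  constructor() {
    this.transaction = undefined;
  }
}

class Span {
  constructor(scope) {
    this.scope = scope; // strong reference back to the scope active at start time
  }
}

const longLivedRegistry = new Set(); // stand-in for any module-level cache that is never cleared

function handleRequest(url) {
  const scope = new Scope();
  const transaction = new Span(scope);
  scope.transaction = transaction; // scope -> span and span -> scope

  // If anything outside the request keeps a reference to either object,
  // the whole pair (and everything hanging off the scope) survives GC.
  longLivedRegistry.add(transaction);
}

for (let i = 0; i < 1000; i++) handleRequest(`/logs?i=${i}`);
console.log('retained transactions:', longLivedRegistry.size); // 1000, never freed
```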