[BUG]: memory usage on dd-trace-js
Tracer Version(s)
5.53.0
Node.js Version(s)
22
Bug Report
Hi, we are experiencing high memory usage with the latest dd-trace versions. We've tested versions from 5.40 to 5.53.0 and are getting OOM errors. With dd-trace 5.22.0 we don't get OOM errors. The following graph shows usage when updating our tracer:
After downgrading to 5.22.0, memory usage stabilizes again.
Related to https://github.com/DataDog/dd-trace-js/issues/5554
Reproduction Code
No response
Error Logs
No response
Tracer Config
No response
Operating System
No response
Bundling
Unsure
@diecgia did you try 5.28?
This other ticket mentioned it as also being a good version: https://github.com/DataDog/dd-trace-js/issues/5690
I am facing a similar issue. I will try downgrading to 5.28 to test.
Hi @JCMais, I've tried 5.28, and it seems to work, I don't get OOM errors with this version.
We are currently trying to gather more information about each individual case of this memory leak issue. Some of what we need can be shared publicly on GitHub, but some would require a private channel, so ideally I would recommend opening a support ticket. Please feel free to share the ticket number in this issue or send it directly to me on our public Slack so that I can expedite the escalation process.
In the support ticket, please provide the following information:
- If the issue appeared after an upgrade, what version did the issue appear in?
  - Please be as precise as possible about the exact version in which the issue first appeared. This will allow us to isolate the code change that is responsible. For example, reporting that 5.0.0 works but 5.50.0 doesn't is not as helpful as knowing that 5.1.2 works but 5.1.3 doesn't.
  - Since we had a different issue with runtime metrics in 5.41.1, please make sure to disable them with DD_RUNTIME_METRICS_ENABLED=false before any bisecting to avoid any false positives.
  - If disabling runtime metrics resolves the issue, let us know as well, as that would mean the leak is there.
- If the issue happens with all other products disabled except tracing, the issue is likely in one of our integrations. I would recommend trying to disable individual integrations to isolate the issue to one of them. Integrations can be fully disabled with, for example, DD_TRACE_INSTRUMENTATIONS_DISABLED=express,mysql and DD_TRACE_PLUGINS_DISABLED=express,mysql. You can find the full list of integrations enabled for the service in startup logs (which can be enabled as described below).
- Do you have any other services that have or don't have the issue?
  - If yes, are there any obvious differences between the ones that do and the ones that don't?
- Please provide the following if possible:
  - Your package.json
  - Startup logs, which can be outputted by starting the service with DD_TRACE_STARTUP_LOGS=true
  - [optional] Debug logs, which can be outputted by starting the service with DD_TRACE_DEBUG=true.
    - Note: this is extremely verbose, so enable this with caution, ideally in a dev or staging environment.
  - [optional] Two heap dumps, one after 1h of starting the service and another one 2h after.
    - If you can provide even more heap dumps, for example after waiting another hour and calling gc() a few times, that's even better. The gc function can be exposed by starting the service with NODE_OPTIONS='--expose-gc', and it needs to be called more than once for a full GC to happen.
  - Any other information you deem relevant about your environment or the application itself.
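For the heap dumps requested above, one possible approach is to write them from inside the process using Node's built-in v8 module. The following is only a minimal sketch: the helper names captureHeapSnapshot and scheduleHeapSnapshots, the output directory, and the number of GC passes are assumptions made for illustration and are not part of dd-trace. The gc calls only run when the service was started with NODE_OPTIONS='--expose-gc'.

```javascript
const v8 = require('node:v8')
const path = require('node:path')

const HOUR_MS = 60 * 60 * 1000

// Hypothetical helper (not part of dd-trace): optionally forces a few GC
// passes, then writes a heap snapshot into `dir` and returns its path.
function captureHeapSnapshot (dir, label, gcPasses = 3) {
  // global.gc is only defined when Node.js is started with --expose-gc.
  if (typeof global.gc === 'function') {
    // Call it more than once, as suggested above, so a full GC can happen.
    for (let i = 0; i < gcPasses; i++) global.gc()
  }
  const file = path.join(dir, `${label}-${process.pid}.heapsnapshot`)
  // v8.writeHeapSnapshot is synchronous and blocks the event loop while
  // it runs. It returns the name of the file that was written.
  return v8.writeHeapSnapshot(file)
}

// Hypothetical helper: schedules one snapshot per delay, by default 1h and
// 2h after this function is called (ideally right at service startup).
function scheduleHeapSnapshots (dir, delaysMs = [HOUR_MS, 2 * HOUR_MS]) {
  return delaysMs.map((delayMs, index) => {
    const timer = setTimeout(() => {
      captureHeapSnapshot(dir, `heap-${index + 1}`)
    }, delayMs)
    // Do not keep the process alive only because of these timers.
    timer.unref()
    return timer
  })
}
```

Calling scheduleHeapSnapshots('/tmp') once at startup would then produce two .heapsnapshot files that can be attached to the support ticket. Writing a snapshot pauses the process and needs a significant amount of extra memory, so on a service that is already close to its memory limit it is safer to try this in a dev or staging environment first.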
If you know of a version that works for you and doesn't have the memory leak, please keep using it for now until we update this issue with a resolution.
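If you know both a working and a leaking version, the exact version where the issue first appeared can be narrowed down with a binary search over the releases in between. The sketch below is only an illustration: findFirstBadVersion and the isBad callback are hypothetical names, and it assumes that the first entry of the list is known to work, the last entry is known to leak, and every version after the first leaking one leaks as well.

```javascript
// Hypothetical helper (not part of dd-trace): binary search for the first
// leaking version in an ordered list of versions.
// Assumptions: versions[0] is known good, the last entry is known bad, and
// `isBad(version)` resolves to true when that version shows the leak.
async function findFirstBadVersion (versions, isBad) {
  let good = 0                   // index of the newest version known to work
  let bad = versions.length - 1  // index of the oldest version known to leak
  while (bad - good > 1) {
    const mid = Math.floor((good + bad) / 2)
    if (await isBad(versions[mid])) {
      bad = mid   // the leak is already present at `mid`
    } else {
      good = mid  // `mid` still works, so the change came later
    }
  }
  return { lastGood: versions[good], firstBad: versions[bad] }
}
```

In practice isBad would be a manual step: deploy that version with DD_RUNTIME_METRICS_ENABLED=false, watch memory usage for long enough to see the trend, and report the result. Each round halves the remaining range, so even a long list of releases only needs a handful of deployments.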
Thank you for your patience and understanding while we investigate this issue.
Hello @rochdev, Thanks for your response, I've created the following support ticket: https://help.datadoghq.com/hc/en-us/requests/2164066
@diecgia Thank you for the support ticket. I took a look and while I'm not sure yet exactly what the problem is, I think the information you provided helped me narrow it down to a specific code change. I'll try to add some additional config options to the library to further narrow down the issue.