dd-trace-js
dd-trace-js copied to clipboard
Multiple spans per request + 50% performance drop
Hi, Using apm with a nodejs service. On every request I am seeing multiple spans created where I presume there should only be one. My service is using koa. Tracing is using the standard configuration as described in the docs. I have tried setting config options from code and from env variables but that made no difference.
We have also see a 50% drop in performance when tracing is enabled. Is this a bug..? Is there a workaround..?

- Node version: 12.16.0
Is this a bug..?
It's a bit difficult to know since it is expected that you would have a span for each middleware that is called for the request, but with most middleware being anonymous functions it's hard to determine which one should or should not appear, especially with the 3 dispatch ones. Generally speaking however, it is expected that you would see many middleware spans within a request.
Is there a workaround..?
If you don't want to see middleware, you can turn it off with an option on the plugin:
tracer.use('koa', { middleware: false })
We have also see a 50% drop in performance when tracing is enabled.
This is unlikely to be the middleware spans, but I would still recommend to try the above just in case. If that doesn't work, the issue is likely with async_hooks which is a built-in Node module we use to propagate the tracing context through asynchronous boundaries. Historically there were many issues with async_hooks, and most of them have been fixed in more recent versions of Node.
Can you try upgrading Node to 16.3 to see if that fixes the issue, or at least makes the overhead low enough to not impact the application? If that's not possible in production, at least trying that in some testing environment would help validate that this is indeed the case.
@skimonkey Did you get a chance to try the above for the performance issue?
@rochdev yes we were able to disable the middleware spans as per your suggestion. However we didnt see any CPU improvements. Also upgrading NodeJs showed only a very slight CPU improvement which was most likely due to improvements in NodeJs
Also upgrading NodeJs showed only a very slight CPU improvement which was most likely due to improvements in NodeJs
Correct, we've actually been making improvements to Node directly since Node itself is the biggest overhead for the tracer, more specifically the async_hooks module that we use for context propagation, as described in #1095.
Relevant PRs:
https://chromium-review.googlesource.com/c/v8/v8/+/2759188 https://github.com/nodejs/node/pull/36394 https://github.com/nodejs/node/pull/38821 https://github.com/nodejs/node/issues/39019
Upgrading to at least 16.3.0 (but ideally 16.4.0 since there was a bug in 16.3) should significantly reduce the performance overhead. However, it's possible that either we have some other performance issues elsewhere, or the fix doesn't make that much of an improvement for your app specifically. Since every performance issue to this day were with async_hooks, I would say to start by isolating that. This can be done by disabling or removing the tracer completely, and then adding this snippet to your code:
require('async_hooks').createHook({ init () {} }).enable()
If you still see a significant performance drop with this snippet in the latest version of Node, then it means that the fix we did was not enough and we need to look into this further. If the app runs fine with this snippet, then it means the problem is elsewhere in the tracer, but for now I would focus on async_hooks first since historically it has always been the problem.
@skimonkey were you able to try on the latest Node.js 16 or Node.js 17, to see if the async_hooks fixes help your situation?
Looks like this is pretty stale. Feel free to reopen if it's still relevant.