dd-trace-js Next.JS plugin integration

Env :

OS : MAC/Linux
Datadog agent version: ?
dd-trace: v4.26.0
Node:  v20.10.0
React: v18.3.0-canary-d900fadbf-20230929
Next.JS : v14.0.4
Next’s build type: standalone

Hey 👋🏼 !

I’m working on a Next.JS app that uses the app directory, is built in standalone mode, and is packaged in a Docker image to be deployed on a K8s cluster. I’ve made multiple attempts to integrate dd-trace’s next plugin, but it doesn’t seem to be working:

// instrumentation.node.ts

import Tracer from "dd-trace";

const tracer = Tracer.init({
  logInjection: true,
  startupLogs: true,
});

tracer.use("next");

// instrumentation.ts

export async function register() {
  // NEXT_RUNTIME cannot be frozen
  if (process.env.NEXT_RUNTIME === "nodejs") {
    await import("./instrumentation.node");
  }
}

I do see some traces showing up in the DD APM UI, but they only show methods like GET | POST, with no path or route information.

After digging into these traces, it seems they were created by the http plugin instead of the next one.

We ended up patching the dd-trace dependency (http plugin) to get something working:

diff --git a/packages/datadog-plugin-http/src/client.js b/packages/datadog-plugin-http/src/client.js
index 42833bb896f64e5cbf37840f4a4087a346715aa5..dc0c552c6dafa297c80ebd77179f1a21accf51a7 100644
--- a/packages/datadog-plugin-http/src/client.js
+++ b/packages/datadog-plugin-http/src/client.js
@@ -42,7 +42,7 @@ class HttpClientPlugin extends ClientPlugin {
         [COMPONENT]: this.constructor.id,
         'span.kind': 'client',
         'service.name': this.serviceName({ pluginConfig: this.config, sessionDetails: extractSessionDetails(options) }),
-        'resource.name': method,
+        'resource.name': `${method} ${uri}`,
         'span.type': 'http',
         'http.method': method,
         'http.url': uri,
diff --git a/packages/datadog-plugin-http/src/server.js b/packages/datadog-plugin-http/src/server.js
index dcf4614819efec27f59a979f360d44c98c0ca4f2..cbc380936e31e4961f7bbee70925245dffaec88d 100644
--- a/packages/datadog-plugin-http/src/server.js
+++ b/packages/datadog-plugin-http/src/server.js
@@ -33,7 +33,11 @@ class HttpServerPlugin extends ServerPlugin {
       res,
       this.operationName()
     )
+    const url = new URL(req.url)
+
     span.setTag(COMPONENT, this.constructor.id)
+    span.setTag('resource.name', `${req.method} ${url.pathname}`)
+
 
     this._parentStore = store
     this.enter(span, { ...store, req, res })
@@ -63,6 +67,9 @@ class HttpServerPlugin extends ServerPlugin {
       incomingHttpRequestEnd.publish({ req, res: context.res })
     }
 
+
+    web.setRoute(req, req.url)
+
     web.finishAll(context)
   }

Am I missing something in my configuration ?

Jan 26 '24 11:01 Tarektouati

import { registerOTel } from '@vercel/otel';

export const register = async () => {
  if (process.env.NEXT_RUNTIME === 'nodejs') {
    const { TracerProvider } = (await import('dd-trace')).default.init({
      logInjection: true,
      startupLogs: true,
    });

    const provider = new TracerProvider();

    registerOTel();
    provider.register();
  }
};

Would you like to try this?

Feb 02 '24 02:02 Sh031224

@Sh031224 Didn't work for me, I also tried to do registerOTel before the init but it also didn't help. Did it work for you?

Feb 05 '24 12:02 Lisenish

@Lisenish The important thing is to transfer otel data to datadog using the provider.

If you only use datadog, it seems that you cannot fully use the spans provided by next.js.

Feb 08 '24 08:02 Sh031224

@Sh031224 Oh, sorry for the late reply 🙇 Actually I was able to see it after my message here, so yeah it seems this approach works.

We still needed to group the resource.name on our own, though, since by default it doesn't group anything and just records each individual URL as a separate resource (e.g. /items/1 and /items/2 are separate resources).

tracer.use('http', {
    hooks: {
      request(span, req) {
        if (span && req) {
          // Client requests expose `path`, incoming server requests expose `url`
          const urlString = 'path' in req ? req.path : req.url;

          if (urlString) {
            // The base is only needed so that relative URLs can be parsed
            const url = new URL(urlString, 'http://localhost');
            const path = url.pathname + url.search;
            const resourceGroup = getPathGroup(url.pathname); // our custom function to generalize the url
            const method = req.method;

            span.setTag('resource.name', method ? `${method} ${resourceGroup}` : resourceGroup);
            span.setTag('http.route', method ? `${method} ${path}` : path);
          }
        }
      },
    },
});

It also creates a lot of odd operations (in addition to web.request) named after each request's unique URL, e.g. an operation GET items_342223. We decided not to do anything about that for now.
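
For readers wondering what a getPathGroup-style helper might look like, here is a minimal sketch. The implementation is hypothetical (the real function was not posted in this thread): it assumes ID-like segments are either purely numeric or UUIDs, and the ":id" placeholder is an arbitrary choice.

```typescript
// Hypothetical helper: collapses ID-like path segments so that
// /items/1 and /items/2 end up under the same resource name.
const NUMERIC_SEGMENT = /^\d+$/;
const UUID_SEGMENT = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function getPathGroup(pathname: string): string {
  const grouped = pathname
    .split("/")
    .map((segment) =>
      NUMERIC_SEGMENT.test(segment) || UUID_SEGMENT.test(segment) ? ":id" : segment
    )
    .join("/");
  // Treat an empty path as the root
  return grouped === "" ? "/" : grouped;
}

// Builds the value that the hook above passes to span.setTag('resource.name', ...)
function resourceName(method: string | undefined, pathname: string): string {
  const group = getPathGroup(pathname);
  return method ? `${method} ${group}` : group;
}
```

With this sketch, resourceName("GET", "/items/1") and resourceName("GET", "/items/2") both produce "GET /items/:id". Slug-style segments such as items_342223 would need additional rules.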

Feb 20 '24 03:02 Lisenish

import { registerOTel } from '@vercel/otel';

export const register = async () => {
  if (process.env.NEXT_RUNTIME === 'nodejs') {
    const { TracerProvider } = (await import('dd-trace')).default.init({
      logInjection: true,
      startupLogs: true,
    });

    const provider = new TracerProvider();

    registerOTel();
    provider.register();
  }
};

Would you like to try this?

This almost works - I get an exception on the datadog Tracer implementation


2024-03-01T07:19:50.992697065Z stderr F TypeError: parentTracer.getSpanLimits is not a function
2024-03-01T07:19:50.99269989Z stderr F     at new Span (/app/node_modules/@prisma/instrumentation/node_modules/@opentelemetry/sdk-trace-base/build/src/Span.js:59:41)
2024-03-01T07:19:50.992702455Z stderr F     at /app/node_modules/@prisma/instrumentation/dist/chunk-VVAFFO6L.js:59:20
2024-03-01T07:19:50.992704769Z stderr F     at Array.forEach (<anonymous>)
2024-03-01T07:19:50.992707324Z stderr F     at ActiveTracingHelper.createEngineSpan (/app/node_modules/@prisma/instrumentation/dist/chunk-VVAFFO6L.js:44:27)
2024-03-01T07:19:50.992709588Z stderr F     at Xi.createEngineSpan (/app/node_modules/@prisma/client/runtime/library.js:123:1645)
2024-03-01T07:19:50.992716832Z stderr F     at vt.logger (/app/node_modules/@prisma/client/runtime/library.js:113:1167)
2024-03-01T07:19:50.992719607Z stderr F     at /app/node_modules/@prisma/client/runtime/library.js:113:922

I'm able to get around that by monkey-patching the provider:

import { registerOTel } from "@vercel/otel";

export async function register() {
  try {
    if (process.env.NEXT_RUNTIME === "nodejs") {
      console.log("Registering tracing");
      process.env.WEIGHTS_SERVICE = "weights-nextjs-serverless";

      const tracer = await import("~/tracing");
      const { PrismaInstrumentation } = await import("@prisma/instrumentation");

      const provider = new tracer.TracerProvider();
      const baseTracer = provider.getTracer.bind(provider);
      provider.getTracer = (name: string, version?: string) => {
        const newTracer = baseTracer(name, version);
        // @ts-ignore
        newTracer.getSpanLimits = () => ({});
        return newTracer;
      };

      registerOTel({
        serviceName: "weights-nextjs-serverless",
        instrumentations: ["auto", new PrismaInstrumentation()],
      });

      // Register the provider globally
      provider.register();
    }
  } catch (e) {
    console.error(e);
  }
}

But then I get an exception from the startSpan method:

Registering tracing
TypeError: Cannot read properties of undefined (reading '_traceId')
    at Tracer.startSpan (/var/task/node_modules/dd-trace/packages/dd-trace/src/opentelemetry/tracer.js:38:25)
    at Tracer.startActiveSpan (/var/task/node_modules/dd-trace/packages/dd-trace/src/opentelemetry/tracer.js:112:23)
    at /var/task/node_modules/next/dist/server/lib/trace/tracer.js:122:103
    at AsyncLocalStorage.run (node:async_hooks:346:14)
    at Za.with (file:///var/task/node_modules/@vercel/otel/dist/node/index.js:20:16621)
    at ContextAPI.with (/var/task/node_modules/@opentelemetry/api/build/src/api/context.js:60:46)
    at NextTracerImpl.trace (/var/task/node_modules/next/dist/server/lib/trace/tracer.js:122:28)
    at /var/task/node_modules/next/dist/compiled/next-server/server.runtime.prod.js:16:3795
    at AsyncLocalStorage.run (node:async_hooks:346:14)
    at Za.with (file:///var/task/node_modules/@vercel/otel/dist/node/index.js:20:16621)
Error: Runtime exited without providing a reason
Runtime.ExitError

Mar 01 '24 07:03 jonluca

Hello everyone, I hit the same dead end as most of you here. I am running Next.js 14 with the app router.

The only way I managed to get it working (although I'm not sure it is fully working yet) was to create a JS file, server-preload.js:

const packageJSON = require('../package.json');

function setUpDatadogTracing() {
	const tracer = require('dd-trace');

	tracer.init({
		runtimeMetrics: true,
		logInjection: true,
		env: 'dev',
		service: `myapp`,
		version: packageJSON?.version ?? 'unknown'
	});
}

setUpDatadogTracing();

I then load it from package.json with node -r ./server-preload.js ./node_modules/.bin/next start. With this, Resources no longer shows only GET and POST: I get GET /_not-found for 404 pages, GET /about, and so on, based on the pages I have.

The version also comes through for each new release I make, and the dev env is set properly.

Logs are ingested too, but only the ones I emit through an internal logger I built on Pino. The others are not coming in because they are not in JSON format.

There is a way, in the file above, to patch console.log and make it emit JSON, but that is a can of worms: a lot of cleanup is needed to make it work, and it could break with any Next update.
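
To make the console-patching idea concrete, here is a rough sketch of the formatting half only. Everything in it is an assumption for illustration: the toJsonLine name, the field names (level, message, timestamp), and how arguments are flattened. It does not inject trace IDs and is not something dd-trace provides.

```typescript
type Level = "debug" | "info" | "warn" | "error";

// Hypothetical formatter: turns console-style arguments into one JSON line.
function toJsonLine(level: Level, args: unknown[], now: Date = new Date()): string {
  const message = args
    .map((arg) => {
      if (typeof arg === "string") return arg;
      if (arg instanceof Error) return arg.stack ?? arg.message;
      try {
        // JSON.stringify returns undefined for values such as undefined or functions
        return JSON.stringify(arg) ?? String(arg);
      } catch {
        // Circular structures and similar cases fall back to String()
        return String(arg);
      }
    })
    .join(" ");
  // JSON.stringify escapes embedded newlines, so the result stays on one line
  return JSON.stringify({ level, message, timestamp: now.toISOString() });
}
```

A preload file could then replace console.log with something like (...args) => process.stdout.write(toJsonLine("info", args) + "\n"). The fragile part described above remains: Next.js prints its own output in formats that can change between releases.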

I never managed to get it working with the instrumentation hook, and with the telemetry from Vercel plus DD I always got undefined errors when it looked up _traceId on an object.

Even with this setup I am not sure whether I can see any spans; I need to check more.

For sourcemaps, I am thinking of generating them and uploading them from CI before removing them from the deployed app.

Has anyone found a better way that works with most DD features and can share their setup?

Apr 23 '24 08:04 radum

@radum your solution seems to be inspired by this blog post https://jake.tl/notes/2021-04-04-nextjs-preload-hack I've already tried this solution, and it works fine.

Going back to the root issue, I want to enable Datadog log injection with next.js without preloading any script.

I want to manage it directly from instrumentation.ts|js, which is designed for exactly this: https://nextjs.org/docs/app/building-your-application/optimizing/instrumentation

Apr 23 '24 10:04 Tarektouati

@Tarektouati I found that article while looking into log ingestion, but yeah, it helped confirm that doing it via instrumentation is never going to work :)

I would like to use the instrumentation hook, but either DD just doesn't work with it, or the hook still being experimental means it has all kinds of issues we don't see.

Apr 23 '24 10:04 radum

This is an incredibly large issue considering Next.js is the largest web framework today.

We rely heavily on server components, and none of the variations of the setups above work correctly.

Jul 04 '24 09:07 olafurns7

I'm in the same boat as everyone in this thread: Next.js 14 + App Router + RSC, set up dd-trace through instrumentation, enabled OTEL, etc., only to have my app die with HTTP 500 on incoming requests because of TypeError: Cannot read properties of undefined (reading '_traceId')

Preloading a script and other hacks like monkey-patching console - no, thank you. If Sentry can add itself into instrumentation in an elegant manner, so should dd-trace. My solution for now is opting out of DataDog for Next.js 14+ apps until there is a sane way of doing this.

Jul 24 '24 14:07 wlechowicz

This seems to work for me.

However, the graphql plugin will not load with a production build (next build).

The development build works as expected, so I'm not sure what I'm doing wrong:

instrumentation.ts:

export async function register() {
  if (
    process.env.NEXT_RUNTIME === 'nodejs' &&
    process.env.ENV &&
    process.env.SERVICE_NAME
  ) {
    const ddTrace = await import('dd-trace')

    const tracer = ddTrace.default.init({
      env: process.env.ENV,
      service: process.env.SERVICE_NAME,
      version: process.env.SERVICE_VERSION,
      sampleRate: 1,
      profiling: true,
      runtimeMetrics: true,
      logInjection: true,
      dogstatsd: {
        hostname: 'localhost',
        port: 8125,
      },
    })

    // Monitor GraphQL
    tracer.use('graphql', {
      enabled: true,
      measured: true,
    })

    // Monitor Next.js
    tracer.use('next', {
      enabled: true,
      measured: true,
    })

    // Monitor Winston Logger
    tracer.use('winston', {
      enabled: true,
    })

    const provider = new tracer.TracerProvider()

    provider.register()
  }
}

The graphql integration is not loading with the Next.js production build:

"integrations_loaded":["fetch","winston","http","net","child_process"]

Aug 09 '24 11:08 neilkumar-circle

In addition to the above, I have solved this by updating my start script: "start": "node -r dd-trace node_modules/.bin/next start",

This is obviously a workaround. I'm not sure whether a prod bundler issue is causing dd-trace to be imported later than it is in the dev build.

Aug 12 '24 09:08 neilkumar-circle

Hey everyone, here's our documentation on how to use Datadog with Next.js: https://docs.datadoghq.com/tracing/trace_collection/compatibility/nodejs/#complex-framework-usage

The approach in the blog post by Jake appears to be the same as what we suggest. You'll need to use the --require ... flag or the NODE_OPTIONS='--require ...' environment variable to instruct Node.js to load the tracer before Next.js is loaded.

I wouldn't necessarily refer to this as a "workaround" or a "hack". The --require flag should be pretty stable in Node.js so this solution shouldn't stop working one day. The problem with the Next.js "instrumentation.js|ts" approach (I'm assuming as I haven't dug through their code yet) is that by the point in time that Next.js executes that file, Next.js has already required/imported a bunch of files used by itself, and potentially subdependencies that also need to be instrumented. That approach is incompatible with the dd-trace approach where it needs to be loaded prior to those other modules/libraries being required/imported.
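
The load-order problem described above can be illustrated with a toy model. This is only a simplified sketch: ToyLoader, tracingHook, and the "toy-server" module are invented for illustration and are not how Node.js or dd-trace is implemented. The model assumes instrumentation is applied by a hook at module load time and that loaded modules are cached.

```typescript
// Toy model only: not the real Node.js loader and not dd-trace internals.
type Mod = { handle: (path: string) => string };
type Hook = (name: string, mod: Mod) => Mod;

class ToyLoader {
  private cache: Record<string, Mod> = {};
  private hooks: Hook[] = [];
  private factories: Record<string, () => Mod>;

  constructor(factories: Record<string, () => Mod>) {
    this.factories = factories;
  }

  // Hooks only see modules that are loaded after the hook is registered.
  addHook(hook: Hook): void {
    this.hooks.push(hook);
  }

  load(name: string): Mod {
    const cached = this.cache[name];
    if (cached) return cached; // already loaded: hooks do not run again
    const factory = this.factories[name];
    if (!factory) throw new Error(`unknown module: ${name}`);
    let mod = factory();
    for (const hook of this.hooks) mod = hook(name, mod);
    this.cache[name] = mod;
    return mod;
  }
}

const traced: string[] = [];

// Stand-in for a tracer: wraps the module and records each call.
const tracingHook: Hook = (name, mod) => ({
  handle: (path) => {
    traced.push(`${name} ${path}`);
    return mod.handle(path);
  },
});

function run(registerHookFirst: boolean): string[] {
  traced.length = 0;
  const loader = new ToyLoader({
    "toy-server": () => ({ handle: (path) => `200 ${path}` }),
  });
  if (registerHookFirst) loader.addHook(tracingHook); // like node --require
  const server = loader.load("toy-server"); // the framework loads its dependency
  if (!registerHookFirst) loader.addHook(tracingHook); // like a late instrumentation file
  server.handle("/about");
  return traced.slice();
}
```

In this model, run(true) records the request, while run(false) records nothing because the module was loaded and cached before the hook existed.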

That said, it sounds like a competing APM tool is able to work by using the instrumentation.js file, so there may be a way to support such an approach. Please create a helpdesk feature request (should be available from the GitHub new issue screen) as this will help prioritize such a feature.

Aug 20 '24 18:08 tlhunter

The approach in the blog post by Jake appears to be the same as what we suggest. You'll need to use the --require ... flag or the NODE_OPTIONS='--require ...' environment variable to instruct Node.js to load the tracer before Next.js is loaded.

@tlhunter thanks for the reply, the way I have it set up is: running node -r dd-trace ... to preload the whole package and then using the .config(...) method to configure the tracer within the nextjs instrumentation hook.

I think the docs refer to requiring initialize (node --require dd-trace/initialize). Would using dd-trace/initialize mean that we would have to configure the tracer with ENV variables or could we still use the approach that I outlined above?

Aug 20 '24 18:08 neilkumar-circle

@neilkumar-circle you should be fine using -r (alias for --require) which points to your own local module which configures and calls the dd-trace init() programmatically.

The dd-trace/initialize file in the tracer package is just a convenience to load the tracer using default configuration and which depends on env vars for config.
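
As a sketch of the "own local module" approach, the options passed to init() can be assembled in a small function. The buildTracerOptions name is hypothetical, and the ENV, SERVICE_NAME and SERVICE_VERSION variable names are simply the application-specific ones used earlier in this thread. The option keys are ones already passed to init() in the snippets above.

```typescript
// Subset of init() options that already appear in this thread.
interface TracerOptions {
  env?: string;
  service?: string;
  version: string;
  logInjection: boolean;
  runtimeMetrics: boolean;
}

// Hypothetical helper: maps application env vars to tracer options.
function buildTracerOptions(env: Record<string, string | undefined>): TracerOptions {
  return {
    env: env.ENV,
    service: env.SERVICE_NAME,
    version: env.SERVICE_VERSION ?? "unknown",
    logInjection: true,
    runtimeMetrics: true,
  };
}

// In the preload file itself (plain CommonJS, loaded with -r), the call would be:
//   const tracer = require("dd-trace");
//   tracer.init(buildTracerOptions(process.env));
```

Keeping the mapping in one function makes it easy to check the configuration without starting the tracer.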

Aug 20 '24 18:08 tlhunter

@tlhunter I think the docs need to better highlight the steps one needs to take for frameworks. Digging that section up is a huge pain. But thank you for sharing.

So if I have in my package.json node -r ./server-preload.js ./node_modules/.bin/next start and the server-preload.js does the init like this:

function setUpDatadogTracing() {
	const tracer = require('dd-trace');
	console.log('Setting up Datadog tracing');

	tracer.init({
		runtimeMetrics: true,
		logInjection: true,
		profiling: true,
	});
	tracer.use('next');
}

setUpDatadogTracing();

Is that the same as using dd-trace/initialize?

Aug 21 '24 05:08 radum

@radum the dd-trace/initialize file does some additional work as well, such as assisting with ESM loading. It's a bit of a multi-purpose helper file.

If your application does not already use -r dd-trace/initialize, then yes you should simply be able to use your -r ./server-preload.js solution. Your file will correctly initialize and configure the tracer early enough that it will work with Next.js.

If your application does already depend on using -r dd-trace/initialize then it wouldn't be equivalent to replace the -r flag with a different file like your -r ./server-preload.js approach as it would leave out some of the ESM niceties.

/cc @bengl who contributed the most to that part of the tracer.

Aug 21 '24 16:08 tlhunter

Thank you @tlhunter. Can you explain those ESM niceties in more detail, so we can understand whether it's worth switching?

Aug 22 '24 08:08 radum

For those who host on Vercel:

I was able to sort of work around this by sending OpenTelemetry traces to a separate server running dd-agent w/ an API key protected endpoint for submitting traces. This allows collecting traces even on Vercel hosted apps.

This can be done just by setting the env vars: OTEL_EXPORTER_OTLP_ENDPOINT and OTEL_EXPORTER_OTLP_HEADERS (with trace endpoint authorization headers)
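
For reference, a small sketch of how the two values might be assembled. The endpoint URL and the header name in the usage note are placeholders, not the actual setup. The sketch assumes the comma-separated key=value format that the OpenTelemetry exporter settings use for headers, with values percent-encoded.

```typescript
// Builds the two standard OpenTelemetry exporter variables.
// Header values are percent-encoded so that spaces and commas survive parsing.
function buildOtlpEnv(
  endpoint: string,
  headers: Record<string, string>
): Record<string, string> {
  const headerList = Object.keys(headers)
    .map((key) => `${key}=${encodeURIComponent(headers[key])}`)
    .join(",");
  return {
    OTEL_EXPORTER_OTLP_ENDPOINT: endpoint,
    OTEL_EXPORTER_OTLP_HEADERS: headerList,
  };
}
```

For example, buildOtlpEnv("https://otel.example.com", { "x-api-key": "secret" }) yields OTEL_EXPORTER_OTLP_HEADERS set to x-api-key=secret. Both values would then be added as environment variables in the Vercel project settings.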

Aug 22 '24 16:08 meyer9

@meyer9, could you please elaborate a bit on your setup? I've been trying to get our Vercel-hosted app to properly forward traces, and I haven't been having much luck.

What does your Vercel configuration look like? Are you using the experimental telemetry hook? Are you importing and using the @vercel/otel package?

separate server running dd-agent w/ an API key protected endpoint

Do you mean a separate API key that is specifically used to auth to this server rather than the DD API key used to send traces? If so, what did you use to set that up?

OTEL_EXPORTER_OTLP_HEADERS (with trace endpoint authorization headers)

I'm guessing this is where you're specifying that API key that you're sending along. Again, curious if it's separate from the DD API key.

Aug 22 '24 19:08 johnford2002
