
[@opentelemetry/instrumentation-fetch] Memory leak in NextJS app

Open cpatti97100 opened this issue 1 year ago • 8 comments

What happened?

Steps to Reproduce

Node.js 18 (also reproduced on Node.js 20), Next.js 13

Stress test by having Node perform a significant number of outgoing requests using fetch
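A rough sketch of that kind of load script (not from the original report; TARGET_URL and the batch sizes are placeholders, point it at whatever endpoint the app actually calls):

const TARGET_URL = process.env.TARGET_URL || 'http://localhost:3000/api/health'

async function stress(totalRequests = 100000, batchSize = 50) {
  for (let sent = 0; sent < totalRequests; sent += batchSize) {
    // Fire a batch of concurrent fetch() calls; swallow network errors so the loop keeps going.
    await Promise.all(
      Array.from({ length: batchSize }, () =>
        fetch(TARGET_URL).then((res) => res.arrayBuffer()).catch(() => {})
      )
    )
    if (sent % 5000 === 0) {
      console.log(`${sent} requests sent, heapUsed=${process.memoryUsage().heapUsed}`)
    }
  }
}

stress()

In our case, heapUsed keeps climbing during a run like this with the fetch instrumentation enabled and stays roughly flat without it.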

Expected Result

App memory usage remains roughly constant over time, as it did before introducing @opentelemetry/instrumentation-fetch

[screenshot: memory usage staying flat over time]

Actual Result

FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory

[screenshot: memory usage climbing until the heap limit is reached]

This started happening after introducing @opentelemetry/instrumentation-fetch.

Additional Details

Related work: https://github.com/open-telemetry/opentelemetry-js/pull/4063 https://github.com/open-telemetry/opentelemetry-js/issues/3413 https://github.com/open-telemetry/opentelemetry-js/issues/4333 https://github.com/open-telemetry/opentelemetry-js/pull/4393

Now that I think of it, can these 3 packages be active together?

const { FetchInstrumentation } = require('@opentelemetry/instrumentation-fetch')
const { HttpInstrumentation } = require('@opentelemetry/instrumentation-http')
const { NetInstrumentation } = require('@opentelemetry/instrumentation-net')

OpenTelemetry Setup Code

instrumentation.node.js

const process = require('process')
const { Resource } = require('@opentelemetry/resources')
const { NodeSDK } = require('@opentelemetry/sdk-node')
const { DnsInstrumentation } = require('@opentelemetry/instrumentation-dns')
const { FetchInstrumentation } = require('@opentelemetry/instrumentation-fetch')
const { HttpInstrumentation } = require('@opentelemetry/instrumentation-http')
const { NetInstrumentation } = require('@opentelemetry/instrumentation-net')
const { PinoInstrumentation } = require('@opentelemetry/instrumentation-pino')
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-grpc')

function initTelemetry(endpoint) {
  const sdk = new NodeSDK({
    instrumentations: [
      new DnsInstrumentation(),
      new FetchInstrumentation(),
      new HttpInstrumentation(),
      new NetInstrumentation(),
      new PinoInstrumentation(),
    ],
    resource: new Resource({
      'service.cluster': process.env.PROGRAM,
      'service.environment': process.env.ENVIRONMENT,
      'service.module': process.env.MODULE,
      'service.name': process.env.APP_NAME,
      'service.product': process.env.PRODUCT,
      'service.team': process.env.TEAM,
      'service.tenant': process.env.TENANT,
      'service.version': process.env.VERSION,
    }),
    traceExporter: new OTLPTraceExporter({
      url: endpoint,
    }),
  })

  sdk.start()

  process.on('SIGTERM', () => {
    sdk
      .shutdown()
      .then(() => console.log('Tracing terminated'))
      .catch((error) => console.log('Error terminating tracing', error))
      .finally(() => process.exit(0))
  })
}

if (process.env.OTEL_EXPORTER_OTLP_ENDPOINT) {
  initTelemetry(process.env.OTEL_EXPORTER_OTLP_ENDPOINT)
}

package.json

{ "dependencies": {
    "@opentelemetry/api": "^1.7.0",
    "@opentelemetry/exporter-trace-otlp-grpc": "^0.46.0",
    "@opentelemetry/instrumentation-dns": "^0.32.5",
    "@opentelemetry/instrumentation-fetch": "^0.46.0",
    "@opentelemetry/instrumentation-http": "^0.46.0",
    "@opentelemetry/instrumentation-net": "^0.32.5",
    "@opentelemetry/instrumentation-pino": "^0.34.5",
    "@opentelemetry/resources": "^1.19.0",
    "@opentelemetry/sdk-node": "^0.46.0",
}}

Relevant log output

No response

cpatti97100 avatar Feb 21 '24 15:02 cpatti97100

@cpatti97100 @opentelemetry/instrumentation-fetch is not intended for use with Node.js, does removing it get rid of the memory leak? 🤔

pichlermarc avatar Feb 21 '24 17:02 pichlermarc

It would be extremely useful if you could identify the objects that are being retained and causing the out-of-memory error, along with maybe a small repro repository so that anyone can reproduce this without constructing their own version of the test.
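One way to capture that information (a sketch using Node's built-in v8 module, not something specific to this issue) is to write heap snapshots while the stress test runs and compare them in Chrome DevTools:

const v8 = require('v8')

// Send `kill -USR2 <pid>` while the stress test is running to capture a snapshot.
// The resulting .heapsnapshot file can be opened in Chrome DevTools (Memory tab)
// to inspect which objects are accumulating between snapshots.
process.on('SIGUSR2', () => {
  const file = v8.writeHeapSnapshot()
  console.log(`Heap snapshot written to ${file}`)
})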

MSNev avatar Feb 21 '24 17:02 MSNev

@cpatti97100 @opentelemetry/instrumentation-fetch is not intended for use with Node.js, does removing it get rid of the memory leak? 🤔

it does, but here https://github.com/open-telemetry/opentelemetry-js/pull/4063 I read "The motivation for this change is so I can instrument my NodeJS applications using fetch() with the @opentelemetry/instrumentation-fetch package.". Also here https://github.com/open-telemetry/opentelemetry-js/issues/3413#issuecomment-1952340804 Node compatibility is mentioned. I guess I misunderstood something here or something is confusing :)

cpatti97100 avatar Feb 22 '24 09:02 cpatti97100

@cpatti97100 @opentelemetry/instrumentation-fetch is not intended for use with Node.js, does removing it get rid of the memory leak? 🤔

it does, but here #4063 I read "The motivation for this change is so I can instrument my NodeJS applications using fetch() with the @opentelemetry/instrumentation-fetch package.". Also here #3413 (comment) Node compatibility is mentioned. I guess I misunderstood something here or something is confusing :)

Yes, sorry for the confusion - any support added was purely experimental. There are differences in Node's fetch and the web version. We plan on adding an undici instrumentation that will add support for Node.js' fetch (which uses Undici in the background). That'll then also be properly tested with Node.js (which is not the case with the fetch instrumentation currently).

In the meantime we'll change this instrumentation to be a no-op for Node.js to avoid any further issues like this.

That being said, it'd still be helpful to have a reproducer as MSNev said. It could be that this is not only affecting Node.js but also the browser instrumentation.
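For reference, once the undici instrumentation is available, the setup above could be adapted roughly as follows (a sketch assuming @opentelemetry/instrumentation-undici exports an UndiciInstrumentation class; not official guidance):

const { NodeSDK } = require('@opentelemetry/sdk-node')
const { HttpInstrumentation } = require('@opentelemetry/instrumentation-http')
const { UndiciInstrumentation } = require('@opentelemetry/instrumentation-undici')

const sdk = new NodeSDK({
  instrumentations: [
    new HttpInstrumentation(),   // spans for the http/https modules
    new UndiciInstrumentation(), // spans for Node.js fetch(), which is built on undici
  ],
})

sdk.start()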

pichlermarc avatar Feb 22 '24 10:02 pichlermarc

Sorry, I am super busy right now :( but what I will do is try out the undici implementation, which is under development, and report whether the same behavior still happens.

cpatti97100 avatar Mar 05 '24 10:03 cpatti97100

@cpatti97100 we've published the undici instrumentation recently. Would you be able to test that one out? :slightly_smiling_face:

pichlermarc avatar Apr 17 '24 16:04 pichlermarc

@cpatti97100 did you try the new undici instrumentation out yet? We're looking into using it for the Sentry Node.js SDK, and implicitly the Sentry Next.js SDK.

lforst avatar Apr 18 '24 10:04 lforst

@cpatti97100 any updates?

david-luna avatar May 13 '24 13:05 david-luna

Hello everyone, we should deploy our app in prod soon with instrumentation-undici. I will let you know how things go 🤞🏼

cpatti97100 avatar Jun 03 '24 12:06 cpatti97100

@cpatti97100 any updates on this? :slightly_smiling_face:

pichlermarc avatar Jun 19 '24 16:06 pichlermarc

@cpatti97100 Closing this issue. If you still run into trouble with the new undici instrumentation, please open a new issue over at https://github.com/open-telemetry/opentelemetry-js-contrib

pichlermarc avatar Jun 26 '24 16:06 pichlermarc

Hi all, just to let you know that we have had @opentelemetry/[email protected] live in prod for over a week and it's working correctly.

cpatti97100 avatar Jul 01 '24 08:07 cpatti97100