opentelemetry-js-contrib
Circular reference in _scrubStatement causes maximum call stack exceeded in instrumentation-mongodb
I am using mongoose, which seems to add a __parentArray property to an object that is part of an array. This creates a circular reference, so _scrubStatement then throws:
@opentelemetry/instrumentation-mongodb Error running dbStatementSerializer hook RangeError: Maximum call stack size exceeded at /Users/james/Library/CloudStorage/Dropbox/growthbookzip/growthbook/node_modules/@opentelemetry/instrumentation-mongodb/build/src/instrumentation.js:581:65 at Array.map (<anonymous>) at MongoDBInstrumentation._scrubStatement
This was introduced in instrumentation-mongodb.
You probably need to add a circular reference check, or some hard-coded check for __parentArray, since mongoose is pretty common.
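To illustrate the failure mode (a minimal sketch, not the actual instrumentation code): a scrubber that recurses over every enumerable property loops forever on a mongoose-style back-reference and overflows the stack. The naiveScrub name and the plain-object stand-in for a mongoose DocumentArray are illustrative assumptions.

```typescript
// A naive recursive scrubber: replaces every literal with '?' and
// recurses into objects and arrays with no cycle protection.
function naiveScrub(value: unknown): unknown {
  if (Array.isArray(value)) return value.map((v) => naiveScrub(v));
  if (value !== null && typeof value === 'object') {
    const out: Record<string, unknown> = {};
    for (const [k, v] of Object.entries(value)) out[k] = naiveScrub(v);
    return out;
  }
  return '?';
}

// Mongoose-style circular structure: an array element pointing back
// at its containing array via __parentArray (modeled here as a plain
// enumerable property).
const phases: any[] = [];
const phase: any = { name: 'draft', __parentArray: phases };
phases.push(phase);

let caught = '';
try {
  naiveScrub({ phases });
} catch (e) {
  caught = (e as Error).constructor.name;
}
console.log(caught); // RangeError
```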
What version of OpenTelemetry are you using?
"@opentelemetry/api": "^1.9.0",
"@opentelemetry/auto-instrumentations-node": "^0.47.1",
"@opentelemetry/exporter-metrics-otlp-proto": "^0.52.0",
"@opentelemetry/sdk-metrics": "^1.25.0"
What version of Node are you using?
v20.12.2
What did you do?
Ran some DB query that got automatically scrubbed.
My mongoose schema has a property phases that is defined as:
phases: [
{
_id: false,
dateStarted: Date,
dateEnded: Date,
phase: String,
name: String,
reason: String,
coverage: Number,
condition: String,
savedGroups: [
{
_id: false,
ids: [String],
match: String,
},
],
prerequisites: [
{
_id: false,
id: String,
condition: String,
},
],
namespace: {},
seed: String,
variationWeights: [Number],
groups: [String],
},
],
It probably was a create statement.
What did you expect to see?
No Errors.
What did you see instead?
The maximum call stack error above.
Additional context
None
@Fox32 Pinging you since I think you introduced it in your PR a week ago. Pinging @david-luna as well as you may have context here having approved it.
Thanks for your help!
Trying to wrap my head around what is happening. Can you provide an example? I wonder how MongoDB itself is handling that when serializing the statement.
@david-luna Yes. Sorry. I was trying at one point to make a simpler app to reproduce it, but api_1.context.active() was not being set for it, so I failed.
But yes it is reproducible if you checkout the ji/monitor_job_queue branch of the growthbook/growthbook repo.
Then get Mongo up and running with:
docker run -d -p 27017:27017 --name mongo \
-e MONGO_INITDB_ROOT_USERNAME=root \
-e MONGO_INITDB_ROOT_PASSWORD=password \
-v ~/gb_mongo_data/:/data/db \
mongo
Then run yarn, yarn build, and yarn start:with-tracing.
Go to localhost:3000
Create a new org
Go to http://localhost:3000/getstarted/experiment-guide
click "View sample experiment"
Click the three dots next to "stop experiment" in the upper right and click 'edit status' and change the dropdown to 'draft'.
Click "start experiment".
Click "start anyway"
Then in the logs you should see the "Maximum call stack size exceeded" error.
For what it's worth here is code within Mongoose that handles this for the spread operator on docs: https://github.com/Automattic/mongoose/blob/e4462c6c80b8b395187b382f7dcb3c398293cd78/lib/helpers/document/handleSpreadDoc.js#L5
_scrubStatement does not use the spread operator, but perhaps similar code is necessary.
Is there a workaround on this?
Is there a workaround on this?
@alko89 We have a different issue with the newly introduced _scrubStatement and using a workaround by providing our own dbStatementSerializer which would no longer call that method.
Some thoughts on this after poking around a bit. Caveat: I don't use MongoDB or really know it that well.
- The semconv for MongoDB spans (https://github.com/open-telemetry/semantic-conventions/blob/main/docs/database/mongodb.md) doesn't cover db.query.text (the new name for db.statement). The only reference is an example span where db.query.text is not set.
- The general semconv guidance for database spans says to do what the scrub function is doing (https://github.com/open-telemetry/semantic-conventions/blob/main/docs/database/database-spans.md): "The db.query.text SHOULD be collected by default only if there is sanitization that excludes sensitive information. Sanitization SHOULD replace all literals with a placeholder value. Such literals include, but are not limited to, String, Numeric, Date and Time, Boolean, Interval, Binary, and Hexadecimal literals."
- It would, I suppose, be nice to skip document properties added by mongoose: https://github.com/Automattic/mongoose/blob/b34aba65bd64540e330665477f542eb79c877909/lib/helpers/document/compile.js#L213-L235 However, it would be good to understand why the scrub function was iterating over __parentArray, because the comment for that code in mongoose suggests it makes those property descriptors enumerable = false.
- A depth-limiter on the scrubbed document sounds good. DD's impl limits to a depth of 10, FWIW. (If I read it correctly, DD's impl also skips arrays, so collection.aggregate(...) usage wouldn't include a db statement.)
- It would be great to get a real code example (for tests), possibly from https://github.com/open-telemetry/opentelemetry-js-contrib/issues/2516, to test the scenario with large binary data there. Was it a Buffer? A Uint8Array? Some other type?
- Perhaps the scrub could have a guard on instanceof Uint8Array (which also catches Buffer instances, at least in Node 18), as is used here: https://github.com/open-telemetry/opentelemetry-js/blob/e97cc2ee8efa176232dc701462a49e9a10e95dfc/experimental/packages/otlp-transformer/src/common/internal.ts#L48
- These document objects that we're scrubbing feel like the wild west, based on various examples I've seen.
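The Uint8Array guard suggested above could look like the following (a hedged sketch; scrubValue is an illustrative name, not the instrumentation's actual API). Node's Buffer subclasses Uint8Array, so a single instanceof check covers both:

```typescript
// Sketch of the suggested guard: binary payloads are replaced
// wholesale with the placeholder instead of being walked byte by byte.
function scrubValue(value: unknown): unknown {
  if (value instanceof Uint8Array) return '?';
  // ...the rest of the scrub logic (objects, arrays, literals) would go here...
  return value;
}

console.log(scrubValue(Buffer.from('secret')));     // '?'
console.log(scrubValue(new Uint8Array([1, 2, 3]))); // '?'
```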
Any updates on this? I'm getting the exact same error when going through the _scrubStatement method. If I disable ApplicationInsights, it works.
@opentelemetry/instrumentation-mongodb Error running "dbStatementSerializer" hook RangeError: Maximum call stack size exceeded at Array.values (
Unassigned @dyladan as he likely won't have time to work on this. If anyone has time, please pick this up, as it is a high-priority bug.
I can take it. Is a depth-limiter approach with a default of 10 good enough?
I can take it.
@rndD thank you 🙌
Is a depth-limiter approach with a default of 10 good enough?
Yes, I think that's a good approach.
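A minimal sketch of the depth-limiter idea under discussion (the names and the exact limit are assumptions, not the final PR): recursion stops after a fixed depth and replaces anything deeper with the placeholder, so a circular document can no longer overflow the stack.

```typescript
// Illustrative depth limit, matching the default of 10 discussed above.
const DEPTH_LIMIT = 10;

function scrubWithDepthLimit(value: unknown, depth = 0): unknown {
  if (depth > DEPTH_LIMIT) return '?'; // cut off runaway recursion
  if (Array.isArray(value)) {
    return value.map((v) => scrubWithDepthLimit(v, depth + 1));
  }
  if (value !== null && typeof value === 'object') {
    const out: Record<string, unknown> = {};
    for (const [k, v] of Object.entries(value)) {
      out[k] = scrubWithDepthLimit(v, depth + 1);
    }
    return out;
  }
  return '?'; // literals replaced per the semconv sanitization advice
}

// A circular document now serializes instead of throwing:
const arr: any[] = [];
const doc: any = { name: 'x', __parentArray: arr };
arr.push(doc);
const result = scrubWithDepthLimit({ phases: arr });
console.log(JSON.stringify(result).includes('?')); // true
```

Note that with a pure depth limit, a circular chain shallower than the limit still gets unrolled up to DEPTH_LIMIT levels; it just can no longer recurse unboundedly.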
Hey, team! I see @rndD has already started working on this issue.
If it's not too late, I would suggest considering a different approach that we just implemented in our own prod:
type Replacer = (key: string, value: unknown) => unknown;
const getReplacer = (): Replacer => {
// WeakSet is used for tracking seen objects to avoid memory leaks
const seen = new WeakSet();
return (_key, value) => {
// undefined, boolean, number, bigint, string, symbol, function || null
if (typeof value !== 'object' || !value) return '?';
// objects (including arrays)
if (seen.has(value)) return '[Circular]';
seen.add(value);
return value;
};
};
export const dbStatementSerializer = (cmd: unknown): string => JSON.stringify(cmd, getReplacer());
This approach effectively handles circular references regardless of their nature and depth.
Additionally, according to our tests, it is 10-60% faster than the current implementation, depending on the input data: it iterates over the input keys only once, while the current implementation does so twice (first to sanitize the values, then to serialize the JSON). It also does not create intermediate objects, so it puts less load on the GC.
Flaw: the code above will falsely mark repeated non-circular references as circular. For example:
const shared = { x: 1 };
const obj = { a: shared, b: shared };
dbStatementSerializer(obj); // {"a":{"x":"?"},"b":"[Circular]"}
In our prod, we've accepted this drawback as the price of high performance (it's possible to overcome this limitation, but it would be more expensive). However, the community here may feel otherwise, so it can be improved.
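For reference, one way to realize the "possible but more expensive" variant: track only the current ancestor chain rather than every object ever seen, and remove each object from the set once its subtree is serialized. The serialize name is hypothetical; this sketch trades back some of the GC advantage because it builds intermediate objects.

```typescript
// Ancestor-chain tracking: an object is flagged as circular only if it
// appears among its own ancestors, so shared siblings serialize normally.
function serialize(value: unknown, ancestors = new Set<object>()): unknown {
  if (typeof value !== 'object' || value === null) return '?';
  if (ancestors.has(value)) return '[Circular]';
  ancestors.add(value);
  const out = Array.isArray(value)
    ? value.map((v) => serialize(v, ancestors))
    : Object.fromEntries(
        Object.entries(value).map(([k, v]) => [k, serialize(v, ancestors)])
      );
  ancestors.delete(value); // subtree done; later shared references are fine
  return out;
}

const shared = { x: 1 };
const out = JSON.stringify(serialize({ a: shared, b: shared }));
console.log(out); // {"a":{"x":"?"},"b":{"x":"?"}}
```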
@masontikhonov thanks for the snippet; please check PR #3095.
@pichlermarc we have 3 approvals on the PR, but none from the OTel team =). Can you review it?