opentelemetry-js-contrib
Circular reference in _scrubStatement causes maximum call stack exceeded in instrumentation-mongodb
I am using mongoose, which seems to add a __parentArray property to an object that is part of an array. This creates a circular reference, so _scrubStatement then throws:
@opentelemetry/instrumentation-mongodb Error running dbStatementSerializer hook RangeError: Maximum call stack size exceeded at /Users/james/Library/CloudStorage/Dropbox/growthbookzip/growthbook/node_modules/@opentelemetry/instrumentation-mongodb/build/src/instrumentation.js:581:65 at Array.map (<anonymous>) at MongoDBInstrumentation._scrubStatement
This was introduced in instrumentation-mongodb.
You probably need to add a circular reference check, or some hard-coded check for __parentArray, since mongoose is pretty common.
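To illustrate the failure mode (a minimal sketch, not the actual instrumentation code): a scrubber that recurses over every enumerable property loops forever on a mongoose-style back-reference and overflows the stack. The naiveScrub name and the plain-object stand-in for a mongoose DocumentArray are illustrative assumptions.

```typescript
// A naive recursive scrubber: replaces every literal with '?' and
// recurses into objects and arrays with no cycle protection.
function naiveScrub(value: unknown): unknown {
  if (Array.isArray(value)) return value.map((v) => naiveScrub(v));
  if (value !== null && typeof value === 'object') {
    const out: Record<string, unknown> = {};
    for (const [k, v] of Object.entries(value)) out[k] = naiveScrub(v);
    return out;
  }
  return '?';
}

// Mongoose-style circular structure: an array element pointing back
// at its containing array via __parentArray (modeled here as a plain
// enumerable property).
const phases: any[] = [];
const phase: any = { name: 'draft', __parentArray: phases };
phases.push(phase);

let caught = '';
try {
  naiveScrub({ phases });
} catch (e) {
  caught = (e as Error).constructor.name;
}
console.log(caught); // RangeError
```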
What version of OpenTelemetry are you using?
"@opentelemetry/api": "^1.9.0",
"@opentelemetry/auto-instrumentations-node": "^0.47.1",
"@opentelemetry/exporter-metrics-otlp-proto": "^0.52.0",
"@opentelemetry/sdk-metrics": "^1.25.0"
What version of Node are you using?
v20.12.2
What did you do?
Ran some DB query that got automatically scrubbed.
My mongoose schema has a property phases that is defined as:
phases: [
{
_id: false,
dateStarted: Date,
dateEnded: Date,
phase: String,
name: String,
reason: String,
coverage: Number,
condition: String,
savedGroups: [
{
_id: false,
ids: [String],
match: String,
},
],
prerequisites: [
{
_id: false,
id: String,
condition: String,
},
],
namespace: {},
seed: String,
variationWeights: [Number],
groups: [String],
},
],
It probably was a create statement.
What did you expect to see?
No Errors.
What did you see instead?
The maximum call stack error above.
Additional context
None
@Fox32 Pinging you since I think you introduced it in your PR a week ago. Pinging @david-luna as well as you may have context here having approved it.
Thanks for your help!
Trying to wrap my head around what is happening. Can you provide an example? I wonder how MongoDB itself is handling that when serializing the statement.
@david-luna Yes. Sorry. I was trying at one point to make a simpler app to reproduce it, but api_1.context.active() was not being set for it, so I failed.
But yes it is reproducible if you checkout the ji/monitor_job_queue branch of the growthbook/growthbook repo.
Then get Mongo up and running with:
docker run -d -p 27017:27017 --name mongo \
-e MONGO_INITDB_ROOT_USERNAME=root \
-e MONGO_INITDB_ROOT_PASSWORD=password \
-v ~/gb_mongo_data/:/data/db \
mongo
Then run yarn, yarn build, and yarn start:with-tracing.
Go to localhost:3000
Create a new org
Go to http://localhost:3000/getstarted/experiment-guide
click "View sample experiment"
Click the three dots next to "stop experiment" in the upper right and click 'edit status' and change the dropdown to 'draft'.
Click "start experiment".
Click "start anyway"
Then in the logs you should see the "Maximum call stack size exceeded" error.
For what it's worth here is code within Mongoose that handles this for the spread operator on docs: https://github.com/Automattic/mongoose/blob/e4462c6c80b8b395187b382f7dcb3c398293cd78/lib/helpers/document/handleSpreadDoc.js#L5
_scrubStatement does not use the spread operator, but perhaps similar code is necessary.
Is there a workaround on this?
Is there a workaround on this?
@alko89 We have a different issue with the newly introduced _scrubStatement and using a workaround by providing our own dbStatementSerializer which would no longer call that method.
Some thoughts on this after poking around a bit. Caveat: I don't use MongoDB or really know it that well.
- The semconv for MongoDB spans (https://github.com/open-telemetry/semantic-conventions/blob/main/docs/database/mongodb.md) doesn't cover db.query.text (the new name for db.statement). The only reference is an example span where db.query.text is not set.
- The general semconv guidance for database spans says to do what the scrub function is doing (https://github.com/open-telemetry/semantic-conventions/blob/main/docs/database/database-spans.md): "The db.query.text SHOULD be collected by default only if there is sanitization that excludes sensitive information. Sanitization SHOULD replace all literals with a placeholder value. Such literals include, but are not limited to, String, Numeric, Date and Time, Boolean, Interval, Binary, and Hexadecimal literals."
- It would, I suppose, be nice to skip document properties added by mongoose: https://github.com/Automattic/mongoose/blob/b34aba65bd64540e330665477f542eb79c877909/lib/helpers/document/compile.js#L213-L235 However, it would be good to understand why the scrub function was iterating over __parentArray, because the comment for that code in mongoose suggests it makes those property descriptors enumerable = false.
- A depth-limiter on the scrubbed document sounds good. DD's impl limits to a depth of 10, FWIW. (If I read it correctly, DD's impl also skips arrays, so collection.aggregate(...) usage wouldn't include a db statement.)
- It would be great to get a real code example (for tests), possibly from https://github.com/open-telemetry/opentelemetry-js-contrib/issues/2516, to test the scenario with large binary data there. Was it a Buffer? A Uint8Array? Some other type?
- Perhaps the scrub could have a guard on instanceof Uint8Array (which also catches Buffer instances, at least in Node 18), as is used here: https://github.com/open-telemetry/opentelemetry-js/blob/e97cc2ee8efa176232dc701462a49e9a10e95dfc/experimental/packages/otlp-transformer/src/common/internal.ts#L48
- These document objects that we're scrubbing feel like the wild west, based on various examples I've seen.
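The Uint8Array guard suggested above could look like the following (a hedged sketch; scrubValue is an illustrative name, not the instrumentation's actual API). Node's Buffer subclasses Uint8Array, so a single instanceof check covers both:

```typescript
// Sketch of the suggested guard: binary payloads are replaced
// wholesale with the placeholder instead of being walked byte by byte.
function scrubValue(value: unknown): unknown {
  if (value instanceof Uint8Array) return '?';
  // ...the rest of the scrub logic (objects, arrays, literals) would go here...
  return value;
}

console.log(scrubValue(Buffer.from('secret')));     // '?'
console.log(scrubValue(new Uint8Array([1, 2, 3]))); // '?'
```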
Any updates on this? I'm getting the exact same error when going through the _scrubStatement method. If I disable ApplicationInsights, it works.
@opentelemetry/instrumentation-mongodb Error running "dbStatementSerializer" hook RangeError: Maximum call stack size exceeded at Array.values (
Unassigned @dyladan as he likely won't have time to work on this. If anyone has time, please pick this up, as it is a high-priority bug.
I can take it. Is a depth-limiter approach with a default of 10 good enough?
I can take it.
@rndD thank you 🙌
Is a depth-limiter approach with a default of 10 good enough?
Yes, I think that's a good approach.
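A minimal sketch of the depth-limiter idea under discussion (the names and the exact limit are assumptions, not the final PR): recursion stops after a fixed depth and replaces anything deeper with the placeholder, so a circular document can no longer overflow the stack.

```typescript
// Illustrative depth limit, matching the default of 10 discussed above.
const DEPTH_LIMIT = 10;

function scrubWithDepthLimit(value: unknown, depth = 0): unknown {
  if (depth > DEPTH_LIMIT) return '?'; // cut off runaway recursion
  if (Array.isArray(value)) {
    return value.map((v) => scrubWithDepthLimit(v, depth + 1));
  }
  if (value !== null && typeof value === 'object') {
    const out: Record<string, unknown> = {};
    for (const [k, v] of Object.entries(value)) {
      out[k] = scrubWithDepthLimit(v, depth + 1);
    }
    return out;
  }
  return '?'; // literals replaced per the semconv sanitization advice
}

// A circular document now serializes instead of throwing:
const arr: any[] = [];
const doc: any = { name: 'x', __parentArray: arr };
arr.push(doc);
const result = scrubWithDepthLimit({ phases: arr });
console.log(JSON.stringify(result).includes('?')); // true
```

Note that with a pure depth limit, a circular chain shallower than the limit still gets unrolled up to DEPTH_LIMIT levels; it just can no longer recurse unboundedly.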
Hey, team! I see @rndD has already started working on this issue.
If it's not too late, I would suggest considering a different approach that we just implemented in our own prod:
type Replacer = (key: string, value: unknown) => unknown;
const getReplacer = (): Replacer => {
// WeakSet is used for tracking seen objects to avoid memory leaks
const seen = new WeakSet();
return (_key, value) => {
// undefined, boolean, number, bigint, string, symbol, function || null
if (typeof value !== 'object' || !value) return '?';
// objects (including arrays)
if (seen.has(value)) return '[Circular]';
seen.add(value);
return value;
};
};
export const dbStatementSerializer = (cmd: unknown): string => JSON.stringify(cmd, getReplacer());
This approach effectively handles circular references regardless of their nature and depth.
Additionally, according to our tests, it is 10-60% faster than the current implementation, depending on the input data: it iterates over the input keys only once, while the current implementation does so twice (first to sanitize the values, then to serialize the JSON). It also does not create intermediate objects, so it puts less load on the GC.
Flaw: the code above will falsely mark repeated non-circular references as circular. For example:
const shared = { x: 1 };
const obj = { a: shared, b: shared };
dbStatementSerializer(obj); // {"a":{"x":"?"},"b":"[Circular]"}
In our prod, we've accepted this drawback as the price of high performance (it's possible to overcome this limitation, but it would be more expensive). However, the community here may feel otherwise, so it can be improved.
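For reference, one way to realize the "possible but more expensive" variant: track only the current ancestor chain rather than every object ever seen, and remove each object from the set once its subtree is serialized. The serialize name is hypothetical; this sketch trades back some of the GC advantage because it builds intermediate objects.

```typescript
// Ancestor-chain tracking: an object is flagged as circular only if it
// appears among its own ancestors, so shared siblings serialize normally.
function serialize(value: unknown, ancestors = new Set<object>()): unknown {
  if (typeof value !== 'object' || value === null) return '?';
  if (ancestors.has(value)) return '[Circular]';
  ancestors.add(value);
  const out = Array.isArray(value)
    ? value.map((v) => serialize(v, ancestors))
    : Object.fromEntries(
        Object.entries(value).map(([k, v]) => [k, serialize(v, ancestors)])
      );
  ancestors.delete(value); // subtree done; later shared references are fine
  return out;
}

const shared = { x: 1 };
const out = JSON.stringify(serialize({ a: shared, b: shared }));
console.log(out); // {"a":{"x":"?"},"b":{"x":"?"}}
```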
@masontikhonov thanks for the snippet; please check PR #3095.
@pichlermarc we have 3 approvals on the PR, but none from the OTel team =). Can you review it?