dd-trace-js
dd-trace-js copied to clipboard
fix child_process propagation issues
What does this PR do?
- disables the
enterargument tostartSpan()which then disables thestorage.enterWith({ ...store, span })call
Motivation
- otherwise it appears that sometimes unrelated spans become child spans of a child process operation
Overall package size
Self size: 6.98 MB Deduped: 58.21 MB No deduping: 58.49 MB
Dependency sizes
| name | version | self size | total size | |------|---------|-----------|------------| | @datadog/native-appsec | 8.0.1 | 15.59 MB | 15.6 MB | | @datadog/native-iast-taint-tracking | 3.1.0 | 12.27 MB | 12.28 MB | | @datadog/pprof | 5.3.0 | 9.85 MB | 10.22 MB | | protobufjs | 7.2.5 | 2.77 MB | 5.16 MB | | @datadog/native-iast-rewriter | 2.4.1 | 2.14 MB | 2.23 MB | | @opentelemetry/core | 1.14.0 | 872.87 kB | 1.47 MB | | @datadog/native-metrics | 2.0.0 | 898.77 kB | 1.3 MB | | @opentelemetry/api | 1.8.0 | 1.21 MB | 1.21 MB | | import-in-the-middle | 1.8.1 | 71.67 kB | 785.15 kB | | msgpack-lite | 0.1.26 | 201.16 kB | 281.59 kB | | opentracing | 0.14.7 | 194.81 kB | 194.81 kB | | pprof-format | 2.1.0 | 111.69 kB | 111.69 kB | | @datadog/sketches-js | 2.1.0 | 109.9 kB | 109.9 kB | | semver | 7.6.3 | 95.82 kB | 95.82 kB | | lodash.sortby | 4.7.0 | 75.76 kB | 75.76 kB | | lru-cache | 7.14.0 | 74.95 kB | 74.95 kB | | ignore | 5.3.1 | 51.46 kB | 51.46 kB | | int64-buffer | 0.1.10 | 49.18 kB | 49.18 kB | | shell-quote | 1.8.1 | 44.96 kB | 44.96 kB | | istanbul-lib-coverage | 3.2.0 | 29.34 kB | 29.34 kB | | tlhunter-sorted-set | 0.1.0 | 24.94 kB | 24.94 kB | | limiter | 1.1.5 | 23.17 kB | 23.17 kB | | dc-polyfill | 0.1.4 | 23.1 kB | 23.1 kB | | retry | 0.13.1 | 18.85 kB | 18.85 kB | | jest-docblock | 29.7.0 | 8.99 kB | 12.76 kB | | crypto-randomuuid | 1.0.0 | 11.18 kB | 11.18 kB | | path-to-regexp | 0.1.7 | 6.78 kB | 6.78 kB | | koalas | 1.0.2 | 6.47 kB | 6.47 kB | | module-details-from-path | 1.0.3 | 4.47 kB | 4.47 kB |🤖 This report was automatically generated by heaviest-objects-in-the-universe
Benchmarks
Benchmark execution time: 2024-08-26 19:56:59
Comparing candidate commit 43b04898ff25fa45e41cfd1ea1c711a53f82ff87 in PR branch bengl/fix-child-process with baseline commit 1f2914f9fcb241c3d11b554826957724c3ba8d16 in branch master.
Found 0 performance improvements and 0 performance regressions! Performance is the same for 260 metrics, 6 unstable metrics.
In master when I run a basic app with a child process call followed by an HTTP call:
clear ; DD_TRACE_DEBUG=true node --loader=../loader-hook.mjs app.mjs 2>&1 | grep Encoding
I get this output:
[{"trace_id":"44d9b5b7977dd091","span_id":"44d9b5b7977dd091","parent_id":"0000000000000000","name":"command_execution","resource":"sh","error":0,"meta":{"_dd.p.tid":"66ce0c8700000000","_dd.p.dm":"-0","service":"chproc","version":"1.0.0","runtime-id":"03086a11-772c-4693-8e2c-a945eb628549","component":"subprocess","cmd.shell":"ls","cmd.exit_code":"0","language":"javascript"},"metrics":{"_dd.agent_psr":1,"_dd.top_level":1,"process_id":62408,"_sampling_priority_v1":1},"start":1724779655856016600,"duration":12429199,"links":[],"service":"chproc","type":"system"}]
[{"trace_id":"44d9b5b7977dd091","span_id":"4996e7d34bcb4c82","parent_id":"44d9b5b7977dd091","name":"http.request","resource":"GET","error":0,"meta":{"_dd.p.tid":"66ce0c8700000000","_dd.p.dm":"-0","service":"chproc","version":"1.0.0","runtime-id":"03086a11-772c-4693-8e2c-a945eb628549","component":"fetch","span.kind":"client","http.method":"GET","http.url":"http://thomashunter.name/","out.host":"thomashunter.name","http.status_code":"200","language":"javascript"},"metrics":{"_dd.measured":1,"process_id":62408,"_sampling_priority_v1":1},"start":1724779655871527000,"duration":72391357,"links":[],"service":"chproc","type":"http"}]
However with this one liner change applied I get this output:
[{"trace_id":"18e62dd21fb82128","span_id":"18e62dd21fb82128","parent_id":"0000000000000000","name":"http.request","resource":"GET","error":0,"meta":{"_dd.p.tid":"66ce0ccf00000000","_dd.p.dm":"-0","service":"chproc","version":"1.0.0","runtime-id":"e1a23175-eec4-4a47-96d6-358ecc4d5112","component":"fetch","span.kind":"client","http.method":"GET","http.url":"http://thomashunter.name/","out.host":"thomashunter.name","http.status_code":"200","language":"javascript"},"metrics":{"_dd.agent_psr":1,"_dd.top_level":1,"_dd.measured":1,"process_id":62584,"_sampling_priority_v1":1},"start":1724779727588001800,"duration":67388428,"links":[],"service":"chproc","type":"http"}]
So it seems that the change in this PR entirely prevents the child process calls from being recorded.
All of these tests pass with the current implementation while other tests fail with the change. Due to that and the age of the draft, I'll close this. I guess we should look into this separately again.