dd-trace-js icon indicating copy to clipboard operation
dd-trace-js copied to clipboard

fix child_process propagation issues

Open bengl opened this issue 1 year ago • 3 comments

What does this PR do?

  • disables the enter argument to startSpan() which then disables the storage.enterWith({ ...store, span }) call

Motivation

  • otherwise it appears that sometimes unrelated spans become child spans of a child process operation

bengl avatar Jul 19 '24 21:07 bengl

Overall package size

Self size: 6.98 MB Deduped: 58.21 MB No deduping: 58.49 MB

Dependency sizes | name | version | self size | total size | |------|---------|-----------|------------| | @datadog/native-appsec | 8.0.1 | 15.59 MB | 15.6 MB | | @datadog/native-iast-taint-tracking | 3.1.0 | 12.27 MB | 12.28 MB | | @datadog/pprof | 5.3.0 | 9.85 MB | 10.22 MB | | protobufjs | 7.2.5 | 2.77 MB | 5.16 MB | | @datadog/native-iast-rewriter | 2.4.1 | 2.14 MB | 2.23 MB | | @opentelemetry/core | 1.14.0 | 872.87 kB | 1.47 MB | | @datadog/native-metrics | 2.0.0 | 898.77 kB | 1.3 MB | | @opentelemetry/api | 1.8.0 | 1.21 MB | 1.21 MB | | import-in-the-middle | 1.8.1 | 71.67 kB | 785.15 kB | | msgpack-lite | 0.1.26 | 201.16 kB | 281.59 kB | | opentracing | 0.14.7 | 194.81 kB | 194.81 kB | | pprof-format | 2.1.0 | 111.69 kB | 111.69 kB | | @datadog/sketches-js | 2.1.0 | 109.9 kB | 109.9 kB | | semver | 7.6.3 | 95.82 kB | 95.82 kB | | lodash.sortby | 4.7.0 | 75.76 kB | 75.76 kB | | lru-cache | 7.14.0 | 74.95 kB | 74.95 kB | | ignore | 5.3.1 | 51.46 kB | 51.46 kB | | int64-buffer | 0.1.10 | 49.18 kB | 49.18 kB | | shell-quote | 1.8.1 | 44.96 kB | 44.96 kB | | istanbul-lib-coverage | 3.2.0 | 29.34 kB | 29.34 kB | | tlhunter-sorted-set | 0.1.0 | 24.94 kB | 24.94 kB | | limiter | 1.1.5 | 23.17 kB | 23.17 kB | | dc-polyfill | 0.1.4 | 23.1 kB | 23.1 kB | | retry | 0.13.1 | 18.85 kB | 18.85 kB | | jest-docblock | 29.7.0 | 8.99 kB | 12.76 kB | | crypto-randomuuid | 1.0.0 | 11.18 kB | 11.18 kB | | path-to-regexp | 0.1.7 | 6.78 kB | 6.78 kB | | koalas | 1.0.2 | 6.47 kB | 6.47 kB | | module-details-from-path | 1.0.3 | 4.47 kB | 4.47 kB |

🤖 This report was automatically generated by heaviest-objects-in-the-universe

github-actions[bot] avatar Jul 19 '24 21:07 github-actions[bot]

Benchmarks

Benchmark execution time: 2024-08-26 19:56:59

Comparing candidate commit 43b04898ff25fa45e41cfd1ea1c711a53f82ff87 in PR branch bengl/fix-child-process with baseline commit 1f2914f9fcb241c3d11b554826957724c3ba8d16 in branch master.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 260 metrics, 6 unstable metrics.

pr-commenter[bot] avatar Jul 19 '24 21:07 pr-commenter[bot]

In master when I run a basic app with a child process call followed by an HTTP call:

clear ; DD_TRACE_DEBUG=true node --loader=../loader-hook.mjs app.mjs 2>&1 | grep Encoding

I get this output:

[{"trace_id":"44d9b5b7977dd091","span_id":"44d9b5b7977dd091","parent_id":"0000000000000000","name":"command_execution","resource":"sh","error":0,"meta":{"_dd.p.tid":"66ce0c8700000000","_dd.p.dm":"-0","service":"chproc","version":"1.0.0","runtime-id":"03086a11-772c-4693-8e2c-a945eb628549","component":"subprocess","cmd.shell":"ls","cmd.exit_code":"0","language":"javascript"},"metrics":{"_dd.agent_psr":1,"_dd.top_level":1,"process_id":62408,"_sampling_priority_v1":1},"start":1724779655856016600,"duration":12429199,"links":[],"service":"chproc","type":"system"}]
[{"trace_id":"44d9b5b7977dd091","span_id":"4996e7d34bcb4c82","parent_id":"44d9b5b7977dd091","name":"http.request","resource":"GET","error":0,"meta":{"_dd.p.tid":"66ce0c8700000000","_dd.p.dm":"-0","service":"chproc","version":"1.0.0","runtime-id":"03086a11-772c-4693-8e2c-a945eb628549","component":"fetch","span.kind":"client","http.method":"GET","http.url":"http://thomashunter.name/","out.host":"thomashunter.name","http.status_code":"200","language":"javascript"},"metrics":{"_dd.measured":1,"process_id":62408,"_sampling_priority_v1":1},"start":1724779655871527000,"duration":72391357,"links":[],"service":"chproc","type":"http"}]

However with this one liner change applied I get this output:

[{"trace_id":"18e62dd21fb82128","span_id":"18e62dd21fb82128","parent_id":"0000000000000000","name":"http.request","resource":"GET","error":0,"meta":{"_dd.p.tid":"66ce0ccf00000000","_dd.p.dm":"-0","service":"chproc","version":"1.0.0","runtime-id":"e1a23175-eec4-4a47-96d6-358ecc4d5112","component":"fetch","span.kind":"client","http.method":"GET","http.url":"http://thomashunter.name/","out.host":"thomashunter.name","http.status_code":"200","language":"javascript"},"metrics":{"_dd.agent_psr":1,"_dd.top_level":1,"_dd.measured":1,"process_id":62584,"_sampling_priority_v1":1},"start":1724779727588001800,"duration":67388428,"links":[],"service":"chproc","type":"http"}]

So it seems that the change in this PR entirely prevents the child process calls from being recorded.

tlhunter avatar Aug 27 '24 17:08 tlhunter

All of these tests pass with the current implementation while other tests fail with the change. Due to that and the age of the draft, I'll close this. I guess we should look into this separately again.

BridgeAR avatar Jul 07 '25 23:07 BridgeAR