dd-trace-js
speed up shimmer by about 50x
The copyProperties function was calling Object.setPrototypeOf, which is rather costly. We don't actually need it: it was only acting as a stopgap in case we didn't get all the properties from the original function. Since we're using Reflect.ownKeys() to get those properties, that can't happen, so there's no need to set the prototype.
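A minimal sketch of the approach described above (hypothetical, not the actual shimmer source): copy the descriptor for every own key of the original function onto the wrapper, which makes the setPrototypeOf fallback unnecessary.

```javascript
// Hypothetical sketch: copy every own key from the original function onto
// the wrapper instead of linking prototypes with Object.setPrototypeOf.
function copyProperties (original, wrapped) {
  // Reflect.ownKeys returns all own string and symbol keys, including
  // non-enumerable ones, so nothing is missed without the prototype link.
  for (const key of Reflect.ownKeys(original)) {
    const descriptor = Object.getOwnPropertyDescriptor(original, key)
    try {
      Object.defineProperty(wrapped, key, descriptor)
    } catch (e) {
      // Best effort: skip keys that cannot be redefined on the wrapper.
    }
  }
}
```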
On a benchmark I whipped up separately for another project, I noticed that shimmer plus our instrumentations adds a significant amount of overhead. On a run with 100 iterations, that overhead was about 10000%. After this change, the same benchmark showed about 200% overhead. To be fair, this benchmark is not realistic application loading: it explicitly clears the cache and re-loads everything repeatedly, so I'm not sure how impactful this change will be in practice. That said, we'll see what the benchmarks say.
Overall package size
Self size: 8.24 MB
Deduped: 94.81 MB
No deduping: 95.38 MB
Dependency sizes
| name | version | self size | total size |
|------|---------|-----------|------------|
| @datadog/libdatadog | 0.2.2 | 29.27 MB | 29.27 MB |
| @datadog/native-appsec | 8.3.0 | 19.37 MB | 19.38 MB |
| @datadog/native-iast-taint-tracking | 3.2.0 | 13.9 MB | 13.91 MB |
| @datadog/pprof | 5.4.1 | 9.76 MB | 10.13 MB |
| protobufjs | 7.2.5 | 2.77 MB | 5.16 MB |
| @datadog/native-iast-rewriter | 2.6.0 | 2.58 MB | 2.72 MB |
| @opentelemetry/core | 1.14.0 | 872.87 kB | 1.47 MB |
| @datadog/native-metrics | 3.0.1 | 1.06 MB | 1.46 MB |
| @opentelemetry/api | 1.8.0 | 1.21 MB | 1.21 MB |
| import-in-the-middle | 1.11.2 | 112.74 kB | 826.22 kB |
| msgpack-lite | 0.1.26 | 201.16 kB | 281.59 kB |
| source-map | 0.7.4 | 226 kB | 226 kB |
| opentracing | 0.14.7 | 194.81 kB | 194.81 kB |
| lru-cache | 7.18.3 | 133.92 kB | 133.92 kB |
| pprof-format | 2.1.0 | 111.69 kB | 111.69 kB |
| @datadog/sketches-js | 2.1.0 | 109.9 kB | 109.9 kB |
| semver | 7.6.3 | 95.82 kB | 95.82 kB |
| lodash.sortby | 4.7.0 | 75.76 kB | 75.76 kB |
| ignore | 5.3.1 | 51.46 kB | 51.46 kB |
| int64-buffer | 0.1.10 | 49.18 kB | 49.18 kB |
| shell-quote | 1.8.1 | 44.96 kB | 44.96 kB |
| istanbul-lib-coverage | 3.2.0 | 29.34 kB | 29.34 kB |
| rfdc | 1.3.1 | 25.21 kB | 25.21 kB |
| @isaacs/ttlcache | 1.4.1 | 25.2 kB | 25.2 kB |
| tlhunter-sorted-set | 0.1.0 | 24.94 kB | 24.94 kB |
| limiter | 1.1.5 | 23.17 kB | 23.17 kB |
| dc-polyfill | 0.1.4 | 23.1 kB | 23.1 kB |
| retry | 0.13.1 | 18.85 kB | 18.85 kB |
| jest-docblock | 29.7.0 | 8.99 kB | 12.76 kB |
| crypto-randomuuid | 1.0.0 | 11.18 kB | 11.18 kB |
| path-to-regexp | 0.1.12 | 6.6 kB | 6.6 kB |
| koalas | 1.0.2 | 6.47 kB | 6.47 kB |
| module-details-from-path | 1.0.3 | 4.47 kB | 4.47 kB |

🤖 This report was automatically generated by heaviest-objects-in-the-universe
Benchmarks
Benchmark execution time: 2024-12-12 18:41:21
Comparing candidate commit 8172d517975f1c2d147f4844694e469996e6671e in PR branch bengl/faster-shimmer with baseline commit de0b516846fb812045367f26773a85717f5fbadd in branch master.
Found 0 performance improvements and 0 performance regressions! Performance is the same for 261 metrics, 5 unstable metrics.
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 92.13%. Comparing base (1f2914f) to head (d0a1c31). Report is 4 commits behind head on master.
Additional details and impacted files
```diff
@@            Coverage Diff             @@
##           master    #4633      +/-   ##
==========================================
+ Coverage   86.40%   92.13%    +5.72%
==========================================
  Files          15      102       +87
  Lines         581     3292     +2711
  Branches       33       33
==========================================
+ Hits          502     3033     +2531
- Misses         79      259      +180
```
:umbrella: View full report in Codecov by Sentry.
I think this was mostly added as a safety net, though I don't know of a scenario where the prototype would not be accessible by getOwnPropertyDescriptors. However, I cannot reproduce this performance gain at all; running a quick benchmark locally results in the exact same timing. My assumption is that there is something specific in your setup, where maybe the wrapping stays in place and gets reapplied on top of the previous wrapping or something similar, which would explain the exponential impact of running the benchmark longer. I think it would be worth finding out exactly why this is happening, just in case there is some other bug that this change is fixing or that is present in the benchmark. In any case, I don't think the setPrototypeOf is needed, so it should be safe to remove.
FWIW my benchmark was effectively:

```js
const Module = require('module')

// startTimer/stopTimer/clearCache and SOME_BIG_ENOUGH_NUMBER are
// placeholders for the timing and cache-clearing helpers in the harness.
const origRequire = Module.prototype.require

for (let i = 0; i < SOME_BIG_ENOUGH_NUMBER; i++) {
  require('dd-trace/packages/datadog-instrumentations')
  startTimer()
  require('undici')
  stopTimer()
  clearCache()
  Module.prototype.require = origRequire
}
```
When I ran it with 0x, I saw that nearly all the time was being spent in setPrototypeOf.
> I don't know the scenarios where the prototype would not be accessible by getOwnPropertyDescriptors.
We don't even need the prototype, since we're copying all the ownKeys() and these are all functions. On the off chance that we're wrapping a function whose prototype is not Function.prototype, we'd have a problem. If you want, I can add a check for that and special-case it. It should be pretty quick.
> If you want, I can add a check for that and special-case it.
If you don't think it's a valid real-world case, I don't really mind. I was more interested in knowing why rerunning shimmer many times causes an exponential delay in the benchmark, whereas running the function directly on a new object every time shows no improvement.
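For reference, the special case mentioned above (a function whose prototype is not Function.prototype, e.g. an instance of a Function subclass) could be sketched like this; the helper name is hypothetical and this is not code from the PR:

```javascript
// Hypothetical sketch of the proposed check: only pay for setPrototypeOf
// when the original function's prototype chain is unusual.
function copyUnusualPrototype (original, wrapped) {
  const proto = Object.getPrototypeOf(original)
  // Fast path: ordinary functions inherit from Function.prototype,
  // so there is nothing to do.
  if (proto !== Function.prototype) {
    // Rare path: e.g. instances of `class MyFn extends Function {}`.
    Object.setPrototypeOf(wrapped, proto)
  }
}
```

This keeps the common case allocation-free and branch-only, while still preserving exotic prototype chains when they occur.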