node
node copied to clipboard
src,lib: reducing C++ calls of esm legacy main resolve
This PR is related to https://github.com/nodejs/performance/issues/73
Legacy Main Resolve
@anonrig raised that we could benefit from the implementation of legacyMainResolve on the C++ side, so here's the result so far:
confidence improvement accuracy (*)
esm/esm-legacyMainResolve.js resolvedFile='node_modules/non-exist' packageConfigMain='./index.js' packageJsonUrl='node_modules/test/package.json' n=10000 *** 190.93 % ±4.55%
esm/esm-legacyMainResolve.js resolvedFile='node_modules/non-exist' packageConfigMain='' packageJsonUrl='node_modules/test/package.json' n=10000 *** 188.08 % ±2.97%
esm/esm-legacyMainResolve.js resolvedFile='node_modules/test/index.js' packageConfigMain='./index.js' packageJsonUrl='node_modules/test/package.json' n=10000 *** 191.18 % ±4.22%
esm/esm-legacyMainResolve.js resolvedFile='node_modules/test/index.js' packageConfigMain='' packageJsonUrl='node_modules/test/package.json' n=10000 *** 190.32 % ±2.34%
esm/esm-legacyMainResolve.js resolvedFile='node_modules/test/index.json' packageConfigMain='./index.js' packageJsonUrl='node_modules/test/package.json' n=10000 *** 189.64 % ±3.61%
esm/esm-legacyMainResolve.js resolvedFile='node_modules/test/index.json' packageConfigMain='' packageJsonUrl='node_modules/test/package.json' n=10000 *** 186.56 % ±3.51%
esm/esm-legacyMainResolve.js resolvedFile='node_modules/test/index.node' packageConfigMain='./index.js' packageJsonUrl='node_modules/test/package.json' n=10000 *** 191.34 % ±3.79%
esm/esm-legacyMainResolve.js resolvedFile='node_modules/test/index.node' packageConfigMain='' packageJsonUrl='node_modules/test/package.json' n=10000 *** 188.11 % ±2.44%
(**) (***)
esm/esm-legacyMainResolve.js resolvedFile='node_modules/non-exist' packageConfigMain='./index.js' packageJsonUrl='node_modules/test/package.json' n=10000 ±6.19% ±8.36%
esm/esm-legacyMainResolve.js resolvedFile='node_modules/non-exist' packageConfigMain='' packageJsonUrl='node_modules/test/package.json' n=10000 ±4.04% ±5.44%
esm/esm-legacyMainResolve.js resolvedFile='node_modules/test/index.js' packageConfigMain='./index.js' packageJsonUrl='node_modules/test/package.json' n=10000 ±5.75% ±7.77%
esm/esm-legacyMainResolve.js resolvedFile='node_modules/test/index.js' packageConfigMain='' packageJsonUrl='node_modules/test/package.json' n=10000 ±3.16% ±4.21%
esm/esm-legacyMainResolve.js resolvedFile='node_modules/test/index.json' packageConfigMain='./index.js' packageJsonUrl='node_modules/test/package.json' n=10000 ±4.92% ±6.63%
esm/esm-legacyMainResolve.js resolvedFile='node_modules/test/index.json' packageConfigMain='' packageJsonUrl='node_modules/test/package.json' n=10000 ±4.77% ±6.41%
esm/esm-legacyMainResolve.js resolvedFile='node_modules/test/index.node' packageConfigMain='./index.js' packageJsonUrl='node_modules/test/package.json' n=10000 ±5.17% ±6.99%
esm/esm-legacyMainResolve.js resolvedFile='node_modules/test/index.node' packageConfigMain='' packageJsonUrl='node_modules/test/package.json' n=10000 ±3.31% ±4.43%
Be aware that when doing many comparisons the risk of a false-positive
result increases. In this case, there are 8 comparisons, you can thus
expect the following amount of false-positive results:
0.40 false positives, when considering a 5% risk acceptance (*, **, ***),
0.08 false positives, when considering a 1% risk acceptance (**, ***),
0.01 false positives, when considering a 0.1% risk acceptance (***)
To be able to run the benchmark, first build the
old
node using this branch.
About the benchmark, I needed to create another benchmark just for this function to be able to see the improvements since the results of esm-defaultResolver.js was not reflecting the real impact of this change.
Potential optimizations
legacyMainResolve
also calls fileURLToPath
and didn't have any version on the C++ side, so I needed to rewrite this function.
I haven't benchmarked the JS version yet, but I think we can add some benchmarks and see if we can replace the JS version with the C++ version, I've tried to make the code as easy as possible to support exposing on the JS side in the future.
Tasks
- [x] Rewrite legacyMainResolve
- [x] Rewrite fileURLToPath
- [x] Add benchmark for legacyMainResolve
- [x] Add tests for legacyMainResolve
Acknowledgments
Huge thanks to @anonrig for the support and early review of this code and @RafaelGSS for the hints about the benchmark and permission model.
Review requested:
- [ ] @nodejs/loaders
- [ ] @nodejs/modules
legacyMainResolve
also callsfileURLToPath
and didn't have any version on the C++ side, so I needed to rewrite this function.
Maybe we should first test/benchmark and land the fileURLToPath
rewrite, which would simplify this PR. wdyt?
@aduh95 Here's the benchmark for the rewrite of fileURLToPath
:
IMPORTANT: Those values are not related to this PR, those numbers are only specific to the branch linked above the benchmark.
confidence improvement accuracy (*) (**) (***)
url/file-url-to-path.js n=100000 url='(url) file:///home/user/test/index.js' *** -95.88 % ±3.25% ±4.44% ±6.03%
url/file-url-to-path.js n=100000 url='(url) file:///home/user/test%20index.js' *** -95.19 % ±2.98% ±4.07% ±5.52%
url/file-url-to-path.js n=100000 url='(url) file:///home/user/test%2Findex.js' *** 8.66 % ±2.71% ±3.67% ±4.90%
url/file-url-to-path.js n=100000 url='(url) file://google.com/home/user/test%2Findex.js' 0.68 % ±2.75% ±3.72% ±4.99%
url/file-url-to-path.js n=100000 url='(url) http://google.com/home/test.js' *** -11.02 % ±2.27% ±3.05% ±4.03%
url/file-url-to-path.js n=100000 url='file:///home/user/test/index.js' *** 97.59 % ±7.74% ±10.45% ±13.92%
url/file-url-to-path.js n=100000 url='file:///home/user/test%20index.js' *** 86.08 % ±7.51% ±10.15% ±13.52%
url/file-url-to-path.js n=100000 url='file:///home/user/test%2Findex.js' *** 195.47 % ±3.95% ±5.35% ±7.14%
url/file-url-to-path.js n=100000 url='file://google.com/home/user/test%2Findex.js' *** 166.80 % ±3.19% ±4.29% ±5.66%
url/file-url-to-path.js n=100000 url='http://google.com/home/test.js' *** 145.18 % ±4.26% ±5.76% ±7.68%
url/file-url-to-path.js n=100000 url='not-even-a-url' *** 168.09 % ±7.25% ±9.87% ±13.32%
Be aware that when doing many comparisons the risk of a false-positive
result increases. In this case, there are 11 comparisons, you can thus
expect the following amount of false-positive results:
0.55 false positives, when considering a 5% risk acceptance (*, **, ***),
0.11 false positives, when considering a 1% risk acceptance (**, ***),
0.01 false positives, when considering a 0.1% risk acceptance (***)
Branch: https://github.com/nodejs/node/compare/main...h4ad-forks:node:perf/file-url-to-path
In resume, if we want to expose the C++ function, we should enforce the caller to always pass a string instead of URL.
@H4ad I meant to have it as a separate PR.
@aduh95 Yeah, my idea was to show you if you really want to do that with the performance benefits from it.
To be honest, in this form, without refactoring the whole ESM to avoid new URL
and manipulate the data directly as a string, we will not see a performance improvement, it will be the complete opposite.
I think is better to go in the opposite direction, merge this one and then evaluate the benefits of exposing fileURLToPath
, because as I said, will probably lead to a refactoring not only in the function itself but also in the places where they call to take the fastest path, which is manipulating directly the string.
url/file-url-to-path.js n=100000 url=’(url) file:///home/user/test/index.js’ *** -95.88 % ±3.25% ±4.44% ±6.03%
This seems to show that in the most common case of a file://
URL, this is 96% slower?
@GeoffreyBooth Exactly, the way the fileURLToPath
is used today, simply don't worth exposing the C++ version.
@GeoffreyBooth Exactly, the way the
fileURLToPath
is used today, simply don’t worth exposing the C++ version.
Okay, so then should we close this PR? Is there something else of value being proposed here?
This PR is not about fileURLToPath
, is about LegacyMainResolve
.
The improvements for LegacyMainResolve
is about 60~70%.
The rewrite of fileURLToPath
was only because it was needed by LegacyMainResolve
. Aduh suggested that maybe we could create a PR for fileURLToPath
but as we've seen, it's not worth it, that's it.
Also, the metrics of https://github.com/nodejs/node/pull/48325#issuecomment-1575649208 is not related to this PR itself, are related to the branch I link in the comment, so those numbers don't reflect this PR since we don't have any overhead of calling C++ function from C++ side.
The CI for Test Asan is failing due some memory issue:
==88318==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7ffd9ad085f8 at pc 0x00000431f995 bp 0x7ffd9ad07570 sp 0x7ffd9ad07568
READ of size 16 at 0x7ffd9ad085f8 thread T0
#0 0x431f994 in ada::url_aggregator ada::parser::parse_url<ada::url_aggregator>(std::basic_string_view<char, std::char_traits<char> >, ada::url_aggregator const*) (/home/runner/work/node/node/out/Release/node+0x431f994)
#1 0x4312e13 in tl::expected<ada::url_aggregator, ada::errors> ada::parse<ada::url_aggregator>(std::basic_string_view<char, std::char_traits<char> >, ada::url_aggregator const*) (/home/runner/work/node/node/out/Release/node+0x4312e13)
#2 0xfc3b97 in node::fs::LegacyMainResolve(v8::FunctionCallbackInfo<v8::Value> const&) (/home/runner/work/node/node/out/Release/node+0xfc3b97)
#3 0x1779bab in v8::internal::MaybeHandle<v8::internal::Object> v8::internal::(anonymous namespace)::HandleApiCallHelper<false>(v8::internal::Isolate*, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::FunctionTemplateInfo>, v8::internal::Handle<v8::internal::Object>, unsigned long*, int) (/home/runner/work/node/node/out/Release/node+0x1779bab)
#4 0x17779b4 in v8::internal::Builtin_HandleApiCall(int, unsigned long*, v8::internal::Isolate*) (/home/runner/work/node/node/out/Release/node+0x17779b4)
#5 0x3402e35 in Builtins_CEntry_Return1_ArgvOnStack_BuiltinExit (/home/runner/work/node/node/out/Release/node+0x3402e35)
Address 0x7ffd9ad085f8 is located in stack of thread T0 at offset 1304 in frame
#0 0xfc364f in node::fs::LegacyMainResolve(v8::FunctionCallbackInfo<v8::Value> const&) (/home/runner/work/node/node/out/Release/node+0xfc364f)
Ref: https://github.com/nodejs/node/actions/runs/5189308542/jobs/9354128411?pr=48325
Does anyone know what could be the cause?
I tried to reproduce it locally but I couldn't, so I have no idea how to fix it.
CI: https://ci.nodejs.org/job/node-test-pull-request/52176/
CI: https://ci.nodejs.org/job/node-test-pull-request/52261/
CI is failing https://ci.nodejs.org/job/node-test-binary-armv7l/RUN_SUBSET=js,nodes=ubuntu2004-armv7l/6181/consoleFull but with an error that I think is not related with this PR:
not ok 3740 sequential/test-perf-hooks
01:20:33 ---
01:20:33 duration_ms: 618.48400
01:20:33 severity: fail
01:20:33 exitcode: 1
01:20:33 stack: |-
01:20:33 {
01:20:33 name: 'node',
01:20:33 entryType: 'node',
01:20:33 startTime: 0,
01:20:33 duration: { around: 404.7198510169983 },
01:20:33 nodeStart: { around: 0 },
01:20:33 v8Start: { around: 0 },
01:20:33 bootstrapComplete: { around: 404.6602909564972, delay: 2500 },
01:20:33 environment: { around: 0 },
01:20:33 loopStart: -1,
01:20:33 loopExit: -1
01:20:33 }
01:20:33 node:assert:399
01:20:33 throw err;
01:20:33 ^
01:20:33
01:20:33 AssertionError [ERR_ASSERTION]: environment: 250.5410280227661 >= 250
01:20:33 at checkNodeTiming (/home/iojs/build/workspace/node-test-binary-armv7l/test/sequential/test-perf-hooks.js:31:7)
01:20:33 at Object.<anonymous> (/home/iojs/build/workspace/node-test-binary-armv7l/test/sequential/test-perf-hooks.js:43:1)
01:20:33 at Module._compile (node:internal/modules/cjs/loader:1255:14)
01:20:33 at Module._extensions..js (node:internal/modules/cjs/loader:1309:10)
01:20:33 at Module.load (node:internal/modules/cjs/loader:1113:32)
01:20:33 at Module._load (node:internal/modules/cjs/loader:960:12)
01:20:33 at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:83:12)
01:20:33 at node:internal/main/run_main_module:23:47 {
01:20:33 generatedMessage: false,
01:20:33 code: 'ERR_ASSERTION',
01:20:33 actual: false,
01:20:33 expected: true,
01:20:33 operator: '=='
01:20:33 }
01:20:33
01:20:33 Node.js v21.0.0-pre
Any hints about what I should do?
Also, I updated the benchmark, with the latest changes, the improvement is almost 200%.
But one thing that is making me a little skeptical about this number is this benchmark:
confidence improvement accuracy (*) (**) (***)
esm/esm-loader-defaultResolve.js specifier='./relative-existing.js' n=10000 4.83 % ±6.85% ±9.20% ±12.15%
esm/esm-loader-defaultResolve.js specifier='./relative-nonexistent.js' n=10000 1.95 % ±4.28% ±5.74% ±7.56%
esm/esm-loader-defaultResolve.js specifier='node:os' n=10000 -1.98 % ±5.33% ±7.17% ±9.47%
esm/esm-loader-defaultResolve.js specifier='node:prefixed-nonexistent' n=10000 -3.48 % ±5.01% ±6.71% ±8.83%
esm/esm-loader-defaultResolve.js specifier='unprefixed-existing' n=10000 -0.32 % ±4.83% ±6.47% ±8.51%
esm/esm-loader-defaultResolve.js specifier='unprefixed-nonexistent' n=10000 0.16 % ±1.99% ±2.67% ±3.52%
Be aware that when doing many comparisons the risk of a false-positive
result increases. In this case, there are 6 comparisons, you can thus
expect the following amount of false-positive results:
0.30 false positives, when considering a 5% risk acceptance (*, **, ***),
0.06 false positives, when considering a 1% risk acceptance (**, ***),
0.01 false positives, when considering a 0.1% risk acceptance (***)
Only unprefixed-existing
calls legacyMainResolve
, and we don't see the performance reflected in this benchmark.
I wrote a little benchmark and we came from 36k
op/s to 63k
op/s, which is a real improvement, so does it worth introducing 400+ lines for this amount of improvement?
I added some tests to ensure the correct resolution of legacyMainResolve
, this also helped me catch some edge cases related to the errors.
I just added some good paths and check about the errors, and I didn't add any test to ensure the order of the resolution (if has index.json and index.js, should return index.js first). If you think is necessary, I can add those tests too.
Failed to start CI
- Validating Jenkins credentials ✘ Jenkins credentials invalidhttps://github.com/nodejs/node/actions/runs/5305353243
CI: https://ci.nodejs.org/job/node-test-pull-request/52478/
@H4ad it’s really hard to keep track of what you’ve changed when you force push on a single commit everytime, and makes this PR hard to merge. I recommend using git commit fixup or just add a new commit for every fix.
@anonrig Sorry, I thought it was better to just keep a single commit, that's why I always push force, but I'll be using fixup for future fixes.
CI: https://ci.nodejs.org/job/node-test-pull-request/52522/
CI: https://ci.nodejs.org/job/node-test-pull-request/52528/
Benchmark CI: https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/1342/
Results
confidence improvement accuracy (*)
esm/esm-legacyMainResolve.js resolvedFile='node_modules/non-exist' packageConfigMain='./index.js' packageJsonUrl='node_modules/test/package.json' n=10000 *** 27.52 % ±6.03%
esm/esm-legacyMainResolve.js resolvedFile='node_modules/non-exist' packageConfigMain='' packageJsonUrl='node_modules/test/package.json' n=10000 *** 30.58 % ±6.83%
esm/esm-legacyMainResolve.js resolvedFile='node_modules/test/index.js' packageConfigMain='./index.js' packageJsonUrl='node_modules/test/package.json' n=10000 *** 25.63 % ±6.15%
esm/esm-legacyMainResolve.js resolvedFile='node_modules/test/index.js' packageConfigMain='' packageJsonUrl='node_modules/test/package.json' n=10000 *** 20.21 % ±7.75%
esm/esm-legacyMainResolve.js resolvedFile='node_modules/test/index.json' packageConfigMain='./index.js' packageJsonUrl='node_modules/test/package.json' n=10000 *** 27.66 % ±6.96%
esm/esm-legacyMainResolve.js resolvedFile='node_modules/test/index.json' packageConfigMain='' packageJsonUrl='node_modules/test/package.json' n=10000 *** 29.09 % ±7.64%
esm/esm-legacyMainResolve.js resolvedFile='node_modules/test/index.node' packageConfigMain='./index.js' packageJsonUrl='node_modules/test/package.json' n=10000 *** 23.78 % ±6.45%
esm/esm-legacyMainResolve.js resolvedFile='node_modules/test/index.node' packageConfigMain='' packageJsonUrl='node_modules/test/package.json' n=10000 *** 21.65 % ±6.65%
Be aware that when doing many comparisons the risk of a false-positive
result increases. In this case, there are 8 comparisons, you can thus
expect the following amount of false-positive results:
0.40 false positives, when considering a 5% risk acceptance (*, **, ***),
0.08 false positives, when considering a 1% risk acceptance (**, ***),
0.01 false positives, when considering a 0.1% risk acceptance (***)
CI: https://ci.nodejs.org/job/node-test-pull-request/52585/
Landed in a2a8e31cbc441cd2b513227be09d90568aa77b5a
This commit does not land cleanly on v18.x-staging
and will need manual backport in case we want it in v18.
@ruyadorno Hey, can I help with something?
@H4ad does https://github.com/nodejs/node/blob/main/doc/contributing/backporting-to-release-lines.md#how-to-submit-a-backport-pull-request help?
@RaisinTen Thanks, I'll try to backport by the end of this day.