node
node copied to clipboard
fs: improve `ExistsSync` performance on Windows
Simplifies code by using std::filesystem. Since std::filesystem is C++17, this pull-request can be backported to 18 and 20.
The following benchmark shows 2% improvement, but I'm sure if we had a Windows benchmark-ci machine it would have better results. Especially for symlinks, we are now not executing 2 different operating-system calls, and only one.
FYI: std::filesystem::exists follows symlinks.
Benchmark code:
const fs = require('node:fs');
const exists = fs.existsSync(__filename);
console.log(exists);
Benchmark result:
PS C:\Users\yagiz\Desktop\coding\node> hyperfine ".\out\Release\node .\exists-bench.js" " .\node-pr .\exists-bench.js" --warmup 10
Benchmark 1: .\out\Release\node .\exists-bench.js
Time (mean ± σ): 48.8 ms ± 2.9 ms [User: 2.2 ms, System: 3.3 ms]
Range (min … max): 41.6 ms … 55.3 ms 56 runs
Benchmark 2: .\node-pr .\exists-bench.js
Time (mean ± σ): 47.7 ms ± 2.0 ms [User: 1.3 ms, System: 2.8 ms]
Range (min … max): 42.2 ms … 50.5 ms 60 runs
Summary
.\node-pr .\exists-bench.js ran
1.02 ± 0.07 times faster than .\out\Release\node .\exists-bench.js
cc @nodejs/fs @nodejs/platform-windows
Have you run the benchmarks under the benchmark folder? It’s not very reliable to benchmark a single fs call within a script using hyperfine since Node.js itself can take a lot more time to start up than doing that one fs call. (Perhaps around 40ms of that 48-47ms can be Node.js startup)
I agree. Unfortunately, I couldn't figure out some weird npm issue on my Windows machine, and couldn't make a progress due to the slowness of the benchmarks (even if I decreased the n).
I've opened an issue to keep track of the Windows issue I'm facing: https://github.com/nodejs/node/issues/53538
CI: https://ci.nodejs.org/job/node-test-pull-request/59919/
I agree. Unfortunately, I couldn't figure out some weird npm issue on my Windows machine, and couldn't make a progress due to the slowness of the benchmarks (even if I decreased the n).
Why would you need npm access to run the benchmarks? I think they can simply be run as
out/Release/node benchmark/compare.js --old ./node_main --new out/Release/node --filter existsSync fs > fs.csv
?
Oh I see in https://github.com/nodejs/node/issues/53538 you were trying to run node-benchmark-compare. You could just send that over to any machine that can run node-benchmark-compare to analyze the data, the csv is portable.
(Or you could run the R version with data.csv | Rscript benchmark/compare.R though I imagine getting Rscript to work on Windows can also be quite a task....or maybe not?)
Microbenchmark CI https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/1571/
Microbenchmark CI https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/1571/
It seems, this PR is 15% faster
It seems, this PR is 15% faster
That only applies to the non-existing part. The question for me is what is more likely: existing or non-existing?
confidence improvement accuracy (*) (**) (***)
fs/bench-existsSync.js n=1000000 type='existing' *** -0.92 % ±0.29% ±0.39% ±0.51%
fs/bench-existsSync.js n=1000000 type='non-existing' *** 15.26 % ±1.16% ±1.55% ±2.04%
fs/bench-existsSync.js n=1000000 type='non-flat-existing' *** 2.50 % ±0.39% ±0.53% ±0.69%
That only applies to the non-existing part. The question for me is what is more likely: existing or non-existing?
@BridgeAR for unix, the difference is 15% but on Windows, but on windows, it reduces the number of OS calls.
CI: https://ci.nodejs.org/job/node-test-pull-request/59952/
Did you test this hypothesis, or is it guaranteed by
std::filesystem::exists?
I was referencing the cppreference (https://en.cppreference.com/w/cpp/filesystem/exists)
Did you test this hypothesis, or is it guaranteed by
std::filesystem::exists?I was referencing the cppreference (https://en.cppreference.com/w/cpp/filesystem/exists)
I don't see anything in that reference that would guarantee that the standard library function makes a single system call (or the Windows equivalent) and not multiple system calls.
CI: https://ci.nodejs.org/job/node-test-pull-request/59969/
CI: https://ci.nodejs.org/job/node-test-pull-request/59984/
CI: https://ci.nodejs.org/job/node-test-pull-request/59985/
CI: https://ci.nodejs.org/job/node-test-pull-request/60177/