node
node copied to clipboard
src,lib,buffer: improve atob / btoa performance
Extract modp_b64
dependence from Chromium to improve performance of atob()
and btoa()
.
Refs: https://github.com/chromium/chromium/tree/92.0.4490.1/third_party/modp_b64
Benchmark
- 12 Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
- MemTotal: 32657380 kB
Old logic
$ ./node benchmark/run.js --filter buffer-atob.js buffers
buffers/buffer-atob.js
buffers/buffer-atob.js n=32 len=65536: 487.7280162769471
$ ./node benchmark/run.js --filter buffer-btoa.js buffers
buffers/buffer-btoa.js
buffers/buffer-btoa.js n=32 len=67108864: 2.4594857665824494
Logic of this PR
$ ./node benchmark/run.js --filter buffer-atob.js buffers
buffers/buffer-atob.js
buffers/buffer-atob.js n=32 len=65536: 10,588.320156389489
$ ./node benchmark/run.js --filter buffer-btoa.js buffers
buffers/buffer-btoa.js
buffers/buffer-btoa.js n=32 len=67108864: 3.503510145848807
Despite the performance boost, I'm generally -1 on introducing a new dependency and a different codepath for base64 encoding/decoding only for atob/btoa. If anything, we should be looking at improving the base64/base64url encoding performance for
Buffer
. /cc @addaleax
I think I can try to use modp b64 in Buffer's Base64 if it's really faster.
This dep is extracted from Chromium.
According to https://github.com/nodejs/node/blob/master/doc/guides/contributing/pull-requests.md#dependencies policy, changes on deps/*
should be sent to upstream first. In that case, as Node.js already have their own implementation of base64 functionality, adding a new dependency for base64 solely doesn't seem to be the best effort to reduce the burden of collaborators. While the base64 codebase isn't quite large, are there any improvements that are worth to be made on Node.js own base64 implementation?
@addaleax @jasnell I've updated, removing modp_b64 and using base64_encode
. And got the new benchmark:
$ ./node benchmark/run.js --filter buffer-btoa.js buffers
buffers/buffer-btoa.js
buffers/buffer-btoa.js n=32 len=67108864: 3.3501380480392817
$ ./node benchmark/run.js --filter buffer-atob.js buffers
buffers/buffer-atob.js
buffers/buffer-atob.js n=32 len=65536: 6,966.175950365997
And I've found that modp_b64's decoding is faster then code in Node.js'.
Maybe we can merge this PR first (if possible), and then use modp_b64 to instead of Node.js' current Base64 code.
@XadillaX In order to merge this, I think we should:
- Remove the C++-backed atob and btoa implementations
- Remove their benchmarks
- Undo the stylistic changes that are happening to unrelated code
@XadillaX In order to merge this, I think we should:
- Remove the C++-backed atob and btoa implementations
- Remove their benchmarks
- Undo the stylistic changes that are happening to unrelated code
The C++-backed implementations is just fasten the function to resolve:
// TODO(@jasnell): The implementation here has not been performance
// optimized in any way.
And this is not a large change, just call base64_encode
directly to avoid do unnecessary Buffer
call.
The C++-backed implementations is just fasten the function to resolve:
Then let’s remove the TODO comment, or edit it to make clear that there’s not actually anything to do there.
Making these functions faster tells people that it’s okay to use them. It’s not, therefore let’s not worry about their performance at all.
@addaleax I've tried to use modp_b64 to instead of current and found the performance is almost the same (https://github.com/nodejs/node/compare/c975dff3c0f0f1ecb1574f3b10dd1d135a7704db...XadillaX:base64-perf).
So the way to improve this PR is just to make atob()
/ btoa()
using base64_encode
/ base64_decode
directly.
But as you said before, there's no need to improve them.
That means the only thing I should do is to remove @jasnell's TODO comment?
/cc @jasnell
We are thinking about using atob
and btoa
in https://github.com/feross/buffer/issues/339 for web compatibility.
We have seen quite quickly that the node
implementations are poorly optimised.
Although the buffer
module is directed at browsers, if user code unintentionally finds it's way to using the buffer
module in a node
environment and subsequently ends up using atob
or btoa
, this is only a loss for users of node, not our library.
Note that it is planned that simdutf could provide fast routines for base64 in the near future.
See https://github.com/nodejs/performance/issues/128
The simdutf library (already part of Node.js) will have accelerated base64 support in the next release. Decoding will follow the WHATWG forgiving-base64 specification so that ASCII white spaces are allowed in base64 content.
https://github.com/simdutf/simdutf/pull/375