jsbi
Why is it slower than native BigInt?
Some benchmarks, like the one at https://peterolson.github.io/BigInteger.js/benchmark/#:~:text=Multiplication:%20400%20digits, show that native BigInt is 10 times faster. This is very interesting, because modern JS engines are very good — so why the gap? Is it possible to reach the same speed with a JS library?
Generally speaking, the key difference is access to wide machine instructions. A JS engine on x64 hardware can use 64bit×64bit→128bit multiplications. In JavaScript, you can use either Math.imul() for a 16bit×16bit→32bit multiplication, or convert to doubles and do (at most) 26bit×26bit→52bit multiplications; in both cases the limiting factor is the precision of the result. Due to how multiplication of a larger chunk decomposes into piecewise multiplications of smaller chunks, you need 16 multiplications of size 16×16→32 to replace one of size 64×64→128, plus a bunch of shifts and additions of the intermediate results.
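To make that concrete, here is a sketch (not JSBI's actual code) of a 32×32→64-bit multiplication built from 16×16→32-bit pieces via `Math.imul()`. Four partial products replace one 32×32→64 multiply; a 64×64→128 multiply would need sixteen such pieces, plus the same kind of carry bookkeeping:

```javascript
// Multiply two unsigned 32-bit values into a 64-bit result {hi, lo},
// using only 16x16->32 partial products (Math.imul on 16-bit halves).
function mul32(a, b) {
  const aLo = a & 0xffff, aHi = a >>> 16;
  const bLo = b & 0xffff, bHi = b >>> 16;
  // Four 16x16->32 partial products; >>> 0 reinterprets as unsigned.
  const ll = Math.imul(aLo, bLo) >>> 0;
  const lh = Math.imul(aLo, bHi) >>> 0;
  const hl = Math.imul(aHi, bLo) >>> 0;
  const hh = Math.imul(aHi, bHi) >>> 0;
  // a*b = hh*2^32 + (lh + hl)*2^16 + ll.
  // Intermediate sums stay below 2^53, so doubles hold them exactly.
  const mid = lh + hl;                                  // up to ~2^33
  const lowSum = ll + (mid % 0x10000) * 0x10000;        // up to ~2^33
  const lo = lowSum % 0x100000000;
  const carry = Math.floor(lowSum / 0x100000000);       // 0 or 1
  const hi = hh + Math.floor(mid / 0x10000) + carry;    // fits in 32 bits
  return { hi: hi >>> 0, lo: lo >>> 0 };
}
```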
Using smarter algorithms (which we have plans to do in V8, but it's low priority because there isn't really a demand for it aside from benchmarks) doesn't fundamentally change the picture, because e.g. Karatsuba multiplication still recursively calls the simple implementation as the base case, so ultimately those basic "single-digit" multiplication operations are always the deciding factor. If we implemented Karatsuba multiplication in JSBI before engines support it natively, then temporarily JSBI would be faster than engines for large enough numbers. On the other hand, now that all modern browsers ship BigInt support, there is less and less need for JSBI, so it's unclear whether spending effort on making it faster is worth it.
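For illustration, here is a minimal Karatsuba sketch on top of native BigInt (hypothetical code, not what V8 or JSBI ship; the 512-bit cutoff is an arbitrary placeholder). It shows how the recursion always bottoms out in the plain "schoolbook" multiply, so the speed of that base case remains the deciding factor:

```javascript
// Karatsuba multiplication of non-negative BigInts. Below the (made-up)
// threshold it falls back to the simple multiply -- here, the native `*`.
function karatsuba(a, b, thresholdBits = 512n) {
  const bits = (x) => BigInt(x.toString(2).length);
  const n = bits(a) > bits(b) ? bits(a) : bits(b);
  if (n <= thresholdBits) return a * b;        // base case: schoolbook
  const half = n >> 1n;
  const mask = (1n << half) - 1n;
  const aHi = a >> half, aLo = a & mask;
  const bHi = b >> half, bLo = b & mask;
  // Three recursive multiplies instead of four:
  const z0 = karatsuba(aLo, bLo);
  const z2 = karatsuba(aHi, bHi);
  const z1 = karatsuba(aLo + aHi, bLo + bHi) - z0 - z2;
  return (z2 << (2n * half)) + (z1 << half) + z0;
}
```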
That said, in the specific case you linked to, I'm seeing 12M ops/s for JSBI, and 2B ops/s for "Yaffle BigInteger" (which, I guess, uses native BigInts when available?), so that's more than a 100x difference, which is suspiciously large. Usually when some microbenchmark reports more than a few hundred million ops/s, it means that the optimizing compiler was able to drop the operation entirely, and you're measuring an empty loop. So I wouldn't trust the results here; not without verifying that the benchmark actually did perform and measure all intended operations.
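One common way to guard against that empty-loop trap is to accumulate every result into a "sink" the optimizer cannot prove unused, and check the sink afterwards. A rough sketch (the function names and iteration count here are made up for illustration):

```javascript
// Results escape into a module-level sink, so the optimizing compiler
// cannot eliminate the loop body as dead code.
let sink = 0n;

function bench(fn, iterations) {
  const start = Date.now();
  for (let i = 0; i < iterations; i++) {
    sink += fn(i);  // every result is observed
  }
  return Date.now() - start;
}

const ms = bench((i) => BigInt(i) * 1234567n, 100000);
// After the run, verify that `sink` holds the expected value; if the
// engine had dropped the work, this check would expose it.
```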
@jakobkummerow, thanks for the answer. So it is interesting that even WebAssembly has no access to 64x64->128 bit multiplication (and calls from JS to WebAssembly still have too much overhead relative to the cost of the multiplication).
> Karatsuba multiplication
Right, but the question is about sizes below ~2000 bits, where "schoolbook" algorithms are still faster. Btw, Karatsuba can be implemented on top of the native implementation...
I see the difference in my application as well if I switch between native BigInt and JSBI, so maybe the benchmark is right.
> even WebAssembly has no access to 64x64->128 bit multiplication
64×64→128 bit multiplications are not portable. x86_64 has them, many other architectures don't.
> calls from JS to WebAssembly still has too much overhead relative to the cost of the multiplication
Yes, probably. I haven't tried it. Have you?
Once WasmGC is standardized and available, that likely will enable interesting ways of implementing something like BigInt (or BigDecimal, etc) as a Wasm library, with performance that should come very close to what a native implementation could do. That'll still take a while though.
> > calls from JS to WebAssembly still has too much overhead relative to the cost of the multiplication
>
> Yes, probably. I haven't tried it. Have you?
The last time I tried, the call overhead was roughly the cost of ~20 multiplications. https://webassembly.studio/?f=mk8h2xgeuv Hm... looks like it is ~10 multiplications today; very good.
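For anyone who wants to reproduce this kind of measurement without an external playground, here is a rough sketch of one way to do it: a minimal hand-assembled Wasm module exporting `mul(i64, i64) -> i64` (i64 parameters map to JS BigInts via the BigInt/i64 integration in modern engines), timed against a pure-JS BigInt loop. The module bytes and loop counts below are my own illustration, not taken from the linked page:

```javascript
// Minimal Wasm module: (func (export "mul") (param i64 i64) (result i64)
//                         local.get 0  local.get 1  i64.mul)
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // magic + version
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7e, 0x7e, 0x01, 0x7e, // type: (i64,i64)->i64
  0x03, 0x02, 0x01, 0x00,                               // func 0 uses type 0
  0x07, 0x07, 0x01, 0x03, 0x6d, 0x75, 0x6c, 0x00, 0x00, // export "mul"
  0x0a, 0x09, 0x01, 0x07, 0x00,                         // code section, no locals
  0x20, 0x00, 0x20, 0x01, 0x7e, 0x0b,                   // local.get 0/1, i64.mul, end
]);
const { mul } = new WebAssembly.Instance(new WebAssembly.Module(bytes)).exports;

// Time N boundary-crossing calls against N pure-JS BigInt multiplies.
// The Wasm loop's time is dominated by the call boundary, not the multiply.
const N = 100000;
let sinkWasm = 0n, sinkJs = 0n;
let t = Date.now();
for (let i = 0; i < N; i++) sinkWasm += mul(BigInt(i), 3n);
const wasmMs = Date.now() - t;
t = Date.now();
for (let i = 0; i < N; i++) sinkJs += BigInt(i) * 3n;
const jsMs = Date.now() - t;
```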