nbind icon indicating copy to clipboard operation
nbind copied to clipboard

Native is much slower than asmjs

Open arcanis opened this issue 9 years ago • 4 comments

The following PR lists the changes I had to make to a package to switch it to nbind (from nan). As you can see in the C source codes, I haven't made any critical change, merely changing a few function names and adding a const qualifier. However, when I tried to benchmark it, I noticed that perfs weren't as good as expected:

  • Pre-PR execution time: 3s
  • Post-PR execution time (native): 12s
  • Post-PR execution time (asmjs): 4s

As you can see, something weird is going on with the native version, which is much, much slower than even the asmjs version. I have no idea what might cause this: I checked the flags that were used when compiling it, and the -O3 flag was there without a doubt.

Have you noticed this odd disparity before?

If you want to try by yourself, you can clone [email protected]:arcanis/buffer-offset-index and run npm run install && npm run bench (assuming emcc is in your PATH).

arcanis avatar Dec 12 '16 13:12 arcanis

Looks like it might have to do with converting vectors or passing them by value instead of by reference in the nbind version. I'll check this further...

jjrv avatar Dec 12 '16 17:12 jjrv

Actually, there is a change that might be a valid cause for the huge difference in perfs:

https://github.com/atom/buffer-offset-index/pull/2/files#diff-a2d4f7ee8b35a4f4edffbf257fe50ccdL56

As you can see, I had to remove a ref-qualifier, which probably means that some extra copies are required. Unfortunately, using a reference on this function triggers a segfault on the native version, and yield a wrong result on the asmjs version.

arcanis avatar Dec 12 '16 17:12 arcanis

It seems the benchmarks aren't easily comparable because the representation of points is different. Now that the bug you reported earlier is fixed, try commenting out these lines:

https://github.com/arcanis/buffer-offset-index/blob/7ac2f8301be4bb7a671f369121d6d4d84d13d22d/src/entry-common.js#L36-L42

Still not comparable, but again different. You can also comment out the assert.deepEqual in both your PR and the original master, and then compare again.

jjrv avatar Dec 12 '16 19:12 jjrv

I can't run reasonable benchmarks at the moment because this laptop is randomly changing clock speeds to keep temperatures down.

jjrv avatar Dec 12 '16 19:12 jjrv