habitat-sim

[Web] Add flag to enable SIMD instructions in WASM

Open Skylion007 opened this issue 4 years ago • 4 comments

Motivation and Context

  • Adds a flag that builds the Web build of Habitat with SIMD enabled. This should speed up physics and other linear-algebra-heavy code, hopefully ~quite significantly~. The browser must support SIMD in WASM, as modern versions of Chrome do.
  • To enable it, pass the --simd flag to build_js.sh.
  • Also fixes a small bug with pre-commit that prevented ShellCheck from running on build.sh and build_js.sh.

How Has This Been Tested

  • Locally and with CI.

Types of changes

  • [X] Docs change / refactoring / dependency upgrade
  • [X] New feature (non-breaking change which adds functionality)

Checklist

  • [X] My code follows the code style of this project.
  • [X] I have read the CONTRIBUTING document.
  • [X] I have completed my CLA (see CONTRIBUTING)
  • [X] I have added tests to cover my changes.
  • [X] All new and existing tests passed.

Skylion007 avatar Jul 08 '21 18:07 Skylion007

quite significantly

(With my skeptical hat on.) Do you have some numbers to back this? I'm interested in how much this helps in a codebase of this size.

mosra avatar Jul 08 '21 18:07 mosra

@mosra Probably not on Magnum, but WASM doesn't even enable SSE2 instructions without this flag for Bullet so it should give a speed up there.

Regardless, point taken; retracting my claim slightly.

Skylion007 avatar Jul 08 '21 18:07 Skylion007

It's not as simple as "enabling SSE2" since WASM has to work on ARM as well -- and that's why I'm skeptical, because different platforms have different instructions, and what maps directly to an SSE instruction might have to be emulated on NEON and vice versa.

But in any case I really want to know how this helps, seriously :) Did you try it out? I know from certain projects that hand-coded WASM SIMD can be four to six times faster than scalar code, but I have no idea about autovectorization, especially when combined with everything else we're running here. Is it 1%? 10%? 2x faster?

mosra avatar Jul 08 '21 18:07 mosra

I tried this on my WebXR hand demo benchmark, which drops a lot of objects in a big pile and tries to step physics 60 times per second (or 16.67 ms per stepWorld() call). I repeated the benchmark 3 times with and without the --simd flag, and here are the results:

without --simd: 78.76ms 72.60ms 82.87ms

with --simd: 79.75ms 84.07ms 83.48ms

These numbers are the average ms between stepWorld() calls. Note that it's trying to achieve 16.67ms per stepWorld() call but cannot keep up. So it doesn't seem like this SIMD optimization has helped much in this case.

ldcWV avatar Jul 15 '21 19:07 ldcWV
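
For anyone who wants to put the numbers above side by side, here is a minimal Python sketch that averages the three reported runs per configuration and compares them with the 60 Hz target. The script is not part of the PR; the variable names and the `summarize` helper are purely illustrative, and only the six timings and the 60 steps-per-second target come from the comment above.

```python
from statistics import mean, stdev

# Per-run averages (ms between stepWorld() calls), copied from the comment above.
without_simd = [78.76, 72.60, 82.87]
with_simd = [79.75, 84.07, 83.48]

# Target frame budget for 60 physics steps per second.
TARGET_MS = 1000 / 60


def summarize(label, runs):
    """Print mean, sample standard deviation, and ratio to the target budget."""
    m = mean(runs)
    s = stdev(runs)
    print(
        f"{label}: mean {m:.2f} ms, stdev {s:.2f} ms, "
        f"{m / TARGET_MS:.1f}x the {TARGET_MS:.2f} ms target"
    )
    return m, s


mean_without, sd_without = summarize("without --simd", without_simd)
mean_with, sd_with = summarize("with --simd", with_simd)

# Positive means the --simd build was slower on average.
relative_change = (mean_with - mean_without) / mean_without
print(f"relative change with --simd: {relative_change:+.1%}")
```

The means come out to roughly 78.1 ms without the flag and 82.4 ms with it, so the --simd build was about 5.6% slower on average. That gap is smaller than the run-to-run spread of the non-SIMD runs, and three runs per configuration are too few to call it a real regression; the safer reading is "no measurable improvement" for this workload.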