gl-matrix WebAssembly Port

I successfully converted most of mat4 to Wasm, but ran into a few problems along the way.

For example, mat4.create() returns a Float32Array, but we cannot pass arrays to Wasm directly; we can only pass addresses.

I solved it by creating a bridge:

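mat4.create = function() {
  // wasmCreate stands in for the raw Wasm export (illustrative name); it returns a byte address
  let address = wasmCreate();
  // memory is a Float32Array over the Wasm heap: 4 bytes per element, so byte address >> 2
  let view = memory.subarray(address >> 2, (address >> 2) + 16);
  view.address = address; // keep the address so it can be passed back to Wasm
  return view;
};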

This lets you use the API as usual, but it also adds some overhead.

mat4.create needs to be bridged anyway, but methods like mat4.multiply could be used without a bridge by letting the user directly pass in the addresses:

let a = mat4.create();
let b = mat4.create();
mat4.multiply(a.address, a.address, b.address); // directly calls wasm

This is faster than going through a JS bridge, but on the other hand it changes the API conventions.

Another problem is that Wasm doesn't support garbage collection right now, so users are required to free data manually. Calling mat4.create in a draw loop will eventually exhaust memory and kill your app.
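To make the manual-free problem concrete, here is a minimal sketch of a bridge with a free list. It assumes a plain Float32Array standing in for the Wasm heap, and every name in it (heap, allocMat4, mat4.free) is hypothetical rather than part of gl-matrix or the port.

```javascript
// Stand-in for the Wasm heap: in a real port this would be a Float32Array
// over the module's exported memory. All names below are hypothetical.
const heap = new Float32Array(1024);
const MAT4_FLOATS = 16;
const MAT4_BYTES = MAT4_FLOATS * Float32Array.BYTES_PER_ELEMENT;

let nextFree = 0;     // bump pointer, in bytes
const freeList = [];  // byte addresses of released matrices

function allocMat4() {
  // Reuse a freed slot if one exists, otherwise bump-allocate.
  if (freeList.length > 0) return freeList.pop();
  if (nextFree + MAT4_BYTES > heap.byteLength) throw new Error("heap exhausted");
  const address = nextFree;
  nextFree += MAT4_BYTES;
  return address;
}

const mat4 = {
  create() {
    const address = allocMat4();
    const index = address >> 2; // byte address -> Float32 index
    const view = heap.subarray(index, index + MAT4_FLOATS);
    view.fill(0);
    view[0] = view[5] = view[10] = view[15] = 1; // identity
    view.address = address;
    return view;
  },
  free(view) {
    freeList.push(view.address);
  },
};

// A draw loop that frees what it creates keeps reusing the same slot.
for (let frame = 0; frame < 1000; frame++) {
  const tmp = mat4.create();
  // ... use tmp ...
  mat4.free(tmp);
}
```

Without the mat4.free call, this loop would run out of heap after 64 iterations, which is the failure mode described above in miniature.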

Feb 24 '18 21:02 maierfelix

Test repository is up here

Mar 30 '18 17:03 maierfelix

@maierfelix If you could change the API conventions to solve your problems, what sort of changes would you make? Would that even solve your problems?

Apr 26 '18 18:04 stefnotch

Right now glmw has to be instantiated asynchronously (due to a Wasm limitation), see here
There is no garbage collection, so spamming *.create in a draw loop will kill the app
To read/write data you have to create views on your data, see here. The only real difference is that you no longer have arrays in the usual sense, but numeric values representing each array's memory location inside the Wasm module. Creating a view then gives you a readable/writable array over your data.

The current API is fine; these limitations are almost entirely Wasm-related. The first two limitations will likely get resolved in the future.
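As a rough illustration of the "views on addresses" point, the sketch below uses a standalone WebAssembly.Memory instead of a real module's exported memory. The viewMat4 helper and the address 256 are made up for the example.

```javascript
// One 64 KiB page of linear memory; a real module would export its own.
const memory = new WebAssembly.Memory({ initial: 1 });

// Hypothetical helper: wrap the 16 floats at a byte address in a view.
function viewMat4(address) {
  return new Float32Array(memory.buffer, address, 16);
}

const address = 256; // pretend the module returned this byte address
const m = viewMat4(address);
m[12] = 5; // write through the view

// A second view on the same address sees the same bytes; no copy is made.
const again = viewMat4(address);
console.log(again[12]); // 5
```

The address is the only thing that has to cross the JS/Wasm boundary; a view is only needed when JS wants to read or write the values.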

Apr 26 '18 19:04 maierfelix

2 years later: it looks like WebAssembly is still not really ready for this. Garbage collection is still a feature proposal. https://github.com/WebAssembly/proposals

(Technically, you can implement a horrible version of GC using weak references and regularly checking whether their targets still exist. If not, free the associated WebAssembly object.)
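A minimal sketch of that polling idea, assuming WeakRef is available. The WeakSweeper class and the freeFn callback are hypothetical names; in a real port freeFn would call the module's own free function.

```javascript
// Hypothetical tracker: maps a Wasm-side address to a WeakRef of its JS wrapper.
class WeakSweeper {
  constructor(freeFn) {
    this.freeFn = freeFn;     // called with the address of a dead wrapper
    this.tracked = new Map(); // address -> WeakRef(wrapper)
  }
  track(wrapper, address) {
    this.tracked.set(address, new WeakRef(wrapper));
  }
  // Call periodically (e.g. once per frame or on a timer).
  sweep() {
    let freed = 0;
    for (const [address, ref] of this.tracked) {
      if (ref.deref() === undefined) {
        this.freeFn(address);
        this.tracked.delete(address); // deleting during Map iteration is safe
        freed++;
      }
    }
    return freed;
  }
}

const freedAddresses = [];
const sweeper = new WeakSweeper((address) => freedAddresses.push(address));
const wrapper = { address: 64 };
sweeper.track(wrapper, 64);
const freedNow = sweeper.sweep(); // wrapper is still reachable, so nothing is freed
```

When a wrapper actually becomes collectable is up to the engine, so memory may stay allocated for an arbitrary time after the last reference is dropped.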

Jun 05 '20 18:06 stefnotch

In general the thing I'd pay attention to the most is the overhead of jumping from JS to WASM and back again. The browsers all do what they can to minimize it, but there's always going to be a bit more work involved than making a JS->JS call or WASM->WASM. What this means in practice is that if you have a whole bunch of little isolated operations then they'll almost always be better off staying in JavaScript even if WASM can technically execute the same operations faster. If, however, you have a large batch of work to do (for example: computing all the world matrices for a scene graph in one big pass) then it may make sense to toss that workload over to a well-optimized WASM block.

Unfortunately that kind of workload balancing is difficult to communicate and enforce in an isolated library like this one.
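To show what "one big pass" might look like, here is a sketch of a batched world-matrix update over packed arrays. It is plain JavaScript standing in for what would be a single exported Wasm function; the data layout and the names mulAt and computeWorldMatrices are assumptions for illustration, not gl-matrix API.

```javascript
// Multiply two column-major 4x4 matrices stored in packed arrays:
// out[o..o+16) = a[ai..ai+16) * b[bi..bi+16). The output must not overlap the inputs.
function mulAt(out, o, a, ai, b, bi) {
  for (let col = 0; col < 4; col++) {
    for (let row = 0; row < 4; row++) {
      let sum = 0;
      for (let k = 0; k < 4; k++) {
        sum += a[ai + k * 4 + row] * b[bi + col * 4 + k];
      }
      out[o + col * 4 + row] = sum;
    }
  }
}

// locals holds nodeCount * 16 floats; parents[i] is the parent index, or -1 for a root.
// Nodes must be ordered so that every parent precedes its children.
function computeWorldMatrices(locals, parents, world) {
  for (let i = 0; i < parents.length; i++) {
    const o = i * 16;
    if (parents[i] < 0) {
      world.set(locals.subarray(o, o + 16), o);
    } else {
      mulAt(world, o, world, parents[i] * 16, locals, o);
    }
  }
  return world;
}

const identity = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1];
const locals = new Float32Array([...identity, ...identity]);
locals[12] = 1;      // root: translate x by 1
locals[16 + 13] = 2; // child: translate y by 2
const parents = new Int32Array([-1, 0]);
const world = computeWorldMatrices(locals, parents, new Float32Array(32));
console.log(world[16 + 12], world[16 + 13]); // 1 2
```

With this layout the boundary is crossed once per frame instead of once per node, and the matrices can stay in Wasm memory the whole time.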

Jun 05 '20 18:06 toji

@toji Judging from this post https://hacks.mozilla.org/2018/10/calls-between-javascript-and-webassembly-are-finally-fast-%F0%9F%8E%89/ , I was under the impression that the JavaScript-to-Wasm call overhead shouldn't be too much of a concern.

Jun 05 '20 18:06 stefnotch

It would be interesting to benchmark the upcoming SIMD instructions.

Jun 05 '20 19:06 maierfelix

@stefnotch: It's entirely possible that I have old data in mind, and I know that it's been an area of ongoing work. I do still think it's worth paying attention to if any porting efforts are made, though, because the scope of each function tends to be so small and low overhead != no overhead. I could easily see a situation where vec3.add() ends up significantly slower if you bounce to WASM simply because there's not much work being done in the actual function. By the same logic mat4.multiply() could break even and mat4.invert() may be faster. It depends on how much of a speed boost WASM gives (SIMD would definitely help there) and how large the actual call overhead is (remember that you'll need to copy or store the vector in a WASM-friendly array). You should also consider additional load time overhead, since starting up a WASM module isn't free. It'll probably differ between browsers too.

None of this is to say it's not worth trying. I've just seen a lot of devs look at WASM as a magic performance bullet, and it's a lot more nuanced than that.
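For anyone who wants to measure the call overhead themselves, here is a rough sketch. It uses a hand-assembled module that exports a single add function. The timeCalls helper is hypothetical, and a micro-benchmark like this says little about real workloads. Synchronous compilation is only appropriate for a tiny module like this one; larger modules should go through the asynchronous WebAssembly.instantiate.

```javascript
// Hand-assembled Wasm module exporting add(a: i32, b: i32) -> i32.
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,                   // magic + version
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,             // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                                           // one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,             // export "add"
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // body: local.get 0/1, i32.add
]);
const instance = new WebAssembly.Instance(new WebAssembly.Module(bytes));
const wasmAdd = instance.exports.add;
const jsAdd = (a, b) => (a + b) | 0; // same 32-bit wrapping semantics

// Hypothetical timing helper: total milliseconds for `iterations` chained calls.
function timeCalls(fn, iterations) {
  let acc = 0;
  const start = performance.now();
  for (let i = 0; i < iterations; i++) acc = fn(acc, i);
  return { ms: performance.now() - start, acc };
}

const js = timeCalls(jsAdd, 1e6);
const wasm = timeCalls(wasmAdd, 1e6);
console.log(`JS: ${js.ms.toFixed(2)} ms, Wasm: ${wasm.ms.toFixed(2)} ms`);
```

The function body is a single instruction, so whatever difference shows up is mostly boundary cost, and it will vary between engines and versions.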

Jun 05 '20 19:06 toji

WeakRefs and finalizers could possibly be used in place of WebAssembly garbage collection. While the spec doesn't say that a finalizer has to be called, the V8 team says:

...it's safe to assume that engines will garbage collect, and finalization callbacks will be called at some later time, unless the environment is discarded... -- https://v8.dev/features/weak-references

The only thing I'd worry about here is the potential performance impact of such a solution.
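A minimal sketch of the finalizer approach, assuming FinalizationRegistry is available. Mat4Handle and wasmFree are hypothetical names, and wasmFree only records the address here instead of calling into a real module.

```javascript
// Stand-in for the module's free function: just records what was released.
const freed = [];
const wasmFree = (address) => freed.push(address);

// The callback receives the held value (the address) once a handle has been collected.
const registry = new FinalizationRegistry((address) => wasmFree(address));

class Mat4Handle {
  constructor(address) {
    this.address = address;
    // Held value is the address; `this` doubles as the unregister token.
    registry.register(this, address, this);
  }
  // Explicit, deterministic release for hot paths.
  free() {
    if (this.address === null) return;
    registry.unregister(this); // avoid a double free from the finalizer
    wasmFree(this.address);
    this.address = null;
  }
}

const h = new Mat4Handle(128);
h.free();
h.free(); // no-op: already released
```

Keeping an explicit free() next to the finalizer means code in a draw loop can still release memory immediately, with the registry acting only as a safety net.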

Sep 06 '20 18:09 stefnotch

Quick update here: WebAssembly SIMD has shipped: https://webassembly.org/roadmap/ This might be important enough to warrant some serious API redesigns.

Aug 03 '21 07:08 stefnotch

Another issue I've found in the past is that Wasm memory can't actually shrink. Once you've allocated a large chunk of memory, you can free it to make it reusable, but the Wasm memory itself doesn't get smaller: it can only grow.
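The grow-only behaviour can be seen with a standalone WebAssembly.Memory, as in the sketch below (the page counts are arbitrary). A related gotcha for any JS bridge is that growing a non-shared memory detaches the old ArrayBuffer, so views created before the grow have to be recreated.

```javascript
// Sizes are in 64 KiB pages.
const memory = new WebAssembly.Memory({ initial: 1, maximum: 4 });
const before = new Float32Array(memory.buffer);
before[0] = 42;

const previousPages = memory.grow(1); // returns the previous size in pages

// Growing detaches the old ArrayBuffer, so earlier views become empty.
console.log(before.length); // 0

const after = new Float32Array(memory.buffer);
console.log(after[0]); // 42 (contents are preserved)
console.log(memory.buffer.byteLength); // 131072

// There is no counterpart to grow(): the Memory object has no shrink method.
console.log(typeof memory.shrink); // "undefined"
```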

Aug 04 '21 09:08 maierfelix

The addition of SIMD to WASM is interesting, but I'm still wary of performance pitfalls when communicating between JS/WASM. There was a good blog post from the Babylon.js team recently talking about why they won't be porting to WASM wholesale, and it specifically calls out that "WASM should not be called for small chunk of work", linking to this very relevant thread as part of their reasoning. And the numbers quoted there reflect my personal experience as well.

I like WASM! In its current state it's really good for cases where you are dispatching a large amount of work in a single call. (Decompress this file/transcode this texture/advance the world physics state one step). Unfortunately glMatrix is built around doing relatively small operations, so it's going to be extremely hard to break even unless the JS<->WASM overhead gets a lot lower.

Aug 09 '21 16:08 toji

I agree. A few years ago there was this post about how Mozilla managed to reduce the overhead of trampoline calls between JavaScript and Wasm. I don't know if this actually ever got shipped or how other browsers tackle this issue.

Anyway, I came to the same conclusion: Wasm is much better suited to isolated, performance-intensive parts of a codebase. Recently I started porting parts of my codebase over to AssemblyScript, and gl-matrix would be quite handy to have there. Also, AssemblyScript has supported SIMD for a few years already, so it might be worth a shot?

Aug 11 '21 09:08 maierfelix

I agree. A few years ago there was this post about how Mozilla managed to reduce the overhead of trampoline calls between JavaScript and Wasm. I don't know if this actually ever got shipped or how other browsers tackle this issue.

I've been wondering about that myself; does anyone have any updates?

Jan 10 '23 05:01 arcman7
