Type-safe Embind port, compatible with the current API
This PR moves the JS wrapper to C++ by utilizing Embind. It aims at automating the process of binding as much as possible. Ideally, I want to get rid of any wrappers and automatically generate compatible bindings by wrapping the necessary methods in lambdas. If this succeeds, it may be possible in future to adapt API to make it more natural (including breaking changes).
Fixes https://github.com/WebAssembly/binaryen/issues/7560#issuecomment-2839523686
@kripken For some reason I get Maximum call stack size exceeded when importing the module after it has been built. Do you know why is that happening? Setting --stack-size to 2000 solves the problem.
For some reason I get Maximum call stack size exceeded when importing the module
Hmm, strange. What is the full stack trace when you get that error? That could help understand if this is during the VM's compilation of the code, or in the JS itself as it executes during startup.
cc @brendandahl - are there known issues with embind there?
@kripken It seems like Node.JS fails to load the script:
node:internal/modules/cjs/loader:1427
const result = compileFunctionForCJSLoader(content, filename, false /* is_sea_main */, shouldDetectModule);
^
RangeError: Maximum call stack size exceeded
at wrapSafe (node:internal/modules/cjs/loader:1427:18)
at Module._compile (node:internal/modules/cjs/loader:1449:20)
at Module._extensions..js (node:internal/modules/cjs/loader:1588:10)
at Module.load (node:internal/modules/cjs/loader:1282:32)
at Module._load (node:internal/modules/cjs/loader:1098:12)
at TracingChannel.traceSync (node:diagnostics_channel:315:14)
at wrapModuleLoad (node:internal/modules/cjs/loader:215:24)
at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:158:5)
at node:internal/main/run_main_module:30:49
The call stack is different when type is set to module:
node:internal/modules/esm/utils:337
const wrap = new ModuleWrap(url, undefined, source, 0, 0, hostDefinedOption);
^
RangeError: Maximum call stack size exceeded
at compileSourceTextModule (node:internal/modules/esm/utils:337:16)
at ModuleLoader.moduleStrategy (node:internal/modules/esm/translators:164:18)
at callTranslator (node:internal/modules/esm/loader:439:14)
at ModuleLoader.moduleProvider (node:internal/modules/esm/loader:445:30)
at async ModuleJob._link (node:internal/modules/esm/module_job:106:19)
It seems like the error is coming not from the count of stack frames, but instead from the size of stack. Increasing --stack-size also fixes the problem. The resulting module is about 16 Mb. Is the problem coming from the size of the bundle?
Looks like Node is failing to load and compile the code, yeah. How big is the code? Maybe it's just the size or the amount of nesting.
If this is a debug build, perhaps try an optimized one? Linking with -O1 might help, or even -O2 --profiling (optimized, but profiling prevents full JS minification, so the JS should be somewhat readable).
I've built the project in debug mode, and got 3 million lines. Is this ok? Is the size of bundle coming from embind?
Binaryen is a pretty big project with lots of templated c++, so it seems reasonable the debug output is going to be large.
It's just strange to me that the crashes appeared when using Embind.
I've found the root of the problem. I removed the expression wrappers and now it loads just fine. But what do we do now? Expression wrappers are crucial.
The difference in size between the bundle with expression wrappers and without them is ~300Kb. So I don't think that the problem is in size.
After several hours of painful debugging, I've found the issue. It turns out that Emscripten doesn't like string operations (capitalize) at top-level. So I need to find a workaround for that.
It's now fixed!
Interesting that fix helps, I have no idea why... but maybe it was just close to some limit?
Btw, was this with binaryen.js or binaryen.wasm?
As much as I would like to maintain backward compatibility, embind does not provide such opportunities to leave the external interface unchanged, just as it was with the JS wrapper. And if the full compatibility is not possible, is the partial compatibility even needed?
And if the full compatibility is not possible, is the partial compatibility even needed?
I think that if we can get embind working nicely here, then the long-term benefits for both users and for maintainers would justify a breaking change. So I would say we don't need partial compatibility, necessarily, though where possible it would be nice to remain consistent and familiar.
Btw, was this with binaryen.js or binaryen.wasm?
The issue was with the JS version. I've changed the code a little and now it's back, even with the new capitalization function.
Some more magic hackery that I don't even understand the difference in, and it works again! This probably needs to be reworked by someone who knows where the problem comes from later.
How to turn off CI? I get my email spammed on each commit 😄
I think that declaring instruction factories can be simplified. Each instruction-building function is declared three times: in the header, in the CPP file and in the bindings. Can this be automated a little?
About CI, try pushing commits with [ci skip] in the title.
I think that declaring instruction factories can be simplified.
Which factories do you mean specifically? (wasm-building, the C API, or something else?)
I mean the wrappers/bindings (binaryen::Module::block, binaryen::Module::if_, etc.)
I mean the wrappers/bindings (binaryen::Module::block, binaryen::Module::if_, etc.)
I see, good question. I'd hope there's a way to automate that with embind but I'm not sure. @brendandahl , any ideas?
@GulgDev is there anything you're stuck on in this PR?
@brendandahl Yes, I don't want to continue on this for now because the method I chose to bind instruction-creating functions seems too verbose. I've tried some other things, but they didn't help the situation.