wasm-bindgen
wasm-bindgen copied to clipboard
Dependent module support
I noticed dependent module support are two separate things: Import handling, and linked modules. Their implementations are separated into different commits in this branch.
Import handling and linked modules are both very useful. The combination of these features can be used to implement nearly all possible approaches of WASM integration into workers and worklets (pure JS worker, shared WASM module with message passing without threads, shared WASM memory to run threads). For pure JS workers, linked modules would be sufficient. This feature is needed by widely-used crates like wasm-bindgen-futures. Like wasm-bindgen-futures, many crates currently do percent-encode their scripts into data URLs or create blobs, but this is a workaround for poor support from wasm_bindgen. It breaks other workarounds like bundling resulting files manually after the build. Also, it breaks origin-based CSP. If data URLs or blob URLs make sense, those could be created by wasm_bindgen::link_to! too, consistently and controllable (by patching or configuring wasm_bindgen), instead of every crate trying to do this on their own.
This is a proof-of-concept. It is already usable, but there are a lot of remaining issues, like unclean argument parsing and that it is only tested on --target web.
What do you think about this, @alexcrichton, @Liamolucko?
I would also like to get either linked modules or import handling merged, if one of those is easier.
Fixes #3019 Fixes #2335 (Just create a script element or dynamically import your linked module before using JS classes from Rust)
Sorry but this is in the same boat for me as https://github.com/rustwasm/wasm-bindgen/pull/3069 where I don't have the time/energy to review major changes like this.
@Liamolucko, in https://github.com/rustwasm/wasm-bindgen/pull/3168#issuecomment-1467550188 you say that you want to support the feature provided by this PR somehow. This wasn't clear to me, so that was encouraging.
This draft has some problems: It adds a big dependency and it cannot compile_error! on JS parser errors, because that could break existing code. Silently failing could break new projects, so that would be bad too. Also, parsing ES modules recursively would prevent #2787, because there is no const fn JS parser and const fn cannot read additional files. Additionally, parsing JS code in wasm_bindgen can feel too magical.
An option to solve these issues is JS parsing in another crate, that uses wasm-bindgen-macro-support as an interface to wasm_bindgen. wasm_bindgen then needs to expose a simple interface that could also be used directly. For example, wasm_bindgen could allow InlineJsPart arrays as inline_js parameters:
enum InlineJsPart<'a> {
Raw(&'a str),
EsModuleShim,
}
// example:
wasm_bindgen::link_to!(inline_js = [
InlineJsPart::Raw("import * as bindgen from '"), InlineJsPart::EsModuleShim, InlineJsPart::Raw("';
registerProcessor('WasmProcessor', class WasmProcessor extends AudioWorkletProcessor {
constructor(options) {
super();
let [module, memory, handle] = options.processorOptions;
bindgen.initSync(module, memory);
this.processor = bindgen.WasmAudioProcessor.unpack(handle);
}
process(inputs, outputs) {
return this.processor.process(outputs[0][0]);
}
});"),
])
Additionally, setting both module and inline_js attributes should place the code of inline_js in the file with the name of module. This way, a JS parsing crate could pass an InlineJsPart array to inline_js and still preserve the original file name.
This option needs minimal work inside wasm_bindgen, but has greater overall complexity than the original draft. If we stick to the original approach, we should make the JS parsing a Cargo feature and enable it explicitly by specifying another attribute (parsed_module) attribute instead of module, so wasm_bindgen can error on invalid code.
No matter which approach we choose, wasm_bindgen should support multiple shims (https://github.com/rustwasm/wasm-bindgen/pull/3247#issuecomment-1468303476), so the linked module can choose the expected shim format by InlineJsPart or the specific import (for example, import * from '$wbg_es_module_shim').
@Liamolucko, what do you think?
An option to solve these issues is JS parsing in another crate, that uses
wasm-bindgen-macro-supportas an interface to wasm_bindgen. wasm_bindgen then needs to expose a simple interface that could also be used directly. For example, wasm_bindgen could allow InlineJsPart arrays as inline_js parameters:enum InlineJsPart<'a> { Raw(&'a str), EsModuleShim, } // example: wasm_bindgen::link_to!(inline_js = [ InlineJsPart::Raw("import * as bindgen from '"), InlineJsPart::EsModuleShim, InlineJsPart::Raw("'; registerProcessor('WasmProcessor', class WasmProcessor extends AudioWorkletProcessor { constructor(options) { super(); let [module, memory, handle] = options.processorOptions; bindgen.initSync(module, memory); this.processor = bindgen.WasmAudioProcessor.unpack(handle); } process(inputs, outputs) { return this.processor.process(outputs[0][0]); } });"), ])Additionally, setting both
moduleandinline_jsattributes should place the code ofinline_jsin the file with the name ofmodule. This way, a JS parsing crate could pass an InlineJsPart array to inline_js and still preserve the original file name.This option needs minimal work inside wasm_bindgen, but has greater overall complexity than the original draft. If we stick to the original approach, we should make the JS parsing a Cargo feature and enable it explicitly by specifying another attribute (
parsed_module) attribute instead ofmodule, so wasm_bindgen can error on invalid code.
That's an interesting idea, but I think most of the problems you mentioned can be solved without farming things out to other crates.
The idea I had for the syntax of picking a shim was something along the lines of link_to!(..., shim = "esm"). If we go with this syntax, it means that whether or not the linked module needs access to the shim URL is known up front, and if it doesn't, there's no need to parse the JavaScript. This dodges the potential breakage of JS parser errors triggering compile errors in code that used to work, since that code won't specify shim = "..." and so the JS won't get parsed.
As for the large swc dependency and lack of const-generated inline_js support, I think those can be solved by moving JS parsing to the CLI. This means that users won't have to build swc, as it'll already be baked in. That could potentially increase the CLI's binary size by quite a bit, though.
It doesn't solve the const issue, but it reduces it to the same difficulty as for regular snippets, since the macro doesn't need to parse the JS anymore and so doesn't need to know what it contains.
Side-note about potentially reducing the size of the JS parsing dependency: we don't really need a full JS parser to simply replace "$wbg_main" strings. You only really need a lexer for that, simplifying things a bunch. But that also leads to a potential round 2 of issues with JS parsing errors if we ever need a JS parser for other reasons in the future, so that's probably not a great idea.
Also, parsing ES modules recursively...
Huh, I only just noticed now that this PR makes all of the dependencies of linked modules (and regular snippets) get included as snippets too.
The fact that this applies to normal snippets too is a potential breaking change. Previously, writing import "/foo.js" in a snippet would result in that import getting resolved at runtime to https://<hostname>/foo.js, whereas after this PR it'll get resolved by wasm_bindgen to <crate dir>/foo.js.
Do you think there's much need for this? I always imagined linked modules being pretty small wrappers that just served to get the wasm module up and running, which wouldn't have any need for dependencies (other than the shim).
...so the linked module can choose the expected shim format by
InlineJsPartor the specific import (for example,import * from '$wbg_es_module_shim').
Hm, that's an interesting way of doing it. As I mentioned above, I was imagining something more like link_to!(..., shim = "esm"). I feel like that works better, since as I mentioned it provides a solution to the parsing problem. Your approach does allow having the same linked module reference multiple shims, but I'm not sure why you'd need that.