Raising the Web implementation limit on number of imports
The current implementation limit is 100k imports, but using wasm-split to split a debug build of a large real-world application produces a module with over 134k imports, one for each split out secondary function. It would be possible to reduce this to just a handful of imports using some extra indirection, but it would be nice to not need to. Could we raise the limit to at least 200k?
This would be fine with me.
Fine by me as well, FWIW.
(Wasmtime follows the Web limits, despite not being a Web engine, to maximize uniformity and compatibility across the ecosystem.)
@jakobkummerow WDYT?
No objections.
cc @rossberg and @dschuff. Assuming a CG vote is the next procedural step for this, I've added a vote to the July 16 CG meeting. Is there anything else you think we need to do first? Are we going to vote this directly to phase 4 or not use the phase process? Do we need implementations and tests before voting?
I'm okay with voting this in directly, though it would be nice to hear more about the application in question. Perhaps it's worth having a discussion about the limits and the criteria for changing them — an in promptu modification of the standard because of a single application could look like a powerful privilege that we would not grant everybody.
The application is photoshop, so raising this limit is required if they are to take our recommendation from the Pittsburgh meeting of doubling down on trying to do module splitting and lazy loading in user space.
No objections to raising the limit, but I'm a little surprised that the first limit being hit here is on imports. I thought that wasm-split causes deferred-loaded functions to be indirectly called. Is the point that each deferred function needs a different imported JS trampoline? Is there some alternative scheme with typed funcrefs that would avoid the need for this? If this would be the indirect scheme alluded to in the OP, are we sure that 100k imports on start-up plus Wasm->JS->Wasm indirection is more efficient than a Wasm->Wasm indirect call?
Also, is an analogous limit in danger of being hit on exports?
I didn't mean for the scheduling PR to close this issue :)
Is the point that each deferred function needs a different imported JS trampoline? Is there some alternative scheme with typed funcrefs that would avoid the need for this?
Yes, each secondary (i.e. deferred) function is currently replaced in the primary module with an imported placeholder function. All primary-to-secondary calls already go through the indirect call table, which is initialized with the imported placeholders in place of the secondary functions. The placeholder functions are provided by a proxy that loads and instantiates the secondary module, which as a side effect replaces all the placeholders in the table with the real secondary functions. All subsequent primary-to-secondary calls are therefore Wasm-to-Wasm indirect calls with no JS trampoline.
The placeholder proxy uses the import name of the called placeholder function to determine the table index of the corresponding secondary function, which is why each placeholder needs to be a separate import. The solution I had in mind to avoid this is to instead import a single placeholder function that takes the table index of the secondary function as an additional argument. The placeholder would no longer have to be a proxy, but all primary-to-secondary calls after the secondary module has been loaded would either have to be two indirect calls instead of one (the first to use the additional argument to dispatch to the second) or would have to trampoline through JS.
I might be misunderstanding the setup. I'd expect that each direct call - e.g. to function $foo - is transformed by picking a static index in the indirect call table where $foo will live instead - let's say 3 - and turning all call $foo into call_indirect (i32.const 3). The initial value at that slot in the table is a JS function which says "instantiate the module for $foo, insert it into slot 3 for future indirect calls, and then call $foo".
You need to create a separate JS function for each Wasm function you transform in this way as the "initial" table slot value, but I'd expect this could be done purely on the JS side (e.g. by having a generic function that takes a Wasm function name as its first argument, and bind-ing that parameter for each choice you need). I don't see where you need the extra information from separate static imports. Is the point that a static index in the table can't be determined ahead-of-time, so there needs to be additional dispatching/indirection? Or am I overlooking another complication of the Wasm->Wasm translation step?
Right, each placeholder is logically a separate function. In practice we use a Proxy, but bind-ing an index parameter to create separate functions would work, too.
If the JS "manually" inserted the placeholders into the table, that would be the end of the story, but we instead use active segments to set up the initial table to hold the placeholders, which requires that they are imported. (Incidentally, we also use active segments in the secondary module to automatically patch the table on instantiation, so the JS code never needs to touch the table at all.)
Ah, ok - it makes sense to need separate imports if active segments are being used. Has any testing been done about whether performing the initial table setup in JS would be competitive with this? IIUC you would save the 100k imports!
No, we haven't tested this. At minimum, we would need some scheme for telling the JS which table indices need placeholders. It would be a shame to give up the property that all the JS has to do is instantiate the modules and the rest is taken care of automatically.
At the CG meeting today we had unanimous consensus to raise the Web implementation limits on the number of imports and exports to one million.
Hi @tlively ... In the meeting there was mention of likely performance implications, though it didn't sound like anything anyone was too alarmed about. The exact nature of the cost I wasn't sure (entry creation?). Would this be a fix cost that every application pays or was that talk of a cost that is would only effects applications that need such a large number of imports (exports).
No, the performance concerns were about the normal, linear work to read and process the imports and exports. Modules that have more imports and exports than were previously possible will take more time to process them than was previously necessary, but there's nothing unexpected or superlinear going on. Nothing changes for existing modules.
As commented, the CG decided to raise the Web implementation limits on the number of imports and exports to one million. https://github.com/WebAssembly/spec/pull/1766 is tracking updating the spec.
See also https://github.com/WebAssembly/design/issues/1225 for discussion of optimizing the import section encoding, and the proposal repo.