Raising the Web implementation limit on number of imports

Open tlively opened this issue 1 year ago • 16 comments

The current implementation limit is 100k imports, but using wasm-split to split a debug build of a large real-world application produces a module with over 134k imports, one for each split out secondary function. It would be possible to reduce this to just a handful of imports using some extra indirection, but it would be nice to not need to. Could we raise the limit to at least 200k?

tlively avatar Jun 27 '24 01:06 tlively

This would be fine with me.

eqrion avatar Jun 28 '24 16:06 eqrion

Fine by me as well, FWIW.

(Wasmtime follows the Web limits, despite not being a Web engine, to maximize uniformity and compatibility across the ecosystem.)

fitzgen avatar Jun 28 '24 16:06 fitzgen

@jakobkummerow WDYT?

tlively avatar Jun 28 '24 22:06 tlively

No objections.

jakobkummerow avatar Jul 01 '24 08:07 jakobkummerow

cc @rossberg and @dschuff. Assuming a CG vote is the next procedural step for this, I've added a vote to the July 16 CG meeting. Is there anything else you think we need to do first? Are we going to vote this directly to phase 4 or not use the phase process? Do we need implementations and tests before voting?

tlively avatar Jul 01 '24 14:07 tlively

I'm okay with voting this in directly, though it would be nice to hear more about the application in question. Perhaps it's worth having a discussion about the limits and the criteria for changing them: an impromptu modification of the standard because of a single application could look like a powerful privilege that we would not grant everybody.

rossberg avatar Jul 01 '24 14:07 rossberg

The application is Photoshop, so raising this limit is required if they are to take our recommendation from the Pittsburgh meeting of doubling down on trying to do module splitting and lazy loading in user space.

tlively avatar Jul 01 '24 15:07 tlively

No objections to raising the limit, but I'm a little surprised that the first limit being hit here is on imports. I thought that wasm-split causes deferred-loaded functions to be indirectly called. Is the point that each deferred function needs a different imported JS trampoline? Is there some alternative scheme with typed funcrefs that would avoid the need for this? If this would be the indirect scheme alluded to in the OP, are we sure that 100k imports on start-up plus Wasm->JS->Wasm indirection is more efficient than a Wasm->Wasm indirect call?

Also, is an analogous limit in danger of being hit on exports?

conrad-watt avatar Jul 02 '24 15:07 conrad-watt

I didn't mean for the scheduling PR to close this issue :)

> Is the point that each deferred function needs a different imported JS trampoline? Is there some alternative scheme with typed funcrefs that would avoid the need for this?

Yes, each secondary (i.e. deferred) function is currently replaced in the primary module with an imported placeholder function. All primary-to-secondary calls already go through the indirect call table, which is initialized with the imported placeholders in place of the secondary functions. The placeholder functions are provided by a proxy that loads and instantiates the secondary module, which as a side effect replaces all the placeholders in the table with the real secondary functions. All subsequent primary-to-secondary calls are therefore Wasm-to-Wasm indirect calls with no JS trampoline.

The placeholder proxy uses the import name of the called placeholder function to determine the table index of the corresponding secondary function, which is why each placeholder needs to be a separate import. The solution I had in mind to avoid this is to instead import a single placeholder function that takes the table index of the secondary function as an additional argument. The placeholder would no longer have to be a proxy, but all primary-to-secondary calls after the secondary module has been loaded would either have to be two indirect calls instead of one (the first to use the additional argument to dispatch to the second) or would have to trampoline through JS.
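For readers who want to see the shape of the current scheme, here is a minimal TypeScript sketch of a name-keyed placeholder proxy. It is only an illustration under stated assumptions, not the actual wasm-split glue code: the table is modelled as a plain array rather than a WebAssembly.Table, and `instantiateSecondary`, `placeholders`, and the use of the import name as a decimal table index are all hypothetical.

```typescript
type Fn = (...args: number[]) => number;

// Stand-in for the shared indirect call table; a real setup would use a WebAssembly.Table.
const table: Fn[] = [];

// Hypothetical secondary module. Instantiating it overwrites the placeholder slots,
// mimicking the effect of the secondary module's segments patching the table.
let secondaryLoads = 0;
function instantiateSecondary(): void {
  secondaryLoads += 1;
  table[0] = (a, b) => a + b;
  table[1] = (a) => a * 2;
}

// Every property lookup on the proxy yields a placeholder function.
// Assumption: the import name is simply the table index written as a string.
const placeholders = new Proxy<Record<string, Fn>>({}, {
  get(_target, name): Fn {
    const index = Number(String(name));
    return (...args: number[]): number => {
      if (secondaryLoads === 0) instantiateSecondary();
      const real = table[index];
      if (real === undefined) throw new Error(`no function at table index ${index}`);
      return real(...args);
    };
  },
});

// Primary module setup: the table initially holds the imported placeholders.
table[0] = placeholders["0"];
table[1] = placeholders["1"];

const first = table[0](2, 3); // placeholder call: loads the secondary module, then forwards
const second = table[1](4);   // the slot now holds the real function, so no placeholder runs
```

Because each placeholder learns its table index only from the name it was looked up under, every deferred function needs its own import in this sketch, which is what drives the import count.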

tlively avatar Jul 02 '24 20:07 tlively

I might be misunderstanding the setup. I'd expect that each direct call - e.g. to function $foo - is transformed by picking a static index in the indirect call table where $foo will live instead - let's say 3 - and turning all call $foo into call_indirect (i32.const 3). The initial value at that slot in the table is a JS function which says "instantiate the module for $foo, insert it into slot 3 for future indirect calls, and then call $foo".

You need to create a separate JS function for each Wasm function you transform in this way as the "initial" table slot value, but I'd expect this could be done purely on the JS side (e.g. by having a generic function that takes a Wasm function name as its first argument, and bind-ing that parameter for each choice you need). I don't see where you need the extra information from separate static imports. Is the point that a static index in the table can't be determined ahead-of-time, so there needs to be additional dispatching/indirection? Or am I overlooking another complication of the Wasm->Wasm translation step?
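A rough TypeScript sketch of the JS-side alternative described here, again with the table modelled as a plain array. The names `callDeferred`, `loadSecondary`, and `deferredSlots`, and the choice of slot 3, are assumptions made for illustration only.

```typescript
type SlotFn = (...args: number[]) => number;

// Stand-in for the indirect call table.
const slots: SlotFn[] = [];

// Hypothetical loader: instantiating the secondary module puts the real $foo in slot 3.
let loads = 0;
function loadSecondary(): void {
  loads += 1;
  slots[3] = (x) => x + 1;
}

// One generic placeholder; the slot index is its first parameter and is bound per slot.
function callDeferred(slot: number, ...args: number[]): number {
  loadSecondary();
  const real = slots[slot];
  if (real === undefined) throw new Error(`slot ${slot} was not filled by the secondary module`);
  return real(...args);
}

// JS fills the table itself, so no per-function imports are needed.
// Assumption: JS is told out of band which slots hold deferred functions.
const deferredSlots = [3];
for (const slot of deferredSlots) {
  slots[slot] = callDeferred.bind(null, slot);
}

const firstCall = slots[3](41); // runs the bound placeholder, which loads and forwards
const laterCall = slots[3](1);  // runs the real function directly
```

The trade-off in this sketch is that JS has to know the list of deferred slots and write to the table itself, rather than only instantiating modules.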

conrad-watt avatar Jul 08 '24 12:07 conrad-watt

Right, each placeholder is logically a separate function. In practice we use a Proxy, but bind-ing an index parameter to create separate functions would work, too.

If the JS "manually" inserted the placeholders into the table, that would be the end of the story, but we instead use active segments to set up the initial table to hold the placeholders, which requires that they are imported. (Incidentally, we also use active segments in the secondary module to automatically patch the table on instantiation, so the JS code never needs to touch the table at all.)

tlively avatar Jul 08 '24 15:07 tlively

Ah, ok - it makes sense to need separate imports if active segments are being used. Has any testing been done about whether performing the initial table setup in JS would be competitive with this? IIUC you would save the 100k imports!

conrad-watt avatar Jul 08 '24 21:07 conrad-watt

No, we haven't tested this. At minimum, we would need some scheme for telling the JS which table indices need placeholders. It would be a shame to give up the property that all the JS has to do is instantiate the modules and the rest is taken care of automatically.

tlively avatar Jul 08 '24 22:07 tlively

At the CG meeting today we had unanimous consensus to raise the Web implementation limits on the number of imports and exports to one million.

tlively avatar Jul 16 '24 17:07 tlively

Hi @tlively ... In the meeting there was mention of likely performance implications, though it didn't sound like anything anyone was too alarmed about. I wasn't sure of the exact nature of the cost (entry creation?). Would this be a fixed cost that every application pays, or a cost that would only affect applications that need such a large number of imports (or exports)?

jlb6740 avatar Jul 16 '24 17:07 jlb6740

No, the performance concerns were about the normal, linear work to read and process the imports and exports. Modules that have more imports and exports than were previously possible will take more time to process them than was previously necessary, but there's nothing unexpected or superlinear going on. Nothing changes for existing modules.

tlively avatar Jul 16 '24 17:07 tlively

As commented, the CG decided to raise the Web implementation limits on the number of imports and exports to one million. https://github.com/WebAssembly/spec/pull/1766 is tracking updating the spec.

See also https://github.com/WebAssembly/design/issues/1225 for discussion of optimizing the import section encoding, and the proposal repo.

sunfishcode avatar Dec 09 '24 16:12 sunfishcode