component-model

Encoding result-arity for start functions?

Open peterhuene opened this issue 3 years ago • 4 comments

I had mentioned this informally while talking to Luke previously, but it would be nice for the binary encoding to have the expected result-arity encoded directly in the start function now that we're returning multiple results.

The reason for this is that tools that don't validate or otherwise track the complex type information needed to validate components can't easily determine how many values are added to the value index space as a consequence of encountering a start function.

The only way such a tool can do so is to track all of the component's type information, which is nontrivial thanks to aliasing (e.g. instance exports). Some tools also do not want to validate the component at all, so that they can work on malformed components as best they can, e.g. a binary-to-text tool that prints malformed binary encodings to make a bug easier to spot.

I propose the following change to the start rule in the binary encoding:

start ::= f:<funcidx> arg*:vec(<valueidx>) r:<u32> => (start f (value arg)*)
                                           ^------

With validation requirement:

`r` must equal the number of results of the type of function `f`
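
As a rough illustration of the proposal, here is a minimal Rust sketch of how a tool might decode the extended rule and apply the validation requirement. It assumes that `<funcidx>`, `<valueidx>` and the `vec` length are all LEB128-encoded `u32` values; the `Start` struct and the `read_u32`, `read_start` and `check_result_arity` functions are hypothetical names made up for this example, not part of any existing tool.

```rust
/// Hypothetical in-memory form of the proposed `start` encoding.
#[derive(Debug, PartialEq)]
struct Start {
    func_index: u32,
    args: Vec<u32>,
    /// The proposed `r` field: how many values the start function adds.
    results: u32,
}

/// Reads one unsigned LEB128 `u32` from `bytes`, advancing `pos`.
/// Returns `None` on truncated or over-long input.
fn read_u32(bytes: &[u8], pos: &mut usize) -> Option<u32> {
    let mut value: u32 = 0;
    for shift in (0..32).step_by(7) {
        let byte = *bytes.get(*pos)?;
        *pos += 1;
        // The fifth byte may only contribute the top 4 bits of a u32.
        if shift == 28 && byte > 0x0f {
            return None;
        }
        value |= u32::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Some(value);
        }
    }
    None
}

/// Decodes `f:<funcidx> arg*:vec(<valueidx>) r:<u32>`.
fn read_start(bytes: &[u8], pos: &mut usize) -> Option<Start> {
    let func_index = read_u32(bytes, pos)?;
    let arg_count = read_u32(bytes, pos)?;
    let mut args = Vec::new();
    for _ in 0..arg_count {
        args.push(read_u32(bytes, pos)?);
    }
    let results = read_u32(bytes, pos)?;
    Some(Start { func_index, args, results })
}

/// The validation requirement: `r` must equal the number of results of
/// the type of function `f`. A validator that tracks types would pass
/// that number in as `declared_results`.
fn check_result_arity(start: &Start, declared_results: u32) -> Result<(), String> {
    if start.results == declared_results {
        Ok(())
    } else {
        Err(format!(
            "start function {} declares {} results but the encoding says {}",
            start.func_index, declared_results, start.results
        ))
    }
}
```

Only `check_result_arity` needs type information. A tool that skips validation can stop after `read_start` and still know that the value index space grows by `results`.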

Thoughts on this?

peterhuene, Aug 09 '22 21:08

For a concrete example of where this would be desired, wasmprinter is a tool that prints the textual format of a binary component.

It doesn't keep any type information around while printing the component other than index counts (namely for looking up a name in a name section based on index while printing).

With the current information encoded for a start function, we can easily print:

 (start $f (value $v1) (value $v2) ...)

But we can't print the results, even though we know how many values are currently in the value index space, because we don't know how many new values the start function adds. Tracking the types would add a lot of complexity to such a simple tool.
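
To make this concrete, here is a minimal Rust sketch of a printer that keeps only an index count and relies on an explicit result count. It is not wasmprinter's actual code: the `Printer` struct and `print_start` method are hypothetical, indices are printed numerically rather than by name, and the trailing `;;` comment is just one made-up way to show which value indices the start function defines.

```rust
/// Hypothetical printer state: index counts only, no type information.
struct Printer {
    /// Number of values currently in the value index space.
    value_count: u32,
}

impl Printer {
    /// Prints a start function and advances the value index space by the
    /// explicitly encoded result count.
    fn print_start(&mut self, func_index: u32, args: &[u32], results: u32) -> String {
        let mut out = format!("(start {}", func_index);
        for arg in args {
            out.push_str(&format!(" (value {})", arg));
        }
        out.push(')');
        if results > 0 {
            let first = self.value_count;
            // Saturate rather than panic so malformed input can still be printed.
            self.value_count = self.value_count.saturating_add(results);
            out.push_str(&format!(
                " ;; defines values {}..{}",
                first,
                self.value_count - 1
            ));
        }
        out
    }
}
```

With four values already defined, `print_start(1, &[0, 3], 2)` would produce `(start 1 (value 0) (value 3)) ;; defines values 4..5` and leave the count at six, with no lookup of the function's type.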

peterhuene, Aug 09 '22 21:08

That sounds reasonable to me. The number of arguments is already present in the arg*:vec(<valueidx>) encoding, rather than being inferred from the signature of $f, so this makes returns consistent with that.

sunfishcode, Aug 09 '22 21:08

(Sorry for the delay; back from vacation.) That makes sense to me too. You're welcome to make the PR for this; otherwise I'll get to it as soon as I flush a few things out of my queue.

lukewagner, Aug 15 '22 22:08

Added in #92, with the additional idea that perhaps the text format should also have an explicit arity.

lukewagner, Aug 16 '22 01:08