multi-value Text format of function calls with multi-value arguments

Is there a proposal for extending the notation for function calls to account for the possibility that an argument expression is 'supplying' multiple arguments to the call?

E.g., imagine an allocator that returns two values: the base address of an allocated string (say), and the size of the string. And suppose that such an allocated value needs to be passed to a function that needs those two values as different arguments.

The signatures of utf8-count and munge-text might be:

(function $utf8-count (i32) (param i32 i32) ...)
(function $munge-text (i32 i32) (...) ...)

respectively. The form:

(utf8-count (munge-text ...))

looks confusing because normally each argument of a function call would be denoted by a separate expression.

One suggestion: use the LISP '.' cons operator, as in:

(utf8-count . (munge-text ...))

although this suffers from the potential issue that . in LISP cannot be used more than once in a form.

Jul 16 '19 19:07 fgmccabe

Alternatives from other languages:

* (prefix) for splat
.. (postfix) for expand

Jul 16 '19 19:07 KronicDeth

The following works perfectly fine in the text format:

(func $f (result i32 i32) ...)
(func $g (param i32 i32) ...)
(func $h
  (call $g (call $f))
)

Jul 16 '19 19:07 rossberg

That is a matter of opinion. Personally, I find it confusing.

Jul 16 '19 19:07 fgmccabe

In my mind I was modeling multi-value results as tuples and therefore they would need to be opted into being splatted into separate values, such as here. If they automatically splat, how do you store the multi-value result in a single variable? Or is that not a thing?

Jul 16 '19 19:07 KronicDeth

A follow up that may explain why it's confusing:

in most languages, including LISP, a term of the form:

(t1 ... tn)

is interpreted as the sequence of elements ti: t1:t2: ... :nil

The 'other' way is effectively a multi-cat: t1+t2+ ... +tn

Jul 16 '19 19:07 fgmccabe

@KronicDeth, multivalue results cannot be stored in a single value. Think of it as multiple individual values being separately pushed onto the stack one after another. So they're not really tuples.
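A rough way to picture this, as a toy Python model rather than anything taken from the Wasm spec: each function declares how many values it pops, and a call simply moves values on a shared stack. The names `Func`, `call`, `f`, and `g` below are made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Func:
    """Toy model of a function: pops n_params values, pushes its results."""
    n_params: int
    body: Callable[..., Tuple[int, ...]]


def call(stack: List[int], func: Func) -> None:
    """Pop the arguments off the stack, run the body, push each result separately."""
    split = len(stack) - func.n_params
    if split < 0:
        raise ValueError("stack underflow")
    args = stack[split:]
    del stack[split:]
    stack.extend(func.body(*args))


# Hypothetical counterparts of (func $f (result i32 i32)) and (func $g (param i32 i32)).
f = Func(0, lambda: (1, 2))
g = Func(2, lambda x, y: (x + y,))

stack: List[int] = []
call(stack, f)            # the stack now holds two separate values
after_f = list(stack)
call(stack, g)            # g consumes both of them
```

At no point is there a single slot holding "the result of f"; the two values only sit next to each other on the stack.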

Jul 16 '19 20:07 tlively

@KronicDeth, tuples do not exist as first-class values in Wasm. It is a stack machine.

OTOH, the folded form of the text format seems perfectly in line with how languages with first-class tuples would write the analogous code.

SML:

fun f() = (1, 2)
fun g(x, y) = x + y
val h = g(f())

Haskell:

f () = (1, 2)
g (x, y) = x + y
h = g (f ())

OCaml:

let f () = (1, 2)
let g (x, y) = x + y
let h = g (f ())

etc.

The need for splatting in some languages is merely a consequence of an artificial distinction between tuples and multiple arguments. In languages like the above, all functions have exactly one argument, which may be a tuple (though some prefer currying). So for example,

fun g(x, y) = x + y
val p = (1, 2)
val h = g(p)

works perfectly fine in good old SML, similarly in others.

Jul 16 '19 20:07 rossberg

This form of tuple syntax leads to real world confusion. In particular, it conflates

f(x,y)

with

f(u,v,w)

by binding x->u but y->(v,w)

Jul 16 '19 20:07 fgmccabe

@fgmccabe, not sure what you mean, inconsistent arities are simply a type error in any of these languages.

Edit: Or are you talking about Wasm? There, the folded form does not imply any check that the nesting aligns with types. But that's already the case, with or without multi values.

Jul 16 '19 20:07 rossberg

It is not actually an 'arity error'. If I define

f (x,y) = ...

I can call it with

f (u,v,w)

with the above binding. This is because in ML et al, tuples are pairs. But if you have genuine tuples, then (x,y) != (u,v,w) for any x,y,u,v,w

Jul 16 '19 20:07 fgmccabe

@fgmccabe, no you cannot, n-ary tuples are not pairs in either ML or Haskell. Nor in any other language except Lisp/Scheme, as far as I am aware.

Jul 16 '19 20:07 rossberg

More generally, this is a case of "types driving syntax" (in order to understand the syntax I need to know the types). Ergonomically, it is better to have it the other way around -- e.g., syntax directed type rules.

Jul 16 '19 20:07 fgmccabe

So, if you are right, the original question stands. How do you invoke a function of three arguments where two of the arguments are the tuple result of another function call?

Incidentally, the 'pair' interpretation of tuples is exactly how Prolog handles tuples (which is why I am so strong in my ergonomic opinion). Furthermore, the type of a tuple in ML takes the cross product notation integer * integer * string; so your assertion leaves me even more confused.

Jul 16 '19 20:07 fgmccabe

If you cannot pass through the tuple then you need to deconstruct it, typically with a pattern-matching let:

fun f() = (1, 2, 3)
fun g(x, y) = x + y
val h = let val (x, y, _) = f() in g(x, y) end

Of course, you can also write a function:

fun drop3rd(x, y, _) = (x, y)
val h = g(drop3rd(f()))

Jul 16 '19 20:07 rossberg

The type T * U * V is not the same as T * (U * V). The * is not a binary type constructor but a (family of) mixfix type constructors. (In fact, in SML it is even just shorthand for the record type {1 : T, 2 : U, 3 : V}.)
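The same distinction can be checked in Python, used here only as an analogy, since its tuples are likewise flat n-ary values rather than nested pairs. The function `g` and the sample values are made up for illustration.

```python
def g(pair):
    """Expects exactly a two-element tuple, like g(x, y) in the SML examples."""
    x, y = pair
    return x + y


flat = (1, 2, 3)          # a genuine 3-tuple
nested = (1, (2, 3))      # a pair whose second component is itself a pair

same = flat == nested     # the two shapes are distinct values

try:
    g(flat)               # a 3-tuple does not match a 2-element pattern
    arity_error = False
except ValueError:
    arity_error = True
```

Python reports the mismatch at run time, whereas SML rejects it as a type error at compile time.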

Jul 16 '19 20:07 rossberg

The pattern-matching let does not have an analog in wasm. We need something for wasm; particularly for the x/ex/sno/host bindings surface syntax. (I.e., it is not necessarily always legal WASM but uses WASM s-expression notation)

Jul 16 '19 20:07 fgmccabe

Question: can you have a unary-tuple in SML?

Jul 16 '19 20:07 fgmccabe

The analogue in Wasm is a sequence of drop and local.set. Or literally the let block from the typed function refs proposal. ;)
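As a toy Python model of that sequence (not real Wasm; the `stack` list and `locals_` dict are illustrative), keeping the first two of three results amounts to dropping the top value and then storing the remaining ones into locals, last result first.

```python
stack = []
locals_ = {}

stack.extend((1, 2, 3))        # results of a hypothetical call to f, pushed in order

stack.pop()                    # drop: discard the third result
locals_["y"] = stack.pop()     # local.set $y: the second result is now on top
locals_["x"] = stack.pop()     # local.set $x: then the first

# local.get $x, local.get $y, call $g  (g adds its two parameters)
stack.extend((locals_["x"], locals_["y"]))
y, x = stack.pop(), stack.pop()
stack.append(x + y)
```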

For ex bindings, isn't some drop combinator all that's needed? In any case, that seems out of scope for this proposal.

Jul 16 '19 20:07 rossberg

A unary tuple can only be written in its desugared form: {1 : int}. The expression syntax (x1,...,xN) or the type t1 * ... * tN is identified with x1, resp t1, in the unary case.

Jul 16 '19 21:07 rossberg

For bindings, I would like to be able to write

(local-fun (string-to-linear S))

in the binding expression, where string-to-linear returns a pair of numbers: the address of the string and its length.

It is quite important to be able to strongly limit the allowed forms in a binding expression; it is also fairly important that the syntax is not overly complex.

Since we are also constrained to use wasm's s-expression notation, I am looking for a way to write an extended form of s-expression to handle this common case. This is not an issue for the binary form; there we can do what we please.

Jul 16 '19 21:07 fgmccabe

I guess I don't see what's wrong with the very syntax you just gave?

Jul 16 '19 21:07 rossberg

Because it's confusing, in cases like:

(local-fun (string-to-linear S1) offset)

(local-fun start (string-to-linear S1) end (string-to-linear S2))

(local-fun (first (string-to-linear S1)) 23 end (string-to-linear S2))

Understood that 'confusing' is a matter of opinion. But, my earlier comment stands: types should follow syntax, not the other way around.

Jul 16 '19 21:07 fgmccabe

Those all seem reasonable to me. Not sure in what sense syntax "follows" types here, there's simply no relation between syntactic arity and number of values. An expression can produce and consume any number of values (including 0). But that's just natural in a stack machine.

Jul 16 '19 21:07 rossberg

If you look carefully at

(local-fun (string-to-linear S1) end (string-to-linear S2))

(local-fun (first (string-to-linear S1)) 23 end (string-to-linear S2))

you will see the type of the local-fun is actually identical even though the number of arguments supplied is different. So, in order to know that the syntax is valid, (i.e., not an arity error), you have to know the types of the sub-expressions.

Which is why it is potentially confusing.

(Note: 'first' returns the first of two arguments)
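The observation that the nesting alone does not tell you how many values reach the callee can be illustrated with a small Python sketch. It assumes hypothetical signatures, since the thread does not give them: `string-to-linear` takes one value and returns two, `first` takes two and returns one, and `local-fun` takes five. Bare atoms such as `S1`, `end`, and `23` count as one value each. The per-call check is stricter than Wasm validation, which only looks at the flattened instruction sequence.

```python
# Hypothetical signatures: name -> (number of params, number of results).
SIGS = {
    "string-to-linear": (1, 2),
    "first": (2, 1),
    "local-fun": (5, 0),
}


def values_produced(expr):
    """Return how many values a folded expression leaves on the stack.

    An expression is either an atom (one value) or a list
    [function_name, arg_expr, ...]. Raises TypeError when the values
    supplied by the arguments do not match the callee's parameter count.
    """
    if not isinstance(expr, list):
        return 1
    name, *args = expr
    n_params, n_results = SIGS[name]
    supplied = sum(values_produced(arg) for arg in args)
    if supplied != n_params:
        raise TypeError(f"{name}: expected {n_params} values, got {supplied}")
    return n_results


three_args = ["local-fun", ["string-to-linear", "S1"], "end", ["string-to-linear", "S2"]]
four_args = ["local-fun", ["first", ["string-to-linear", "S1"]], "23", "end",
             ["string-to-linear", "S2"]]
```

Both `three_args` and `four_args` pass the check under these assumed signatures, even though one has three syntactic arguments and the other four; the count of values only becomes visible once the signatures are consulted.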

Jul 16 '19 21:07 fgmccabe

@rossberg

The need for splatting in some languages is merely a consequence of an artificial distinction between tuples and multiple arguments

Some of those languages need such a distinction, though, because they have both tuples and multiple arguments. For example, Python does, and so it has * to make it clear when a tuple is being turned into separate arguments. Are we sure wasm will never have tuples?
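For reference, this is the Python behavior being described, in a minimal sketch with made-up functions `f` and `g`: a returned tuple counts as one argument unless `*` spreads it.

```python
def f():
    """Returns a single tuple object holding two values."""
    return (1, 2)


def g(x, y):
    """Takes two separate positional arguments."""
    return x + y


splatted = g(*f())        # the tuple is unpacked into x and y

try:
    g(f())                # the tuple is passed as x; y is missing
    needs_splat = False
except TypeError:
    needs_splat = True
```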

Jul 16 '19 23:07 kripken

I understand the reasoning why pair != tuple when multi-value results are defined in terms of the stack machine @rossberg, so thanks for that clarification.

I think the main issue we’re having is that the s-expression syntax hides the stackiness of the architecture, which means that the lexical order of a function call doesn’t match evaluation order. A lot of us are also too used to languages that are single-return, or multi-return only through tuples.

Reminding ourselves of the mismatch between s-expressions and the stack is just something we need to do when reading the s-expression form, unless we want to have splatting as a human-only syntax to remind us about the equivalent of splatting that happens on the stack.

Jul 16 '19 23:07 KronicDeth

@kripken:

The need for splatting in some languages is merely a consequence of an artificial distinction between tuples and multiple arguments

Some of those languages need such a distinction, though, because they have both tuples and multiple arguments.

Um, didn't you just give a circular reason? They need it because they have it? ;)

Of course, there are good reasons to make the distinction, esp. in lower-level languages. But notably, they all have to do with implementation concerns, not with semantic necessity.

Are we sure wasm will never have tuples?

Well, a stack machine is strictly more expressive than second-class tuples, i.e., already subsumes them. As for first-class tuples, the GC proposal introduces structs, which are essentially just that. In-between the two, what's left?

Jul 17 '19 04:07 rossberg

@fgmccabe:

So, in order to know that the syntax is valid, (i.e., not an arity error), you have to know the types of the sub-expressions.

"Valid" here means well-typed, so all you're observing is that in order for this to be well-typed, you have to know the types -- but that's always true. What's actually going on is that syntactic arity does not correlate with semantic arity -- neither implies the other. Whether one finds that confusing is, as you said, subjective, or perhaps just a matter of getting used to it.

Jul 17 '19 05:07 rossberg

Um, didn't you just give a circular reason?

I apologize if my wording was unclear, but basically I was saying: Python has tuples and multiple arguments, and I believe it could not manage without splat.

Well, a stack machine is strictly more expressive than second-class tuples, i.e., already subsumes them. As for first-class tuples, the GC proposal introduces structs, which are essentially just that. In-between the two, what's left?

Will a GC struct be splat-able into multiple arguments?

Jul 17 '19 13:07 kripken

Python has tuples and multiple arguments, and I believe it could not manage without splat.

No disagreement. When designed around that choice, a language needs extra constructs to convert between the two.

Will a GC struct be splat-able into multiple arguments?

There is no specific support for that; the code will just have to project the fields manually. Is there reason to believe that a built-in primitive would be more efficient?
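As an analogy in Python (not the GC proposal's actual instructions), projecting the fields by hand looks like the following. The `StrRef` record and the `utf8_count` function are hypothetical stand-ins for a struct and a two-parameter callee.

```python
from dataclasses import dataclass


@dataclass
class StrRef:
    """Hypothetical struct with two fields, standing in for a GC struct."""
    base: int
    length: int


def utf8_count(base: int, length: int) -> int:
    """Placeholder two-parameter callee; here it just returns the length."""
    return length


s = StrRef(base=1024, length=5)

# No splat primitive: each field is read individually and passed as its own argument.
count = utf8_count(s.base, s.length)
```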

Jul 17 '19 13:07 rossberg