tail-call icon indicating copy to clipboard operation
tail-call copied to clipboard

Expand motivation section of overview

Open eqrion opened this issue 3 years ago • 7 comments

Some feedback I've heard about this proposal is that the motivation is unclear for why this has to be an operation in core wasm.

Specifically I've had it asked, "why can't this be done by compilers targeting wasm"? I don't think the motivation section goes into this at all, and I think that would be useful.

The best answer I've been able to give is that this is not possible for return_call_indirect, which is basically an indirect jump and has no close analogue in core wasm. I'm not as knowledgeable though about the general case of a functional language wanting to compile to wasm and the feasibility of performing tail call elimination in a compiler. I think for posterity it would be nice to have something to point to on this.

eqrion avatar Dec 12 '22 17:12 eqrion

It's also not possible to emulate return_call in core Wasm. Perhaps the person asking was missing that tail calls are more general than just tail recursion?

rossberg avatar Dec 12 '22 18:12 rossberg

It's also not possible to emulate return_call in core Wasm. Perhaps the person asking was missing that tail calls are more general than just tail recursion?

I'll just talk for myself because I'm not super knowledgeable here.

Understood that tail recursion is different than general tail calls, and that tail recursion is more amenable to compiler transformations. Are there no feasible compiler transformations for general direct tail calls? That's probably the detail I am missing.

eqrion avatar Dec 12 '22 18:12 eqrion

Implementing tail calls was issue #1 for the JVM, back before there was issue trackers. Never got fixed though. As it happens, if you have reliable exception handling, you can fake tail calls: you have to be able to intercept stack overflow as an exception. But much better to have it built in.

fgmccabe avatar Dec 12 '22 18:12 fgmccabe

There are only non-local transformations, and they are either very expensive (trampolining through exception handlers, like @fgmccabe is alluding to, provided you even have them) or only practical in simple cases (inlining the callee) or require whole program rewriting (defunctionalisation) to eliminate the calls, merging everything into very large functions (which may hit implementation limits and is incompatible with separate compilation).

rossberg avatar Dec 12 '22 19:12 rossberg

There are only non-local transformations, and they are either very expensive (trampolining through exception handlers, like @fgmccabe is alluding to, provided you even have them) or only practical in simple cases (inlining the callee) or require whole program rewriting (defunctionalisation) to eliminate the calls, merging everything into very large functions (which may hit implementation limits and is incompatible with separate compilation).

Okay, that is the detail I was missing. I think a paragraph about this in the overview would be helpful, it's definitely one of the FAQs I get about wasm.

eqrion avatar Dec 12 '22 20:12 eqrion

More background for @eqrion (adds nothing new to this issue, but provides a little more context):

In 2018, when I was working on compiling a functional language to wasm, I asked @TerrorJack for advice on tail calls and about what they did in https://github.com/tweag/asterius (Haskell to wasm compiler). His response was:

About tail call optimization:

  • GHC does not use the C stack to pass parameters. It has its own notion of "stack". Likewise, asterius uses a region on the linear memory as its own "stack".

  • When "calling" a function, we don't simply use the "call" instruction and pass parameters as wasm function parameters. Instead, we pass parameters either on the stack or by some virtual registers (which can simply be implemented as wasm globals), then we find a way to jump to the new function without growing wasm call stack.

  • Since we don't use wasm function parameters at all, we can assume all wasm type signatures of our functions are i32(). The returned i32 indicates the call target (all calls become tail calls). Then we can use a trampolined mini-interpreter which given the first function to enter, repeatedly invoke the function, perform some side-effects, retrieve the next call target, then use the call_indirect instruction to call into the next target. This is the current approach of asterius.

  • You can consider doing a CPS-styled compilation to make everything tail-called in order to use the trampolined tail call method.

  • You may go even further by marshalling each function to a wasm block, collecting all "functions" into one wasm function, and instead of using call_indirect, use the br_table instruction to jump to a block. When a block finishes its job, it sets an i32 wasm local which is the "call target". It may have higher performance than the approach above, but when the resulting wasm function becomes too large you may find it slow to validate (be it binaryen validation or V8's own validation).

I also contacted @msprotz who responded:

Tail-calls: yes, if you care about mutually recursive functions, then definitely, a relooper-like algorithm will be needed, or perhaps a trampoline? At least that's what js_of_ocaml used, last I checked.

Indeed, I would love for the tail-calls proposal in Wasm to get some traction.

lorenzleutgeb avatar Dec 13 '22 11:12 lorenzleutgeb

I also contacted @msprotz who responded:

Tail-calls: yes, if you care about mutually recursive functions, then definitely, a relooper-like algorithm will be needed, or perhaps a trampoline? At least that's what js_of_ocaml used, last I checked.

As it stands, Js_of_ocaml is able to translate mutually-recursive functions into loops, but it can not do the same for recursive calls to dynamically-determined functions, like continuation parameters. This means that libraries like angstrom (which use continuations for running parser-combinators) to have separate code-paths for js_of_ocaml that implement a trampoline.

TyOverby avatar Dec 13 '22 18:12 TyOverby