mlscript icon indicating copy to clipboard operation
mlscript copied to clipboard

Tail Recursion Optimization

Open CAG2Mark opened this issue 1 year ago • 7 comments

Adds a pass to the IR which optimizes mutually tail-recursive and mutually tail-recursive modulo cons functions.

The full pass first optimizes modulo cons functions to be tail recursive, then optimizes those functions to be tail recursive.

This PR consists of the following features

IR Field Assignment

Adds an assignment primitive to the IR. Given x: CtorApp and val: TrvialExpr, we now have the primitive

x.field = val in ...

Also modified the interpreter to no longer assume references to constructors are referentially transparent.

Tail Recursion Modulo Cons

Extends the idea in this paper to optimize strongly connected components of tail recursive functions, of which some calls may be within a constructor (guarded recursion).

Supports both modulo-cons calls and regular tail calls within the same strongly connected component.

Rewrite mutually tail recursive functions

Suppose we have functions:

def f(n) = e1
def g(m) = e2

where e1 and e2 tail-call f or g. This function will be rewritten a single function:

def f_g(branch, n, m) =
  let join j(branch, m, n) = 
    if branch == 0 then e1'
    else e2'
  j(branch, m, n)

def f(n) = f_g(0, n, dummy)
def g(n) = f_g(1, dummy, m)

where e1' and e2' are e1 and e2 with their tail calls to f and g replaced with jumps to j.

In general, strongly connected components of mutually tail recursive functions will be optimized in this way. All join points used in each strongly connected component will be rewritten and merged into the same join point.

CAG2Mark avatar Mar 29 '24 08:03 CAG2Mark

TODO:

  • [x] Make sure variable names in the stack frame do not clash
  • [x] Make sure the merged function's name does not clash with other functions
  • [x] Substitute dummy variables (may need extension to the IR). See here
  • [x] Propagage @tailrec annotations to the IR.

CAG2Mark avatar Mar 29 '24 08:03 CAG2Mark

Everything is mostly done. There's just a few things left to check, such as making sure function and parameter names don't clash

CAG2Mark avatar May 01 '24 14:05 CAG2Mark

I can't seem to get the checks to become green, since I created a testcases which purposefully errors and the order of items in the stacktrace is not always the same :/

CAG2Mark avatar May 01 '24 14:05 CAG2Mark

If the contents of the stack trace are not essential to the test, you could try only displaying the stack trace if a debug flag is present.

HarrisL2 avatar May 01 '24 15:05 HarrisL2

Yes, stack traces should never be part of the test output. By the way, an exception being thrown means the compiler has a bug. If there's a program error to report, it should be reported properly.

LPTK avatar May 02 '24 00:05 LPTK

Instead of writing class True and class False in every single test case, why not make it a built-in class info? It should be a trivial change and would remove much clutter. Could also do the same with Cons/Nil and Some/None.

LPTK avatar May 13 '24 02:05 LPTK

Steps before this can be an official PR:

  • Address the remaining TODOs or phrase them better so other people can address them later
  • Make a proper error reporting system that points to the problem
  • Make sure to document the nontrivial aspects/invariants of the approach
  • Make sure no reordering of computations can occur; you may use a purity check to determine if something can be reordered, whose current implementation can be very primitive (it will be improved later)
  • Document the semantics of @tailrec when ascribing calls and definitions We should in fact have both:
    • @tailcall to annotate calls, with the current semantics assigned to @tailrec
    • ~~@tailrec-annotated calls that ensure the recursion back to the current function is tail-recursive~~ from the meeting discussion: just drop these for now and only accept @tailrec on definitions

LPTK avatar May 13 '24 07:05 LPTK

Completed:

  • Improved error reporting and propagation of locations.
    • Added compilation errors to the diff tests. Note that this changes the output of some old tests.
    • Also, postProcess now takes a raise: Diagnostic -> Unit parameter, so extensions using postProcess can now raise errors in the same way as tests in the main project.

TODO:

  • Purity check
  • Remaining TODOs and unresolved comments
  • Document semantics and non-trivial aspects

After this it should be good to merge.

CAG2Mark avatar Jun 01 '24 15:06 CAG2Mark

Instead of writing class True and class False in every single test case, why not make it a built-in class info? It should be a trivial change and would remove much clutter. Could also do the same with Cons/Nil and Some/None.

resolved

CAG2Mark avatar Jun 05 '24 13:06 CAG2Mark

OK, should be good to merge. There's just one trivial merge conflict in DiffTests.scala to handle

CAG2Mark avatar Jun 10 '24 08:06 CAG2Mark

OK, should be good to merge.

Great!

There's just one trivial merge conflict in DiffTests.scala to handle

So... do you intend to ever fix these conflicts? 🤔

LPTK avatar Jun 13 '24 09:06 LPTK

So... do you intend to ever fix these conflicts? 🤔

I figured doing it now will make the diff a bit harder to review. Once everything else is sorted, I'll fix the conflicts.

CAG2Mark avatar Jun 13 '24 09:06 CAG2Mark

Go ahead

CAG2Mark avatar Jun 19 '24 05:06 CAG2Mark