cakeml icon indicating copy to clipboard operation
cakeml copied to clipboard

Lambda lifting as a source-to-source pass

Open myreen opened this issue 7 years ago • 3 comments

The CakeML compiler currently lacks a lambda lifting optimisation. Such an optimisation pass could be beneficial because it makes functions closed (not have free variables) which allows optimisations such as clos_call to run.

When lambda lifting has been discussed, the implicit assumption has been that it should be done in ClosLang. However, implementing it in ClosLang is made awkward by ClosLang's use of dB indices for variables (one would have to shift indices), it would have to remap the globals (the compiler would maintain an injective function), and ClosLang's max_app might get in the way when adding new arguments to functions.

I propose that lambda lifting is to be done as a source-to-source transformation, because:

  • The explicit names of the source language allows the optimisation to avoid the need to shift names in dB style.

  • One avoids the complication of ClosLang's multi-argument functions and max_app.

  • By doing this before globals are introduced (in source_to_flat), we leave source_to_flat as the only place that introduces globals.

  • A source-to-source optimisation is easy to test and debug before verification starts because one can use process_topdecs to generate test cases and one can use astPP to produce output that is readable.

Example

For this issue to be considered finished. There needs to be a compiler phase that can (at the very least) optimise functions such as the following:

fun isort leq xs = 
  let
    fun insert x ys =
      case ys of 
        [] => [x] 
      | (y::ys) => if leq x y then x::y::ys else y::insert x ys
    fun insert_all xs ys =
      case xs of
        [] => ys
      | (x::xs) => insert_all xs (insert x ys)
  in insert_all xs [] end;

The optimisation should produce something along the lines of the following.

fun auto_gen_insert leq x ys =
  case ys of 
    [] => [x] 
  | (y::ys) => if leq x y then x::y::ys else y::auto_gen_insert leq x ys;

fun auto_gen_insert_all leq xs ys =
  case xs of
    [] => ys
  | (x::xs) => auto_gen_insert_all leq xs (auto_gen_insert leq x ys);

fun isort leq xs = 
  let
    val insert = auto_gen_insert leq
    val insert_all = auto_gen_insert_all leq
  in auto_gen_insert_all leq xs [] end;

Note that insert and insert_all are still defined in isort just to make the lambda-lifting optimisation easy to verify (not too smart). One could remove these useless bindings in a separate optimisation pass that only targets such unused bindings of partially applied functions.

This issue could be the basis of a 5-month Masters project or similar.

myreen avatar Mar 13 '18 07:03 myreen

This optimisation ought to be implemented on the type+module-update branch in order to future-proof the development.

myreen avatar Mar 13 '18 07:03 myreen

Chatting with @zapashcanon, doing this as a flatLang -> flatLang pass has the advantage that the global variables are obviously separate from the locals. This is nice, because the Lambda lifting algorithms start by making all of the names unique, and we don't want to mess with the names of globals. Also, for the proofs, the closure environments only contain local variables.

Introducing more globals shouldn't be difficult in a flat->flat pass.

SOwens avatar Apr 13 '18 13:04 SOwens

@myreen says this could be good to do as a stateless source-to-source pass.

It is quite difficult to do well --- one needs to be careful with how eagerly the optimization is applied.

tanyongkiam avatar Dec 09 '24 12:12 tanyongkiam