cakeml icon indicating copy to clipboard operation
cakeml copied to clipboard

Efficient peep-hole optimisation for DataLang

Open myreen opened this issue 5 years ago • 0 comments

Peep-hole optimisation is an interesting optimisation that the CakeML compiler lacks at present. This issue is about adding a peep-hole optimiser to the CakeML compiler, in particular, to the DataLang layer.

Example

The following is an example that would be good to cover. A user might phrase a case expression as follows in the source:

case (a,b) of 
  (Some x, Some y) => Some (x + y)
| _ => None

A good peep hole optimiser should make sure the pair (a,b) is never allocated in the generated code.

At the level of DataLang, the program from above might look something like the following:

t5 := cons 0 [var a; var b]
t6 := el 0 [var t5]
t7 := el 1 [var t5]
if taglen_eq t6 "Some" andalso taglen_eq t7 "Some" then ...

A successful peep-hole optimiser should produce something similar to this:

t5 := cons 0 [var a; var b]
t6 := var a
t7 := var b
if taglen_eq t6 "Some" andalso taglen_eq t7 "Some" then ...

and a deadcode elimination pass would remove the assignment to t5 (which is the allocation of the pair (a,b)).

Implementation Ideas

I think the implementation should make use of the fact that DataLang programs are almost in SSA form. It should use arrays and destructive updates for efficiency.

Here's a possible sketch of an implementation. For each function:

Step 1: Populate an array with the expression that is assigned to each variable that appears in the program. Variables are numbers so they can be used to index the array. At index i, the array will have:

  • NoAssign if variable i isn't used in the program
  • Exp e if there is only one assignment to i and that assignment is e
  • ManyAssigns if there are more than one assignment to i

Step 2: Search the array for expressions that can be improved. For the example above, it should notice that the entry t7

array[t5] = Exp (cons 0 [var a; var b])
array[t7] = Exp (el 1 [var t5])

matches a rewrite

∀n. el 1 [cons _ [_ ; n]] --> n

which means that it is to update the array entry to the improved:

array[t7] = Exp (var b)

A similar rewrite would improve entry t6.

Step 3: Make a pass over the original program replacing each assignment in the program with the expression (Exp) carried in the array.

MSc Project

This is issue suitable as an MSc student project. Contact @myreen on the CakeML slack to discuss this in case you are interested.

myreen avatar Jul 17 '19 12:07 myreen