cakeml icon indicating copy to clipboard operation
cakeml copied to clipboard

Make cutsets smaller in DataLang

Open myreen opened this issue 1 year ago • 1 comments

@tanyongkiam noticed that cutsets are sometimes too large.

For example, the explorer output reveals:

         (seq
            ...
            (6 := Mult (5 4) (some {4,5}))
            (return 6))

Here 4 and 5 are in the cutset ({4,5}) only because they are used as arguments to Mult. Ideally, the cutset should be empty because 4 and 5 are not used afterMult.

The reason they are in the cutset can be seen in the semantics of DataLang, here:

https://github.com/CakeML/cakeml/blob/48a421d7a393f35b150221d8dec6d3caca9befe6/compiler/backend/semantics/dataSemScript.sml#L1146-L1153

This needs to be corrected so that the arguments are read (get_vars) before the cutset is applied (cut_state_opt). With this change data_live can make the cutsets smaller. The data_to_word proofs need updating.

myreen avatar Aug 09 '24 18:08 myreen

As an addon here, the bvi-to-data compilation sometimes produces a sub-optimal sequence when computing CBV arguments to functions.

For example, this tail call:

(func parse_vb_string@3246 (a)
   (call parse_vb_string_aux@3240 (op (Cons 0))
      (op (Const 0)) (op (Const 1))
      (let
         (b <- (op Add (op LengthByte (var a)) (op (Const ~1))))
         (if (op Less (op (Const 0)) (var b)) (op (Const 0)) (var b)))
      (op (Const 0)) (var a)))

becomes:

(func parse_vb_string@3246 (0)
   (seq
      (1 := (Cons 0) () none)
      (2 := (Const 0) () none)
      (3 := (Const 1) () none)
      (4 := LengthByte (0) none)
      (5 := (Const ~1) () none)
      (6 := Add (5 4) (some {0,1,2,3,4,5}))
      (7 := (Const 0) () none)
      (8 := Less (6 7) (some {0,1,2,3,6,7}))
      (if 8
         (seq
            (9 := (Const 0) () none)
            (10 := 9))
         (10 := 6))
      (11 := (Const 0) () none)
      (call none parse_vb_string_aux@3240 (1 2 3 10 11 0) none)))

Notice that the constants stored in variables 1,2, 3 are unnecessarily stored into cutsets, when they should really be computed much later.

After some discussion with @myreen , we should at least try to do constant propagation before data_live.

Another related optimization is to modify the order in which arguments are computed in bvi-to-data, so that the ones that result in cutsets are computed first.

tanyongkiam avatar Nov 27 '24 10:11 tanyongkiam