tiramisu
tiramisu copied to clipboard
Does Tiramisu have legality check of scheduling?
I tried the following example in Tiramisu master (c40b0040efa09c055b582dc116a78d14bfcd99d5).
#include <tiramisu/tiramisu.h>
using namespace tiramisu;
int main(int argc, char** argv) {
tiramisu::init("test_after");
static const int N = 48;
static const int M = 48;
var x("x", 0, N);
var y("y", 0, M);
computation f("f", {y, x}, x + y);
computation g("g", {y, x}, f(y, 48 - 1 - x));
if (std::getenv("AFTER")) g.after(f, x);
else g.after(f, computation::root);
buffer buf_f("buf_f", {M, N}, p_int32, a_temporary);
buffer buf_g("buf_g", {M, N}, p_int32, a_output);
f.store_in(&buf_f, {y, x});
g.store_in(&buf_g, {y, x});
tiramisu::codegen({&buf_g}, "test_after.o");
return 0;
}
I expected that this will cause an error when "AFTER=" is enabled because the scheduling g.after(f, x)
breaks dependency between f and g. However, it can be compiled and returns a broken result.
The Tiramisu paper "Tiramisu: A Code Optimization Framework for High Performance Systems" says that "TIRAMISU does not have
this restriction since it checks transformation legality using
dependence analysis [18].", but I cannot find the way to check "transformation legality".
How can we check the legality of transformation?
Hi @iasakura ,
The master version of Tiramisu does not have dependence analysis implemented. One of our contributors had dependence analysis implemented but did not contribute his code.
We are interested in having this feature though. If you are interested in helping that is very appreciated. Implementing legality check in Tiramisu is easy: we use the ISL library which already computes all dependences and returns a graph of dependences. What we need to do on the Tiramisu is two things: (1) extract all the information necessary and that will be passed to ISL to compute the dependences; and (2) implement a legality check that check if the dependences after applying the schedule is still preserved (this means computing the dependences then applying the schedule and computing whether the new schedule breaks the dependences which amounts to computing the difference between execution dates). All of this is easy to implement and I can assist you about each of these steps. So if you are interested in helping let me know please.
Hi @rbaghdadi,
Thank you for replying! I am very interested in legality check of tiramisu. So I will try to implement legality check. The following is my implementation idea. (For convenience, I use islpy syntax for the explanation. I will port it to C isl for Tiramisu.)
(1) Build the dataflow dependence relation D from layer 1 IR. D is a binary relation (isl map) on iteration space such that (i -> j) \in D iff the iteration j reads the values computed in the iteration i. It can be easily built from the read relation R: i -> j \in R iff the iteration j reads from i (expr::get_access
looks useful to implement read relation.)
(2) Check legality of time scheduling. Given scheduling function S, we can check the legality of time scheduling S by checking D \subset <_S where <_S is a scheduling relation satisfying i <_S j iff S(i) < S(j) (lexicographical order).
(3) Check legality check of memory scheduling. Given a write function W, which is given by computation::store_in
, we compute a conflict set [1]. Conflict set is a set of pair of array indices whose elements need to live simultaneously. We check that no pair in conflict set are mapped to the same values by W. The conflict set can be computed as follows.
(i) Compute the last use map L. L maps an iteration i to its last use iteration. It can be computed by
L = D.range_map().apply_range(S).lexmax().domain().unwrap()
.
(ii) Compute the conflict set C : { i -> j : S(i) < S(L(j)) and S(j) < S(L(i)) and i != j }. It can be computed by the following code
# C1 = { i1 -> i2 : S(i1) < S(L(i2)) }
C1 = S.lex_lt_union_map(L.apply_range(S))
# C2 = { i1 -> i2 : S(i2) < S(L(i1)) }
# = C.reverse()
C2 = C1.reverse()
# CS = { i1 -> i2 : i1 < L(i2) && i2 < L(i1) }
# = { i1 -> i2 : i1 < L(i2) } \cap { i1 -> i2 : i2 < L(i1) }
# = C1 \cap C2
CS = C1.intersect(C2).subtract(I.identity())
Finally, we can check the validity of W by checking emptiness of the set { i -> j : i -> j \in CS and W(i) = W(j) }. It can be checked by the folliwng code.
# pairs of iterations which writes to same indices
same_idx=W.domain_product(W).domain().unwrap()
# { i1 -> i2 : i1 -> i2 \in CS, WW(i1) == WW(i2) }
invalid=same_idx.intersect(CS)
ok=invalid.is_empty()
The bellow is a simple example checking legality for iterative matrix-vector multiplication using double buffering.
import islpy as isl
# for (t)
# for (j)
# for (i)
# prod[t%2][i]= prod[t%2][i] + A[i][j] * prod[(t-1)%2][j]
# Tiramisu code:
# computation prod("prod" {t, i, j}, p_float32);
# prod.set_expression(prod(t, i, j - 1) + A(i, j) * prod(t - 1, j, N - 1));
# prod.interchange(i, j)
# prod.store_in("prod_buf", {t % 2, i, 0})
# Iteration domain
I = isl.Set('[T, N] -> {prod[t, i, j] : 0 <= t < T and 0 <= i < N and 0 <= j < N}')
# Read relation
R = isl.Map('[T, N] -> {prod[t, i, j] -> prod[t, i, j-1] : t > 0 and j > 0; prod[t, i, j] -> prod[t-1, j, N-1] : t > 0 and j > 0}').intersect_domain(I)
# Default write relation
Wdefault = isl.Map('[T, N] -> {prod[t, i, j] -> prod[t, i, j]}').intersect_domain(I)
# User specified write function
W = isl.Map('[T, N, M] -> {prod[t, i, j] -> prod_buf[t % 2, i, 0]}').intersect_domain(I)
# Scheduling function & relation
S = isl.Map('[T, N] -> {prod[t, i, j] -> [t, j, 0]}')
O = S.lex_lt_union_map(S)
# Dependency relation
D = Wdefault.apply_range(R.reverse())
# Check legality of time scheduling
is_time_sched_ok = D.is_subset(O)
print(is_time_sched_ok)
# Last use function
# {(i->j)->k : i->j \in D and j -> k \in S }.lexmax()
L = D.range_map().apply_range(S).lexmax().domain().unwrap()
# Conflict set CS = C1 \cap C2
# C1 = { i1 -> i2 : S(i1) < S(L(i2)) }
C1 = S.lex_lt_union_map(L.apply_range(S))
# C2 = { i1 -> i2 : S(i2) < S(L(i1)) }
# = C.reverse()
C2 = C1.reverse()
# CS = { i1 -> i2 : i1 < L(i2) && i2 < L(i1) }
# = { i1 -> i2 : i1 < L(i2) } \cap { i1 -> i2 : i2 < L(i1) }
# = C1 \cap C2
CS = C1.intersect(C2).subtract(I.identity())
# pairs of iterations which writes to same indices
# { i1 -> i2 : i1 -> i2, W(i1) == W(i2) }
same_idx=W.domain_product(W).domain().unwrap()
# Conflicting pair which write to the same index
invalid=same_idx.intersect(CS)
# Check legality of memory scheduling
is_memory_sched_ok=invalid.is_empty()
print(is_memory_sched_ok)
I will implement the above algorithm as a function::is_legal()
. I'm not familiar with Tiramisu internal yet. So I'm happy if you teach me Tiramisu functions and data structures useful for implement above legality check.
[1] Someshekaracharya Bhaskaracharya, Uday Bondhugula, and Albert Cohen. SMO: An integrated approach to intra-array and inter-array storage optimization. POPL 2016
Hi @iasakura ,
This looks very interesting. Thanks! Let me look at this in detail and get back to you.
Hi @rbaghdadi ,
I implemented PoC version of legality check of scheduling. Here is my branch:
https://github.com/iasakura/tiramisu/tree/support-legality-check-of-scheduling
I implemented legality check as function::check_legality
. Typically, it is expected to be called from function::codegen
by enabling the last parameter check_legality
added by me.
The code is not ready for PR, I need more tests and refactoring. But, it can check legality of some simple samples (see tests/test_{175,176,177}.cpp).
I will continue to develop and send PR when it will be ready. Please let me know if you have any comments.
Hi @iasakura , thanks! This looks great. I'll review the code and let you know.
Here are few high level comments:
-
Can you please add a high level description of the algorithm you are using? (add it in the source code, usually we document the implementation details in the source file and the user level documentation in the header file). Ideally, you can put a high level description similar to the one you have above.
-
Do you have any references for steps 2 and 3 in the algorithm? Adding references would be very useful.
-
There is a function called "compute_dep_graph()" in Tiramisu currently. This function only computes data flow dependences (true depenences) and does not compute memory dependences. It also does not work if you have updates or reductions. Can you have a look at that one and suggest whether you should update it (to provide a fully functional dependence analysis, not just legality check). We better have a step that computes dependences properly (ISL has a function that takes the accesses, the schedules, ... and returns the isl_union_map of dependences). Then we can use these dependences to check for the legality.
The other things seem tp be good for me! Thanks for the nice implementation!
Hi @rbaghdadi ,
Thank you for your nice comments!
- Can you please add a high level description of the algorithm you are using?
- Do you have any references for steps 2 and 3 in the algorithm? Adding references would be very useful.
Sure! I will add comments soon.
- There is a function called "compute_dep_graph()" in Tiramisu currently. This function only computes data flow dependences (true depenences) and does not compute memory dependences. It also does not work if you have updates or reductions. Can you have a look at that one and suggest whether you should update it (to provide a fully functional dependence analysis, not just legality check). We better have a step that computes dependences properly (ISL has a function that takes the accesses, the schedules, ... and returns the isl_union_map of dependences). Then we can use these dependences to check for the legality.
Oh, I didn't find compute_dep_graph()
. I will use it for my legality check, Thanks.
However, my current implementation checks that time and memory scheduling (layer II and III) don't violate the dataflow dependency defined at layer I. Because layer I doesn't seem to introduce false dependency, fully functional dependence analysis computing all dependency looks unnecessary for implementing legality check (But if other analysis need it, I'm ready to implement it.)
Also, I don't understand why current compute_dep_graph()
is broken when we have updates or reductions. At least to compute dataflow dependences, it looks correct. Do you have any example?
Also, I want to add more tests for legality check. If you have good example, please let me know.
Thanks. Yes good point. But actually Layer I can introduce false dependences because it supports updates. The user can declare a computation and then update it (i.e., erase the old value and put a new value). This is the case of reductions or simply in-place computations.
There is no need to do dependence analysis at Layer I though, it is better to do it in Layer III once you know how computations are stored in memory. So I'm Ok with your suggestions. It would just be good to expose the result of dependence analysis to Tiramisu users because such result would be useful (for example for computing live-in and live-out variables or for bound inference).
Depending on how much time you have to work on this, feel free to do just what you suggested above or to actually do dependence analysis in Layer III (by exploiting the ISL dependence analysis call).