
Basic prototype: linear algebra compiler

Jobhdez opened this issue 1 year ago

Hello you all,

In response to some comments made by @mmirman, I started working on a basic linear algebra compiler prototype.

Please comment on whether this is going in the right direction.

The implementation is only a simple prototype, but I think it captures the structure of a linear algebra compiler.

Instead of the grammar I defined, we could use Einstein notation.
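For a concrete picture, here is how some operations read in Einstein notation, using numpy.einsum purely to illustrate the notation (numpy is not part of the prototype):

import numpy as np

A = np.random.rand(2, 3)
B = np.random.rand(3, 4)
v = np.random.rand(3)

# matrix multiply: C[i,k] = sum over j of A[i,j] * B[j,k]
C = np.einsum('ij,jk->ik', A, B)

# dot product: s = sum over i of v[i] * v[i]
s = np.einsum('i,i->', v, v)

# transpose: T[j,i] = A[i,j]
T = np.einsum('ij->ji', A)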

We need to talk more about the grammar: which operations are we supporting?

The structure of the compiler is as follows: the initial AST is converted to an intermediate language, lalg, which makes loops explicit.

Making loops explicit will help as we expand the set of operations, because we can apply loop optimizations such as loop unrolling and loop fusion, as sketched below.
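Here is a minimal sketch of what loop fusion could look like over an explicit-loop IR; the Loop class and fuse function are made up for illustration and are not the prototype's actual lalg classes:

from dataclasses import dataclass

@dataclass
class Loop:
    var: str    # induction variable, e.g. 'i'
    bound: int  # iterates var = 0 .. bound-1
    body: list  # statements, here just strings

def fuse(l1, l2):
    # Two loops with the same induction variable and bound can
    # share one traversal: concatenate their bodies.
    assert l1.var == l2.var and l1.bound == l2.bound
    return Loop(l1.var, l1.bound, l1.body + l2.body)

first = Loop('i', 3, ['c[i] = a[i] + b[i];'])
second = Loop('i', 3, ['d[i] = 2 * c[i];'])
print(fuse(first, second).body)
# ['c[i] = a[i] + b[i];', 'd[i] = 2 * c[i];']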

I did not implement a sparse intermediate language, but I can work on that next as I expand the operations.
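For reference, a sparse IR would likely target a layout such as CSR (compressed sparse row); this small example only illustrates the layout itself, not anything in the prototype:

dense = [[3, 0, 5],
         [0, 0, 6]]

values, col_idx, row_ptr = [], [], [0]
for row in dense:
    for j, v in enumerate(row):
        if v != 0:
            values.append(v)   # nonzero entries, row by row
            col_idx.append(j)  # column of each nonzero
    row_ptr.append(len(values))  # where each row's entries end

print(values, col_idx, row_ptr)  # [3, 5, 6] [0, 2, 2] [0, 2, 3]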

Finally, the intermediate representation lalg is lowered to C code.

The reason I decided to do this is as follows.

We can lower the C code, with loop optimizations applied, to vectorized code (i.e., SIMD); moreover, we could lower the linear algebra operations to CUDA, as @mmirman had suggested.
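As a sketch of the SIMD direction, a hypothetical emitter (not in the prototype; the name lalg_to_simd_c and the emitted function are assumptions) could produce SSE intrinsics for vector plus vector:

def lalg_to_simd_c(n):
    # Emit C that adds two int vectors 4 lanes at a time (SSE),
    # with a scalar loop for the leftover tail.
    return (
        '#include <immintrin.h>\n'
        'void add_vectors_simd(int *a, int *b, int *out) {\n'
        f'    for (int i = 0; i + 4 <= {n}; i += 4) {{\n'
        '        __m128i va = _mm_loadu_si128((__m128i *)&a[i]);\n'
        '        __m128i vb = _mm_loadu_si128((__m128i *)&b[i]);\n'
        '        _mm_storeu_si128((__m128i *)&out[i], _mm_add_epi32(va, vb));\n'
        '    }\n'
        f'    for (int i = {n} - {n} % 4; i < {n}; i++) out[i] = a[i] + b[i];\n'
        '}\n'
    )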

If we decide not to go this way, we can lower the linear algebra to assembly instead, using a low-level intermediate language that looks like C.

Please let me know your comments and how I can improve it.

The final version would look something like this.

Using PyTorch's torch.jit.trace, we can get access to the computational graph of a given open-source language model.
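For instance (a minimal sketch with a toy module standing in for an actual language model):

import torch

model = torch.nn.Linear(4, 2)  # stand-in for a real model
traced = torch.jit.trace(model, torch.randn(1, 4))
print(traced.graph)  # the traced computational graph in TorchScript IR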

Once we have the computational graph, we convert it into a graph-based intermediate language.

We then apply machine-independent optimizations to this high-level IR and lower it to a low-level IR, to which we apply machine-dependent optimizations.
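As an example of a machine-independent optimization, here is a toy constant-folding pass over a made-up graph IR; the (name, op, args) node shape is an assumption for illustration:

def fold_constants(nodes):
    # Replace an add whose inputs are both constants with a constant node.
    consts, out = {}, []
    for name, op, args in nodes:
        if op == 'const':
            consts[name] = args[0]
            out.append((name, op, args))
        elif op == 'add' and all(a in consts for a in args):
            val = consts[args[0]] + consts[args[1]]
            consts[name] = val
            out.append((name, 'const', (val,)))
        else:
            out.append((name, op, args))
    return out

graph = [('x', 'const', (2,)), ('y', 'const', (3,)), ('z', 'add', ('x', 'y'))]
print(fold_constants(graph))
# [('x', 'const', (2,)), ('y', 'const', (3,)), ('z', 'const', (5,))]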

I just read a paper in which the authors generated C code that was on par with the TVM LLVM backend.

So we can get good performance if we apply the right optimizations.

We can separate the linear algebra computations from the schedules.

Schedules include things like tiling, vectorization, and loop unrolling.
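In that spirit (as in Halide/TVM), the schedule could be a separate object that records transformations without touching the algorithm; everything below is a made-up sketch, not existing code:

class Schedule:
    def __init__(self):
        self.transforms = []

    def tile(self, loop, size):
        self.transforms.append(('tile', loop, size))
        return self

    def unroll(self, loop, factor):
        self.transforms.append(('unroll', loop, factor))
        return self

    def vectorize(self, loop):
        self.transforms.append(('vectorize', loop))
        return self

# The algorithm stays "C = A + B"; how it runs lives in the schedule.
sched = Schedule().tile('i', 32).unroll('j', 4).vectorize('j')
print(sched.transforms)
# [('tile', 'i', 32), ('unroll', 'j', 4), ('vectorize', 'j')]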

Thanks

Here is an example of how it can be used so far. I have only added matrix plus matrix and vector plus vector for now, but I am going to add more operations along with the corresponding C code.

>>> from ast_to_lalg import ast_to_lalg
>>> from parser import parser 
>>> from lalg_to_c import lalg_to_c
>>> ast = parser.parse("[[3 4 5][4 5 6]] + [[4 5 6] [5 6 7]]")

>>> ast2 = ast_to_lalg(ast)
>>> ast2
(LalgForLoop2D (exps: (LalgExps [(LalgMatrix [[3, 4, 5], [4, 5, 6]]), (LalgMatrix [[4, 5, 6], [5, 6, 7]])])) (n: (LalgInt 2)) (inner_n: (LalgInt 3)) (i: (LalgInt 0)) (j: (LalgInt 0)) (op:(LalgOp +)))

>>> lalg_to_c(ast2)
'int matrix1[2][3] = {{3, 4, 5}, {4, 5, 6}};\n    int matrix2[2][3] = {{4, 5, 6}, {5, 6, 7}};\n    \n    matrix *mat = initialize_matrix(matrix1, 2, 3);\n    \n    matrix *mat2 = initialize_matrix(matrix2, 2, 3);\n\n    add_matrices(mat, mat2);'


>>> ast5 = parser.parse("[3 4 5] + [4 5 6]")
>>> ast6 = ast_to_lalg(ast5)
>>> lalg_to_c(ast6)
'int vec[] = {3, 4, 5};\n    int vec2[] = {4, 5, 6};\n\n    vector *v = initialize_vector(vec, 3);\n\n    vector *v2 = initialize_vector(vec2, 3);\n\n    add_vectors(v, v2);'

Jobhdez · Oct 14 '23 02:10