
Rename `Assignments.to_low_level` to `reference_compile`, and introduce `cpu_friendly_compile` (later also `cuda_friendly`)

Open lukstafi opened this issue 2 years ago • 1 comments

Implement as many optimizations as reasonable from these posts:

The optimization to start with is reordering the iteration (i.e. the nesting of the resulting for loops), for example to maximize the following lexicographic preference: the number of arrays where the rightmost axis has the innermost iterator, then where the rightmost axis has the next-to-innermost iterator, then where the next-to-rightmost axis has the innermost iterator, then where the next-to-rightmost axis has the next-to-innermost iterator, ...
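A minimal sketch of how such a lexicographic score could be computed and used to pick a loop order. Everything here is hypothetical and independent of OCANNL's actual data structures: an array access is modeled as a plain list of iterator names (leftmost axis first), a loop order is a list of distinct iterator names (outermost first), and the search is brute force over permutations.

```ocaml
(* Hypothetical model, not OCANNL's representation: an access lists the
   iterator indexing each axis of an array, leftmost axis first. *)
type access = string list

(* [nth_from_end xs k] is the k-th element counting from the end, if any. *)
let nth_from_end xs k = List.nth_opt (List.rev xs) k

(* Score of a loop order (outermost iterator first). Entries are ordered by
   axis position counted from the right, then by loop depth counted from the
   innermost loop; each entry counts the accesses whose axis at that position
   is indexed by the iterator at that depth. *)
let score (accesses : access list) (order : string list) : int list =
  let depth = List.length order in
  let max_rank = List.fold_left (fun m a -> max m (List.length a)) 0 accesses in
  List.concat
    (List.init max_rank (fun axis ->
         List.init depth (fun level ->
             let iter = nth_from_end order level in
             List.length
               (List.filter (fun a -> nth_from_end a axis = iter) accesses))))

(* All permutations of a list of distinct elements. *)
let rec permutations = function
  | [] -> [ [] ]
  | xs ->
      List.concat_map
        (fun x ->
          List.map (fun p -> x :: p) (permutations (List.filter (( <> ) x) xs)))
        xs

(* Brute-force search: structural [compare] on equal-length int lists is
   lexicographic, which is exactly the preference described above. *)
let best_order (accesses : access list) (iters : string list) : string list =
  List.fold_left
    (fun best o ->
      if compare (score accesses o) (score accesses best) > 0 then o else best)
    iters (permutations iters)

(* Example accesses for c[i][j] += a[i][k] * b[k][j]. *)
let matmul : access list = [ [ "i"; "j" ]; [ "i"; "k" ]; [ "k"; "j" ] ]
let chosen = best_order matmul [ "i"; "j"; "k" ]
```

For the matmul-style accesses, this sketch selects the order `i`, `k`, `j` (with `j` innermost), because two of the three arrays have `j` on their rightmost axis. It ignores whether a given reordering is legal for the computation, which a real pass would have to check.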

Some optimizations will require knowing the properties of the Ops.binary_op (and Ops.unary_op) involved, e.g. associativity, commutativity, distributivity (one op distributes over another). The properties actually needed should be represented directly in Ops.
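One possible shape for such properties, as a sketch only. The `binop` type below is a hypothetical stand-in with a handful of constructors; the real `Ops.binary_op` may differ, and the function names are assumptions made for illustration.

```ocaml
(* Hypothetical stand-in for a few binary operators. *)
type binop = Add | Sub | Mul | Div | Max

let is_commutative = function Add | Mul | Max -> true | Sub | Div -> false

(* Associative as mathematical operations; with floating-point numbers,
   reassociating Add or Mul can still change rounding. *)
let is_associative = function Add | Mul | Max -> true | Sub | Div -> false

(* [distributes_over ~op ~over] holds when
   op a (over b c) = over (op a b) (op a c) for all a, b, c. *)
let distributes_over ~op ~over =
  match (op, over) with
  | Mul, (Add | Sub) -> true (* a * (b + c) = a * b + a * c *)
  | Add, Max -> true (* a + max b c = max (a + b) (a + c) *)
  | _ -> false (* e.g. Mul over Max fails for negative factors *)
```

An optimization pass could then query, for example, `distributes_over ~op:Mul ~over:Add` before factoring a common multiplier out of a sum.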

lukstafi, Oct 01 '23 19:10

With the new nomenclature, these would be `reference_lower`, `cpu_friendly_lower`, and `cuda_friendly_lower`.

lukstafi, Jul 15 '24 21:07

I don't think this is the right approach right now. There will be generic optimizations, such as reordering loop nesting for data locality and tiling. They can start from an already lowered representation. If it turns out to be easier to do everything in one pass, we can just pass a config to `to_low_level`.
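To illustrate what a pass over an already lowered representation might look like, here is a sketch on a toy IR. The `low_level` type, the `config` record, and the function names are all hypothetical and much simpler than OCANNL's lowered code. The pass assumes a perfect loop nest with constant bounds and does no dependence analysis, so it would only be valid where reordering is known to be legal.

```ocaml
(* Toy lowered representation: nested for loops around an opaque statement. *)
type low_level =
  | For of { index : string; from_ : int; to_ : int; body : low_level }
  | Stmt of string

(* Hypothetical optimization settings; [None] leaves the code untouched. *)
type config = { loop_order : string list option }

(* Peel a perfect loop nest into its loop headers and innermost body. *)
let rec peel = function
  | For { index; from_; to_; body } ->
      let headers, inner = peel body in
      ((index, from_, to_) :: headers, inner)
  | other -> ([], other)

(* Rebuild the nest with loops sorted by their position in [order]
   (outermost first). Iterators missing from [order] end up innermost,
   keeping their original relative order. *)
let reorder_nest (order : string list) (code : low_level) : low_level =
  let headers, inner = peel code in
  let pos index =
    let rec find i = function
      | [] -> max_int
      | x :: rest -> if x = index then i else find (i + 1) rest
    in
    find 0 order
  in
  let sorted =
    List.stable_sort (fun (a, _, _) (b, _, _) -> compare (pos a) (pos b)) headers
  in
  List.fold_right
    (fun (index, from_, to_) body -> For { index; from_; to_; body })
    sorted inner

(* Entry point: apply the optimizations enabled in the config. *)
let optimize (cfg : config) (code : low_level) : low_level =
  match cfg.loop_order with
  | None -> code
  | Some order -> reorder_nest order code

(* Example: swap the two loops of a small reduction. *)
let nest =
  For { index = "i"; from_ = 0; to_ = 3;
        body = For { index = "k"; from_ = 0; to_ = 4;
                     body = Stmt "c[i] += a[i][k]" } }

let swapped = optimize { loop_order = Some [ "k"; "i" ] } nest
```

Keeping the pass separate from lowering means the same `optimize` function could run after any lowering strategy, while a one-pass design would thread the same `config` through lowering instead.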

lukstafi, Sep 20 '24 09:09