Polygeist icon indicating copy to clipboard operation
Polygeist copied to clipboard

`SubToSubView` canonicalization disabled

Open mortbopet opened this issue 3 years ago • 9 comments

Noticed that the SubToSubView canonicalization was disabled in a recent commit:

https://github.com/wsmoses/Polygeist/blob/main/lib/polygeist/Ops.cpp#L681

What was the reason for this? If it is interfering with the application order of the other canonicalization patterns, should it instead be moved to a separate pass? I'm asking because for my use-case (and i presume in general) it's fairly important that the Polygeist dialect operations have been lowered to that available in upstream MLIR.

mortbopet avatar Jan 05 '22 16:01 mortbopet

I was actually about to poke you regarding that. There was an incompatibility with a recent LLVM rebase that I hadn't tracked down. Would you be take a look at re-enabling?

wsmoses avatar Jan 05 '22 16:01 wsmoses

hmm... simply re-enabling the canonicalization seems to work fine for my use cases :thinking:..

mortbopet avatar Jan 05 '22 17:01 mortbopet

Let me see if I can reproduce the issue (it has since been rebased again). If I recall, the issue was that the subview op would produce a different memref type (e.g. one with an affine map), which would be illegal to replace the original result type with.

wsmoses avatar Jan 05 '22 19:01 wsmoses

Looks like i encountered the issue myself. polygeist-opt --canonicalize with SubToSubView enabled

module attributes {llvm.data_layout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128", llvm.target_triple = "x86_64-unknown-linux-gnu"}  {
  func @main() -> i32 attributes {llvm.linkage = #llvm.linkage<external>} {
    %c0 = arith.constant 0 : index
    %c10 = arith.constant 10 : index
    %c1 = arith.constant 1 : index
    %c0_i32 = arith.constant 0 : i32
    %c10_i32 = arith.constant 10 : i32
    %c100_i32 = arith.constant 100 : i32
    %c100 = arith.constant 100 : index
    %0 = memref.alloca() : memref<10xi32>
    %1 = memref.alloca() : memref<10x100xi32>
    scf.for %arg0 = %c0 to %c10 step %c1 {
      memref.store %c100_i32, %0[%arg0] : memref<10xi32>
      scf.for %arg1 = %c0 to %c100 step %c1 {
        %2 = arith.index_cast %arg1 : index to i32
        %3 = arith.remsi %2, %c10_i32 : i32
        memref.store %3, %1[%arg0, %arg1] : memref<10x100xi32>
      }
    }
    scf.for %arg0 = %c0 to %c10 step %c1 {
      %2 = "polygeist.subindex"(%1, %arg0) : (memref<10x100xi32>, index) -> memref<100xi32>
      %3 = memref.load %0[%arg0] : memref<10xi32>
      %4 = call @if_loop_1(%2, %3) : (memref<100xi32>, i32) -> i32
    }
    return %c0_i32 : i32
  }
  func private @if_loop_1(memref<100xi32>, i32) -> i32 attributes {llvm.linkage = #llvm.linkage<external>}
}

yields

error: expected result type to be 'memref<1x100xi32, affine_map<(d0, d1)[s0] -> (d0 * 100 + s0 + d1)>>' or a rank-reduced version. (mismatch of result layout) 
      %2 = "polygeist.subindex"(%1, %arg0) : (memref<10x100xi32>, index) -> memref<100xi32>

The issue seems to be that the following is no longer considered valid:

      %2 = memref.subview %1[%arg0, 0] [1, 100] [1, 1] : memref<10x100xi32> to memref<100xi32>

Looks like memref<100xi32> is no longer a valid output type for that operation... Tried to look a bit around and change the internals of SubToSubIndex but so far unsuccessful.

mortbopet avatar Jan 06 '22 13:01 mortbopet

This smells like something from a thread I remember from last year. Specifically this comment: https://llvm.discourse.group/t/memref-subview-affine-map-and-symbols/4851/9.

This is the patch in question: https://github.com/llvm/llvm-project/commit/d69f5e197c529e7f87d47edb11c7135decee1d88.

The error message here (mismatch of result layout) seems to hint that we hit the new condition.

I don't have any idea what the appropriate solution is offhand, but hopefully this helps.

mikeurbach avatar Jan 06 '22 17:01 mikeurbach

I think the issue here partially stems from the fact that to create a consistent type system, we need to have offset operations that maintain the same type. It looks like (upstream) subview does not maintain this property, and thus why we created subindex.

I had a chat with @ftynse yesterday about this and I think we came to the conclusion that subindex is technically distinct from various upstream view-like operations. We (along with @chelini and @kumasento) had a discussion about upstreaming the op a while ago, but ended up working on other things and not pushing that through.

@mortbopet what precisely is your requirement behind not using this non-upstream op. Would your problem be mitigated by telling your downstream tool about the op or potentially would you have the cycles to look into upstreaming it (we even have lowering to LLVM/canonicalizations/etc that would be nice if they could be made upstream)?

wsmoses avatar Jan 06 '22 17:01 wsmoses

@wsmoses it's driven from a desire that i have to run various other opt tools on the generated IR (i.e. mlir-opt, which do not link against the Polygeist dialect libraries) to perform transformations. I considered running the Polygeist-to-llvm pass, but the issue here is then that this pass converts multiple dialect at once (memref, std, arith), to LLVM, rather than solely Polygeist - which then conflicts with these transformations that i'd want to run, which work on non-LLVM IR.

mortbopet avatar Jan 06 '22 18:01 mortbopet

Ah, what I meant by the conversion to LLVM is that presumably upstream wouldn't take a new op unless it could be lowered to LLVM (which happily this can be) -- not that you should do the lowering of that op first.

It then sounds like you can either run polygeist-opt (which should contain various base MLIR transforms -- but now with polygeist registered as a dialect), or perhaps look into upstreaming the op?

wsmoses avatar Jan 06 '22 18:01 wsmoses

I see... I'll just investigate whether i can get my flow to work by enabling --allow-unregistered-dialect on the various other opt tools and then pass it by polygeist-opt at the end to lower things to LLVM - which i nevertheless need, the goal is to generate something that mlir-cpu-runner eventually can execute, however, i have these few transformation passes that need to get applied beforehand.

mortbopet avatar Jan 06 '22 18:01 mortbopet