Problem with --lower-to-llvm when loading/storing a memref with a layout map
Hello,
I am trying to lower the following code to the LLVM dialect:
func @load_store() {
^bb0:
  // CHECK: %0 = alloc() : memref<1024x64xf32, 1>
  %0 = alloc() : memref<1024x64xf32, (d0, d1) -> (d0, d1), 1>
  %1 = constant 0 : index
  %2 = constant 1 : index
  // CHECK: %1 = load %0[%c0, %c1] : memref<1024x64xf32, 1>
  %3 = load %0[%1, %2] : memref<1024x64xf32, (d0, d1) -> (d0, d1), 1>
  // CHECK: store %1, %0[%c0, %c1] : memref<1024x64xf32, 1>
  store %3, %0[%1, %2] : memref<1024x64xf32, (d0, d1) -> (d0, d1), 1>
  return
}
I am using the command: mlir-opt $INPUT_FILE --lower-affine --lower-to-llvm
But I got the following error message:
./mytmptest/memref_internal.mlir:10:8: error: 'llvm.getelementptr' op operand #0 must be LLVM dialect type
%3 = load %0[%1, %2] : memref<1024x64xf32, (d0, d1) -> (d0, d1), 1>
^
Could you please help me figure out the correct way to generate LLVM code for this or similar MLIR code?
Thanks, Rui
As mentioned in the doc (https://github.com/tensorflow/mlir/blob/master/g3doc/ConversionToLLVMDialect.md#memref-descriptor), there is currently no support for non-default memory spaces or non-trivial layout maps in the conversion to the LLVM dialect. (d0, d1) -> (d0, d1) is actually trivial, but the memory space is a problem.
What's your use case for the memory spaces?
You are welcome to contribute a patch to fix this, if interested -- I can point you to the locations in the code where extra work is needed.
Hello,
Thanks for the reply. My objective is actually to find out how MLIR generates LLVM code for a non-trivial layout map and a non-default memory space within a memref.
If I am not misunderstanding the term “memory space”, we intend it to represent CPU caches. We want to copy data from memory to cache in a different layout efficiently.
I would be glad to contribute some patches. I am also cc'ing my advisors for discussion.
Thanks, Rui
If I am not misunderstanding the term “memory space”, we intend it to represent CPU caches. We want to copy data from memory to cache in a different layout efficiently.
Actually it is fairly target specific, but in general I've seen "memory space" mostly used to refer to physically/logically separated memory, as in two pointers with the same value but different memory space would not load from the same memory. One example in MLIR could be a separation between the host memory and a discrete GPU, or two different processes that don't share the same virtual address space.
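For illustration, a minimal sketch in the MLIR syntax of that era (the meaning attached to space 1 here is an assumption for the example, not something MLIR itself defines):

%host = alloc() : memref<1024xf32>      // default memory space 0, e.g. host DRAM
%dev  = alloc() : memref<1024xf32, 1>   // memory space 1, e.g. discrete GPU memory

Even if the two underlying pointers happened to have the same numeric value, they would not refer to the same memory.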
My objective is actually to find out how MLIR generates LLVM code for a non-trivial layout map and a non-default memory space within a memref.
Well, it does not :) It's actually pretty easy to do. Just replace any load/store from a memref with a non-trivial layout by affine.apply of the layout map to the access subscripts, and use the result of affine.apply as the new access subscripts, treating the memref as if it had an identity layout.
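For example, a hedged sketch of that rewrite for a hypothetical linearizing layout map (%A, %A_flat, %i, and %j are illustrative names, not the output of an actual pass):

// Before: load through a non-trivial layout map.
%v = load %A[%i, %j] : memref<1024x64xf32, (d0, d1) -> (d0 * 64 + d1)>

// After: apply the layout map explicitly, then load from an equivalent
// identity-layout memref whose shape matches the map's codomain.
%idx = affine.apply (d0, d1) -> (d0 * 64 + d1)(%i, %j)
%v = load %A_flat[%idx] : memref<65536xf32>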
If I am not misunderstanding the term “memory space”, we intend it to represent CPU caches. We want to copy data from memory to cache in a different layout efficiently.
I don't think memory spaces were meant for CPU (physical) caches, because you cannot control those caches explicitly. If by cache you mean just a local buffer where you would store a part of your data with a different layout, I don't see a reason why it should be in a different memory space. It's still in main memory.
In addition, you will also need a utility, for the following reason. Changing the layout map of a memref (to the identity layout) changes the memref's type. An SSA value's type, once created, can't be changed; you have to replace the defining operation with a new one that creates an SSA value of the new type, and then replace all uses of the previous value with it. The situation is the same for something as simple as changing the memory space of a memref. It's thus useful to have a utility that does this (it's straightforward). Note that the utility has to return failure if the memref escapes (is passed to a function call) or is received as an argument (see the sketch below), because in such cases one would have to do the replacement globally, updating function signatures; this is more involved, and MLIR currently doesn't have the utilities/API for it AFAIK.
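For example, a hypothetical sketch of the escaping case in which such a utility has to fail (@callee and @caller are illustrative names):

func @callee(memref<1024x64xf32, (d0, d1) -> (d0 * 64 + d1)>)

func @caller() {
  %0 = alloc() : memref<1024x64xf32, (d0, d1) -> (d0 * 64 + d1)>
  // %0 escapes here: changing its type locally would break @callee,
  // so @callee's signature would have to be rewritten as well.
  call @callee(%0) : (memref<1024x64xf32, (d0, d1) -> (d0 * 64 + d1)>) -> ()
  return
}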
+1. Memory spaces aren't meant to correspond to hardware caches. If you are packing into buffers that are going to be hardware-cache resident, one would just expect the default memory space ('0') to be used for those memrefs, as in the sketch below.
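A hedged sketch of such packing (the sizes, %A, and the tile choice are made up for illustration); the packed copy stays in the default space even though it is small enough to be cache-resident:

// Pack a 32x32 tile of %A, transposed, into a local buffer; both
// memrefs are in the default memory space 0, which is implied.
%buf = alloc() : memref<32x32xf32>
affine.for %i = 0 to 32 {
  affine.for %j = 0 to 32 {
    %v = load %A[%i, %j] : memref<1024x64xf32>
    store %v, %buf[%j, %i] : memref<32x32xf32>
  }
}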
The utility described above is now implemented in PR #104.
With PR #104, if you run -simplify-affine-maps, all non-trivial layout maps are converted to trivial (identity) ones, subject to certain conditions: the rewrite is only intra-procedural, and the memref must not escape.
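Illustratively, a hedged before/after sketch for a permutation layout (what one would expect conceptually, not verbatim pass output):

// Before -simplify-affine-maps: a transposed layout map.
%0 = alloc() : memref<1024x64xf32, (d0, d1) -> (d1, d0)>

// After: identity layout; the permutation is folded into the allocated
// shape, and accesses %0[%i, %j] become %0[%j, %i].
%0 = alloc() : memref<64x1024xf32>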