clangir
Lowering through MLIR standard dialects: class, struct, arrays and other issues
I have started lowering the class & struct product types down to MLIR standard dialects by relying on the fundamental MLIR product type (built-in tuple) and that works up to the point that they meet some memory. After some debug, it looks like memref and tuple cannot compose. 🤯 😢
Some past discussion context: https://discourse.llvm.org/t/why-cant-i-have-memref-tuple-i32-i32/1853/6 https://discourse.llvm.org/t/memref-type-and-data-layout/2116
I thought there would be some kind of support for data layout through MemRefElementTypeInterface, but this only works with user-defined types. 😦
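A minimal illustration of the composition problem (a sketch; `tuple` is the built-in MLIR product type, while the `memref` verifier only accepts built-in scalar/vector element types or types implementing `MemRefElementTypeInterface`):

```mlir
// Fine: memref of a built-in scalar element type.
%a = memref.alloca() : memref<4xi32>

// Rejected by the memref verifier: tuple is neither a built-in
// int/float/index/vector/complex type, nor does it implement
// MemRefElementTypeInterface (which only user-defined types can).
// %b = memref.alloca() : memref<4xtuple<i32, i32>>
```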
This is a basic example of where we need standard-dialect support for higher-level language features, as suggested by @joker-eph in a presentation at the last LLVM Dev Meeting.
It is unclear how to move on to integrate better CIR with MLIR standard dialects.
Some possible actions I am thinking of:
- someone has a brilliant idea 😄
- give up and focus on direct CIR → LLVMIR translation;
- just focus on having CIR work well with MLIR standard transformations and analyses;
- translate any struct to an int of the struct size and generate memory cast operations and bit-field extraction/insertion all over the place to emulate the tuple access;
- add better support for `tuple` in MLIR;
- introduce a new memory-friendly MLIR tuple-like type;
- keep `cir.struct` just for that;
- go directly to `llvm.struct` just for that.

How does it work with Flang since F90 introduced derived data types?
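For reference, the "struct as an int of the struct size" option above would look roughly like this (a hypothetical sketch using the `arith` dialect; the field layout is an assumption for illustration, not what CIR actually emits):

```mlir
// struct { i32 a; i32 b; } emulated as a single i64 value %s.
// Extracting field b (assumed to live in the upper 32 bits):
%c32 = arith.constant 32 : i64
%hi  = arith.shrui %s, %c32 : i64
%b   = arith.trunci %hi : i64 to i32

// Inserting a new value %nb into field b:
%nb64 = arith.extui %nb : i32 to i64
%nbhi = arith.shli %nb64, %c32 : i64
%mask = arith.constant 4294967295 : i64  // keep the lower 32 bits
%lo   = arith.andi %s, %mask : i64
%snew = arith.ori %lo, %nbhi : i64
```

Every field access would need such shift/mask sequences, which is what makes this option so noisy.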
FWIW, our goal for the cir dialect is to eventually get closer in functionality to the MLIR dialects. e.g. to put it simply, we would ideally like something similar to a cir.affine at some point. This would line up with your #2 and #3 if I understand correctly.
I like @ChuanqiXu9's idea of throughMLIR starting off on top of the LLVM dialect and getting the pieces moved incrementally to the standard dialects. All the points on improving MLIR with tuple and the like seem like good paths too and can be pursued in parallel.
> translate any struct to an int of the struct size and generate memory cast operations and bit-field extraction/insertion all over the place to emulate the tuple access;
If this allows you to make progress, I'd go for it as well!
> How does it work with Flang since F90 introduced derived data types?
There was an interesting experiment report as part of https://sc24.conference-program.com/presentation/?id=ws_llvmf103&sess=sess754, where the author played with FIR to standard dialects (versus direct to LLVM, which seems to be the default?). I bet he might be able to provide you some extra insights.
My current experiments are done in https://github.com/llvm/clangir/pull/1334. It turns out to be even more complicated, since arrays currently work by chance and a lot of things are not working, like pointers to arrays, which are removed...
The presentation from LLVM Dev Meeting 2024 https://www.youtube.com/watch?v=Bt__BDQivxo "Making upstream MLIR more friendly to programming languages: current upstream limitations, the ptr dialect, and the road ahead" Speaker: Mehdi Amini, Fabian Mora Cordero
hi @keryell, thank you for your work on the lowering process through MLIR.
As mentioned in https://github.com/llvm/clangir/issues/1411, do you have any plans to support built-in functions (e.g., printf)? Alternatively, could you provide some suggestions on how to implement support for these functions within the ThroughMLIR path?
@PikachuHyA implementing printf is not my top priority, since my goal is not to run Hello World 😄 but rather to program useful things on some weird but interesting piece of hardware.
But other contributors are welcome.
There is some progress upstream for supporting arbitrary vector element types, as introduced by https://discourse.llvm.org/t/rfc-allow-pointers-as-element-type-of-vector/85360/43 and https://discourse.llvm.org/t/rfc-allow-arbitrary-vector-element-types/85545, and implemented by https://github.com/llvm/llvm-project/pull/133455.
Now I need to lobby @matthias-springer for a new TensorTypeElementInterface. 😄
> Now I need to lobby @matthias-springer for a new TensorTypeElementInterface. 😄
Last I checked, you should be able to plug any of your type as the Tensor Element type already: https://github.com/llvm/llvm-project/blob/308654608cb8bc5bbd5d4b3779cb7d92920dd6b7/mlir/lib/IR/BuiltinTypes.cpp#L314
What would you expect from a TensorTypeElementInterface ?
In CIR lowering I lower C arrays with value semantics (for example in a C struct) to tensor and pointers or C arrays with reference semantics (usual case for array in C) to memref.
If I want, for example, a C array of pointers with value semantics, I need a tensor of memref, which is not allowed, since for some reason it is not possible to have a tensor of some builtin types.
If at some point tuple makes some progress into the MLIR ecosystem up to the point of being able to represent C++ struct/union/class, I will have some tensor of tuple too, which is forbidden today.
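A sketch of the distinction described above (the function signatures are assumptions for illustration):

```mlir
// C array with reference semantics (e.g. a function parameter
// decaying to a pointer) lowered to memref:
func.func @by_ref(%a: memref<4xi32>) {
  return
}

// C array with value semantics (e.g. inside a struct copied by
// value) lowered to tensor:
func.func @by_val(%a: tensor<4xi32>) -> tensor<4xi32> {
  return %a : tensor<4xi32>
}

// Not valid today: a by-value C array of pointers would need a
// tensor of memref, e.g. tensor<4xmemref<i32>>, which the tensor
// verifier rejects.
```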
> What would you expect from a TensorTypeElementInterface ?
A way to express the fact that I want a tensor of something, even if that something is disallowed today.
There is some work in other PRs going down the path of Polygeist's use of the lower-level LLVM dialect, or just using bytes everywhere:
- https://github.com/llvm/clangir/pull/1565, which goes low level along the item "translate any struct to an int of the struct size and generate memory cast operations and bit-field extraction/insertion all over the place to emulate the tuple access" presented at the top of this issue.
- https://github.com/llvm/clangir/pull/1539 and https://github.com/llvm/clangir/pull/1561, which use the LLVM IR dialect as an MLIR standard dialect the way Polygeist does, going along the item "go directly to llvm.struct just for that" presented at the top of this issue.
This has the advantage of working today, but with the problem of having the low-level LLVM dialect in the middle of higher-level MLIR standard dialects, making it more difficult to target other back-ends such as SPIR-V or various hardware accelerators that do not rely on the LLVM back-end directly but on their own dialects. This would require raising some LLVM dialect constructs back to something higher-level.