clangir
Lowering through MLIR standard dialects: class, struct, arrays and other issues
I have started lowering the class & struct product types down to MLIR standard dialects by relying on the fundamental MLIR product type (built-in tuple) and that works up to the point that they meet some memory. After some debug, it looks like memref and tuple cannot compose. 🤯 😢
Some past discussion context: https://discourse.llvm.org/t/why-cant-i-have-memref-tuple-i32-i32/1853/6 https://discourse.llvm.org/t/memref-type-and-data-layout/2116
I thought there would be some kind of support for data layout through MemRefElementTypeInterface, but this only works with user-defined types. 😦
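A minimal illustration of the composition problem (a sketch; `tuple` is the built-in MLIR product type, while the `memref` verifier only accepts built-in scalar/vector element types or types implementing `MemRefElementTypeInterface`):

```mlir
// Fine: memref of a built-in scalar element type.
%a = memref.alloca() : memref<4xi32>

// Rejected by the memref verifier: tuple is neither a built-in
// int/float/index/vector/complex type, nor does it implement
// MemRefElementTypeInterface (which only user-defined types can).
// %b = memref.alloca() : memref<4xtuple<i32, i32>>
```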
This is a basic example of where we need standard-dialect support for higher-level language features, as suggested by @joker-eph in a presentation at the last LLVM Dev Meeting.
It is unclear how to move on to integrate better CIR with MLIR standard dialects.
Some possible actions I am thinking of:
- someone has a brilliant idea 😄
- give up and focus on direct CIR → LLVMIR translation;
- just focus on having CIR work well with MLIR standard transformations and analyses;
- translate any struct to an int of the struct size and generate memory cast operations and bit-field extraction/insertion all over the place to emulate the tuple access;
- add better support for `tuple` in MLIR;
- introduce a new memory-friendly MLIR tuple-like type;
- keep `cir.struct` just for that;
- go directly to `llvm.struct` just for that.

How does it work with Flang since F90 introduced derived data types?
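For reference, the "struct as an int of the struct size" option above would look roughly like this (a hypothetical sketch using the `arith` dialect; the field layout is an assumption for illustration, not what CIR actually emits):

```mlir
// struct { i32 a; i32 b; } emulated as a single i64 value %s.
// Extracting field b (assumed to live in the upper 32 bits):
%c32 = arith.constant 32 : i64
%hi  = arith.shrui %s, %c32 : i64
%b   = arith.trunci %hi : i64 to i32

// Inserting a new value %nb into field b:
%nb64 = arith.extui %nb : i32 to i64
%nbhi = arith.shli %nb64, %c32 : i64
%mask = arith.constant 4294967295 : i64  // keep the lower 32 bits
%lo   = arith.andi %s, %mask : i64
%snew = arith.ori %lo, %nbhi : i64
```

Every field access would need such shift/mask sequences, which is what makes this option so noisy.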
FWIW, our goal for the cir dialect is to eventually get closer in functionality to the MLIR dialects. e.g. to put it simply, we would ideally like something similar to a cir.affine at some point. This would line up with your #2 and #3 if I understand correctly.
I like @ChuanqiXu9's idea of throughMLIR starting off on top of the LLVM dialect and getting the pieces moved incrementally to the standard dialects. All the points on improving MLIR with tuple and the like seem like good paths too and can be pursued in parallel.
> translate any struct to an int of the struct size and generate memory cast operations and bit-field extraction/insertion all over the place to emulate the tuple access;
If this allows you to make progress, I'd go for it as well!
> How does it work with Flang since F90 introduced derived data types?
There was an interesting experiment report as part of https://sc24.conference-program.com/presentation/?id=ws_llvmf103&sess=sess754, where the author played with FIR to standard dialects (versus direct to LLVM, which seems to be the default?). I bet he might be able to provide you some extra insights.
My current experiments are done in https://github.com/llvm/clangir/pull/1334. It turns out to be even more complicated, since arrays currently work by chance and a lot of things are not working, like pointers to arrays, which are removed...
The presentation from LLVM Dev Meeting 2024 https://www.youtube.com/watch?v=Bt__BDQivxo "Making upstream MLIR more friendly to programming languages: current upstream limitations, the ptr dialect, and the road ahead" Speaker: Mehdi Amini, Fabian Mora Cordero
hi @keryell, thank you for your work on the lowering process through MLIR.
As mentioned in https://github.com/llvm/clangir/issues/1411, do you have any plans to support built-in functions (e.g., printf)? Alternatively, could you provide some suggestions on how to implement support for these functions within the ThroughMLIR path?
@PikachuHyA implementing printf is not my top priority, since my goal is not to run Hello World 😄 but rather to program useful things on some weird but interesting piece of hardware.
But other contributors are welcome.
There is some progress upstream for supporting arbitrary vector element types, as introduced by https://discourse.llvm.org/t/rfc-allow-pointers-as-element-type-of-vector/85360/43 and https://discourse.llvm.org/t/rfc-allow-arbitrary-vector-element-types/85545, and implemented by https://github.com/llvm/llvm-project/pull/133455.
Now I need to lobby @matthias-springer for a new TensorTypeElementInterface. 😄
> Now I need to lobby @matthias-springer for a new TensorTypeElementInterface. 😄
Last I checked, you should be able to plug any of your type as the Tensor Element type already: https://github.com/llvm/llvm-project/blob/308654608cb8bc5bbd5d4b3779cb7d92920dd6b7/mlir/lib/IR/BuiltinTypes.cpp#L314
What would you expect from a TensorTypeElementInterface ?
In CIR lowering I lower C arrays with value semantics (for example in a C struct) to tensor and pointers or C arrays with reference semantics (usual case for array in C) to memref.
If I want, for example, a C array of pointers with value semantics, I need a tensor of memref, which is not allowed, since for some reason it is not possible to have a tensor of some builtin types.
If at some point tuple makes some progress into the MLIR ecosystem up to the point of being able to represent C++ struct/union/class, I will have some tensor of tuple too, which is forbidden today.
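A sketch of the distinction described above (the function signatures are assumptions for illustration):

```mlir
// C array with reference semantics (e.g. a function parameter
// decaying to a pointer) lowered to memref:
func.func @by_ref(%a: memref<4xi32>) {
  return
}

// C array with value semantics (e.g. inside a struct copied by
// value) lowered to tensor:
func.func @by_val(%a: tensor<4xi32>) -> tensor<4xi32> {
  return %a : tensor<4xi32>
}

// Not valid today: a by-value C array of pointers would need a
// tensor of memref, e.g. tensor<4xmemref<i32>>, which the tensor
// verifier rejects.
```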
> What would you expect from a TensorTypeElementInterface ?
A way to express the fact that I want a tensor of something, even if that something is disallowed today.
There is some work in other PRs going down the path of Polygeist's use of the lower-level LLVM dialect, or just using bytes everywhere:
- https://github.com/llvm/clangir/pull/1565, which goes low level along the item "translate any struct to an int of the struct size and generate memory cast operations and bit-field extraction/insertion all over the place to emulate the tuple access" presented at the top of this issue.
- https://github.com/llvm/clangir/pull/1539 and https://github.com/llvm/clangir/pull/1561, which use the LLVM IR dialect as an MLIR standard dialect the way Polygeist does, going along the item "go directly to llvm.struct just for that" presented at the top of this issue.
This has the advantage of working today, but with the problem of having the low-level LLVM dialect in the middle of higher-level MLIR standard dialects, making it more difficult to target other back-ends such as SPIR-V or various hardware accelerators that do not rely on the LLVM back-end directly but on their own dialects. This would require raising some LLVM dialect constructs back to something higher-level.