[GPU][DT] Materialize encodings for GPU targets
IMO, the pass would look at the IREE::GPU::MmaInterfaceAttr attribute to enumerate tile sizes and set up encoding configs for each operand role; it gives us the set of supported intrinsics. E.g.,
https://github.com/iree-org/iree/blob/1fcb89d36e1bc0d8c7cc2818b1025f182bfc4a75/compiler/src/iree/compiler/Codegen/LLVMGPU/KernelConfig.cpp#L389-L399
I think, as an initial step, we can set the inner tile sizes to the intrinsic sizes and increase them later (for better unrolling). As discussed today, let's use linalg.generic to represent the mmt4d-like op for now.
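To make it concrete, here is a sketch of what the mmt4d-like linalg.generic could look like, assuming a hypothetical 16x16x16 f16 intrinsic, i.e., inner tiles (M0, N0, K0) = (16, 16, 16); the maps and iterator types mirror linalg.mmt4d semantics:

```mlir
// Sketch only: tile sizes and element types are assumptions, not the
// final design. LHS is MxKxM0xK0, RHS is NxKxN0xK0, ACC is MxNxM0xN0.
#map_lhs = affine_map<(m, n, k, m0, n0, k0) -> (m, k, m0, k0)>
#map_rhs = affine_map<(m, n, k, m0, n0, k0) -> (n, k, n0, k0)>
#map_out = affine_map<(m, n, k, m0, n0, k0) -> (m, n, m0, n0)>
%res = linalg.generic
    {indexing_maps = [#map_lhs, #map_rhs, #map_out],
     iterator_types = ["parallel", "parallel", "reduction",
                       "parallel", "parallel", "reduction"]}
    ins(%lhs, %rhs : tensor<?x?x16x16xf16>, tensor<?x?x16x16xf16>)
    outs(%acc : tensor<?x?x16x16xf32>) {
  ^bb0(%a: f16, %b: f16, %c: f32):
    %ea = arith.extf %a : f16 to f32
    %eb = arith.extf %b : f16 to f32
    %mul = arith.mulf %ea, %eb : f32
    %add = arith.addf %mul, %c : f32
    linalg.yield %add : f32
} -> tensor<?x?x16x16xf32>
```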
Putting an additional resource here before I forget it; we can review it together when we discuss the details.
Here is an upstream method with very helpful logic. Ideally, we should refactor it and create a new method that returns the packed generic op. We can't use the method directly in the materialization pass because it also generates pack/unpack ops. What we need is to get the packed linalg op in the materialization pattern and replace the contraction op with it.
https://github.com/llvm/llvm-project/blob/4b75fcf0a50f4be955b611e8e20d84d90ea133c8/mlir/include/mlir/Dialect/Linalg/Transforms/Transforms.h#L1120-L1130
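For reference, a sketch of what a packing rewrite like the one above emits for a plain matmul (assuming 16x16 inner tiles; the exact perms/tiles depend on the intrinsic). The materialization pattern only wants the packed linalg.generic in the middle; the surrounding tensor.pack/tensor.unpack ops should instead come from materializing set/unset encodings:

```mlir
// Sketch only: shapes and dim orders are illustrative assumptions.
%lhs_p = tensor.pack %lhs inner_dims_pos = [0, 1] inner_tiles = [16, 16]
    into %lhs_dest : tensor<?x?xf16> -> tensor<?x?x16x16xf16>
%rhs_p = tensor.pack %rhs outer_dims_perm = [1, 0]
    inner_dims_pos = [1, 0] inner_tiles = [16, 16]
    into %rhs_dest : tensor<?x?xf16> -> tensor<?x?x16x16xf16>
%acc_p = tensor.pack %acc inner_dims_pos = [0, 1] inner_tiles = [16, 16]
    into %acc_dest : tensor<?x?xf32> -> tensor<?x?x16x16xf32>
// ... packed (mmt4d-like) linalg.generic on the 4-D tensors ...
%res = tensor.unpack %res_p inner_dims_pos = [0, 1] inner_tiles = [16, 16]
    into %res_dest : tensor<?x?x16x16xf32> -> tensor<?x?xf32>
```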
Let's use this issue because it already has some context, e.g., the upstream method that we can use for this work. In the recent LLVMGPUTileAndFuse pipeline (pipeline_test), we already generate some pack/unpack ops in matmul codegen. This is what we want in GPU data-tiling.
The goal now is to move those packs into the materialization of set/unset encodings.
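In other words, the desired direction is roughly (sketch; the encoding attribute is elided and the 16x16 LHS tile is an assumed intrinsic size, not the final choice):

```mlir
// Before materialization: the encoding carries the contraction info.
%e = iree_encoding.set_encoding %lhs
    : tensor<?x?xf16> -> tensor<?x?xf16, #lhs_encoding>

// After materialization: the encoding resolves to a concrete pack
// whose inner tiles come from the chosen GPU intrinsic, so codegen
// no longer needs to create the packs itself.
%p = tensor.pack %lhs inner_dims_pos = [0, 1] inner_tiles = [16, 16]
    into %dest : tensor<?x?xf16> -> tensor<?x?x16x16xf16>
```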