Muzammiluddin Syed
### Request description Original issue: https://github.com/iree-org/iree/issues/20699 Incomplete PR: https://github.com/iree-org/iree/pull/20827 ### What component(s) does this issue relate to? Compiler ### Additional context _No response_
### Request description See: https://github.com/iree-org/iree/pull/20468#discussion_r2126945727 ### What component(s) does this issue relate to? _No response_ ### Additional context _No response_
### Request description There is a lowering of `GPUPrintfOp`s in [upstream LLVM](https://github.com/llvm/llvm-project/blob/46f90165be92e08e059dcc07d42347cbf7446a0b/mlir/lib/Conversion/GPUCommon/GPUOpsLowering.h#L143), and it would be helpful for quality of life and debugging if we could also make use of...
### What happened? # Context To make effective use of the DPP operations available on AMD GPUs, the PR below changed the implementation of warp reduction to preserve `subgroup_reduce` ops rather...
Draft PR for initial review. Adds lowering support for RoiAlign ops with static shapes and known sampling ratios.
Context: https://github.com/iree-org/iree/pull/22737#discussion_r2577897836 There are various places in the schedule selection process where we still do not adequately support scaled intrinsics. - The selection of the A.I cutoff points for GEMMs...
For context, see: https://github.com/iree-org/iree/pull/22775#pullrequestreview-3515972644 A lot of the CMake code in our codebase could benefit from bazel-to-cmake (even if not all of our CMake code is completely generatable via...
Context: https://github.com/iree-org/iree/pull/22763#discussion_r2565789172 A cleanup is required to enable the use of `DenseI32ArrayAttr` where possible (`IREECompilerDialectsModule.cpp`).
Use amdsharktuner to collect performance data on the effect of knobs such as workgroup thread count, subgroup count, tile size, etc. on the best performance at various shapes of interest....