Benoit Jacob
`mmt4d`->consumer fusion is a major codegen optimization opportunity. The typical consumers include the `unpack` following the `mmt4d`, and element-wise ops (generics with only `parallel` iterators) that are typically found as...
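As a rough illustration of what `mmt4d`->consumer fusion buys, here is a minimal Python sketch (not IREE code). It assumes the usual `mmt4d` layout, `lhs[m1][k1][m0][k0]` and `rhs[n1][k1][n0][k0]` accumulating into `acc[m1][n1][m0][n0]`, ignores padding in `unpack`, and uses hypothetical helper names (`tile_matmul`, `unfused`, `fused`). The unfused version materializes the whole packed accumulator, unpacks it, then runs the element-wise op as a separate pass; the fused version applies the element-wise op to each tile while it is still at hand and writes straight to the unpacked result.

```python
def tile_matmul(lhs_row, rhs_row, K1, M0, N0, K0):
    """Compute one M0 x N0 accumulator tile of mmt4d.

    lhs_row is indexed [k1][m0][k0], rhs_row is indexed [k1][n0][k0].
    """
    tile = [[0.0] * N0 for _ in range(M0)]
    for k1 in range(K1):
        for m0 in range(M0):
            for n0 in range(N0):
                for k0 in range(K0):
                    tile[m0][n0] += lhs_row[k1][m0][k0] * rhs_row[k1][n0][k0]
    return tile


def unfused(lhs, rhs, dims, elementwise):
    """mmt4d, then unpack, then a separate element-wise pass."""
    M1, N1, K1, M0, N0, K0 = dims
    # Step 1: materialize the full packed accumulator acc[m1][n1][m0][n0].
    acc = [[tile_matmul(lhs[m1], rhs[n1], K1, M0, N0, K0) for n1 in range(N1)]
           for m1 in range(M1)]
    # Step 2: unpack into a row-major (M1*M0) x (N1*N0) matrix.
    out = [[0.0] * (N1 * N0) for _ in range(M1 * M0)]
    for m1 in range(M1):
        for n1 in range(N1):
            for m0 in range(M0):
                for n0 in range(N0):
                    out[m1 * M0 + m0][n1 * N0 + n0] = acc[m1][n1][m0][n0]
    # Step 3: element-wise consumer as its own pass over the whole result.
    return [[elementwise(x) for x in row] for row in out]


def fused(lhs, rhs, dims, elementwise):
    """Same result, but the consumers run on each tile right after it is computed."""
    M1, N1, K1, M0, N0, K0 = dims
    out = [[0.0] * (N1 * N0) for _ in range(M1 * M0)]
    for m1 in range(M1):
        for n1 in range(N1):
            tile = tile_matmul(lhs[m1], rhs[n1], K1, M0, N0, K0)
            # Unpack and element-wise op applied while the tile is still local.
            for m0 in range(M0):
                for n0 in range(N0):
                    out[m1 * M0 + m0][n1 * N0 + n0] = elementwise(tile[m0][n0])
    return out
```

In this sketch the fused variant never stores the packed accumulator; in real codegen the analogous saving would be keeping the tile in registers instead of round-tripping through memory.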
Progress on https://github.com/iree-org/iree/issues/18561. This PR supersedes https://github.com/iree-org/iree/pull/18587. It introduces a new command line flag, `--iree-llvmcpu-logging-unspecified-target-cpu`, with three values: * Empty string (the default) preserves the current behavior of silently falling...
This replaces some constants that were hardcoded in GPUMaterializeEncoding.cpp with actual GPU target parameters. The logic in `getSwizzle` was doing wonky things with its own local `const int targetPreferredLoadBitWidth =...
The current code had its own list of MFMA intrinsics that we can use, then checked that against the target. Flipping this around, we can simply query the list from...
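One way to picture the "flipping this around" is the Python sketch below. It is not the IREE implementation: `Target`, `LOCAL_INTRINSICS`, `usable_before` and `usable_after` are hypothetical names, the intrinsic strings are illustrative placeholders, and it assumes the list is queried from some target description, which the truncated sentence above does not confirm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """Hypothetical stand-in for a GPU target description."""
    name: str
    mma_intrinsics: tuple  # intrinsics this target reports as supported


# Before: the pass keeps its own list (placeholder names) ...
LOCAL_INTRINSICS = ("MFMA_A", "MFMA_B", "MFMA_C")


def usable_before(target):
    # ... and filters that list against what the target supports.
    return [i for i in LOCAL_INTRINSICS if i in target.mma_intrinsics]


def usable_after(target):
    # After: the target is the single source of truth; just ask it.
    return list(target.mma_intrinsics)
```

With the second form, an intrinsic added to the target description is picked up without touching a second list, which is the maintenance benefit this sketch is meant to show.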
In https://github.com/iree-org/iree/pull/18839 we are introducing 3 new fields to `TargetGpuAttr`: `max_load_instruction_bits`, `simds_per_wgp`, `vgpr_space_bits` on all GPUs. For now they are only populated for CDNA3. They should be populated for other...
The current tile-selection heuristic in GPUMaterializeEncoding only ever expands to subgroups in the N dimension, never in the M dimension. That keeps this logic a little simpler, but...
The tile size selection heuristic in GPUMaterializeEncoding is focused on the generic case of non-narrow shapes; then at the end, a fix-up is applied to adjust to narrow shapes. This...
https://github.com/llvm/llvm-project/pull/100667 renamed a header, so this adapts the `#include`. I need to cherry-pick this commit in IREE as we are integrating these llvm-project changes. You will need to apply this...
This part of the spec, https://gpuweb.github.io/gpuweb/wgsl/#differences-from-ieee754

> Finite Math Assumption:
> * [Overflow](https://gpuweb.github.io/gpuweb/wgsl/#ieee754-overflow), infinities, and NaNs generated before [shader execution](https://gpuweb.github.io/gpuweb/wgsl/#shader-execution-start) [will](https://gpuweb.github.io/gpuweb/wgsl/#behavioral-requirements) generate errors.

is very hard to satisfy for compilers...
This implements the fix suggested by @heshuju in https://github.com/llvm/torch-mlir/issues/4108. It fixes an issue that was blocking the LLVM integrate in IREE. https://github.com/iree-org/iree/actions/runs/15626565095/job/44156258415?pr=21092