iree
A retargetable MLIR-based machine learning compiler and runtime toolkit.
### What happened?

Error log:

```
(turb.env) PS C:\Users\eagarvey\SHARK\SHARK-Turbine> iree-compile --iree-input-type=torch --iree-vm-bytecode-module-output-format=flatbuffer-binary --iree-hal-target-backends=rocm --mlir-print-debuginfo=false --mlir-print-op-on-diagnostic=false --iree-hal-target-backends=rocm --iree-hip-target=gfx1103 --iree-vm-bytecode-module-output-format=flatbuffer-binary .\sd3_mmdit_gfx1103_dps\dispatch_27_attn.mlir
.\sd3_mmdit_gfx1103_dps\dispatch_27_attn.mlir:2:3: error: failed to run translation of source executable to target...
```
Splat constants are a non-canonical form for codegen; we instead prefer fills, for two reasons:
1. Consistency with dynamic shapes
2. Fills are tilable and compose well with tile +...
## How I think about buffer allocation in data-tiling

1. The default path now materializes encodings at a very early stage (i.e., GlobalOpt), while we want to build the late...
If any implicit argument is used, LLVM will reserve 256 bytes of kernarg space and emit metadata requiring the runtime to populate all implicit arguments. The only way to control...
Currently the [upstream pass for loop invariant code motion](https://github.com/llvm/llvm-project/blob/7f1b465c6ae476e59dc90652d58fc648932d23b1/mlir/lib/Transforms/LoopInvariantCodeMotion.cpp#L47) hoists out of all loops, regardless of loop type or loop bounds. This has two issues:
1. This allows hoisting out...
The `Convert1x1FilterConvToMatmul` pass currently fails when there is a non-unit batch N dimension. In such cases, the transformation is still possible, and the N dimension should be folded into the...
For this elementwise + pad dispatch:

```
func.func @main(%8 : tensor, %9 : tensor) -> tensor {
  %c0_f16 = arith.constant 0.0 : f16
  %13 = tensor.empty() : tensor
  %14 =...
```
### Request description

In this issue, https://github.com/nod-ai/SHARK-Platform/issues/264, I encountered an error message that looked like

```
ValueError: :0: NOT_FOUND; HAL device `__device_0` not found or unavailable: #hal.device.target;
```

It would...