
A retargetable MLIR-based machine learning compiler and runtime toolkit.

Results 759 iree issues

### What happened? Error log: ``` (turb.env) PS C:\Users\eagarvey\SHARK\SHARK-Turbine> iree-compile --iree-input-type=torch --iree-vm-bytecode-module-output-format=flatbuffer-binary --iree-hal-target-backends=rocm --mlir-print-debuginfo=false --mlir-print-op-on-diagnostic=false --iree-hal-target-backends=rocm --iree-hip-target=gfx1103 --iree-vm-bytecode-module-output-format=flatbuffer-binary .\sd3_mmdit_gfx1103_dps\dispatch_27_attn.mlir .\sd3_mmdit_gfx1103_dps\dispatch_27_attn.mlir:2:3: error: failed to run translation of source executable to target...

bug 🐞
help wanted

Splat constants are the non-canonical form for codegen; we instead prefer fills for two reasons: 1. Consistency with dynamic shapes 2. Fills are tilable and compose well with tile +...

benchmarks:comp-stats

## How I think about buffer allocation in data-tiling 1. The default path now materializes encodings at a very early stage (i.e., GlobalOpt), while we want to build the late...

codegen

If any implicit argument is used, LLVM will reserve 256 bytes of kernarg space and emit metadata requiring the runtime to populate all implicit arguments. The only way to control...

performance ⚡
codegen/rocm

Currently the [upstream pass for loop invariant code motion](https://github.com/llvm/llvm-project/blob/7f1b465c6ae476e59dc90652d58fc648932d23b1/mlir/lib/Transforms/LoopInvariantCodeMotion.cpp#L47) performs hoisting on all loops independent of loop type or loop bounds. This has two issues: 1. This allows hoisting out...

enhancement ➕
good first issue 🌱

The `Convert1x1FilterConvToMatmul` pass currently fails when there is a non-unit batch N dimension. In such cases, the transformation is still possible, and the N dimension should be folded into the...
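
To illustrate why the transformation remains valid with a non-unit batch, here is a minimal pure-Python sketch. The snippet above is cut off before naming the dimension that N is folded into. The sketch assumes, purely for illustration, an NHWC input, a `[C][F]` filter, stride 1, no padding, and that N is collapsed together with H and W into the matmul row dimension. The function names are hypothetical and are not part of IREE or of the pass itself.

```python
def conv_1x1_nhwc(x, w):
    """Direct 1x1 convolution (stride 1, no padding).

    x: nested list shaped [N][H][W][C]; w: nested list shaped [C][F].
    Returns a nested list shaped [N][H][W][F].
    """
    return [[[[sum(px[c] * w[c][f] for c in range(len(w)))
               for f in range(len(w[0]))]
              for px in row]
             for row in img]
            for img in x]


def matmul(a, b):
    """Plain (M x K) @ (K x F) matrix product on nested lists."""
    return [[sum(r[k] * b[k][f] for k in range(len(b)))
             for f in range(len(b[0]))]
            for r in a]


def conv_1x1_as_matmul(x, w):
    """Same result as conv_1x1_nhwc, computed as a single matmul.

    Assumption (not taken from the issue text): N, H and W are
    collapsed into one row dimension of size N*H*W.
    """
    n, h, wd = len(x), len(x[0]), len(x[0][0])
    # Collapse [N][H][W][C] into (N*H*W) rows of C channels each.
    rows = [px for img in x for row in img for px in row]
    flat = matmul(rows, w)
    # Expand the (N*H*W) x F result back to [N][H][W][F].
    return [[[flat[(i * h + j) * wd + k] for k in range(wd)]
             for j in range(h)]
            for i in range(n)]


# Small integer example with a non-unit batch: N=2, H=2, W=3, C=4, F=5.
N, H, W, C, F = 2, 2, 3, 4, 5
x = [[[[(i + 2 * j + 3 * k + c) % 7 for c in range(C)]
       for k in range(W)]
      for j in range(H)]
     for i in range(N)]
w = [[(c * F + f) % 5 - 2 for f in range(F)] for c in range(C)]

assert conv_1x1_as_matmul(x, w) == conv_1x1_nhwc(x, w)
```

Because a 1x1 filter touches each pixel independently, every `(n, h, w)` position is just one row of a matrix product over the channel dimension, so the batch size does not affect correctness under these assumptions.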

For this elementwise + pad dispatch ``` func.func @main(%8 : tensor, %9 : tensor) -> tensor { %c0_f16 = arith.constant 0.0 : f16 %13 = tensor.empty() : tensor %14 =...

### Request description In this issue, https://github.com/nod-ai/SHARK-Platform/issues/264 I encountered an error message that looked like ``` ValueError: :0: NOT_FOUND; HAL device `__device_0` not found or unavailable: #hal.device.target; ``` It would...

enhancement ➕