iree issues

[compilation][cpu]: failed to legalize operation onnx.Multinomial

17

### What happened? For the given IR: `module { func.func @"torch-jit-export"( %arg6: !torch.vtensor) -> (!torch.vtensor) attributes {torch.onnx_meta.ir_version = 6 : si64, torch.onnx_meta.opset_version = 21 : si64, torch.onnx_meta.producer_name = "pytorch",...`

pdhirajkumarprasad

bug 🐞

hal/cpu

integrations/onnx

[GPU] Clustered Subgroup Reduction

4

### Request description # Motivation A pattern we notice in flash attention kernels is: `A: tensor B: tensor C: tensor D : tensor = matmul(A, B, C) E :...`

Groverkss

enhancement ➕

codegen

onboarding/codegen

[Codegen][CPU] Change AArch64 matmul tile sizes to (6, 16, 1)

4

This reduces the default AArch64 matmul tile sizes from (8, 16, 1) to (6, 16, 1). Originally, (8, 16, 1) was chosen to attempt to use all available vector registers...

MacDue

benchmarks:comp-stats

benchmarks:android-cpu

[PkgCI] Remove attention TD pipeline to test attn C++ pipeline.

3

raikonenfnu

[LLVMGPU][ROCm] Plumb through packed MAD support

### Request description MAD instructions are similar to FMA instructions and perform multiplication and addition within the same instruction. gfx942 supports a packed version: `V_PK_MAD_I16` and `V_PK_MAD_U16` that should allow...

kuhar

enhancement ➕

codegen/hip

onboarding/codegen

[Flow] Don't fuse with truncate ops with consumer

To ensure truncate ops get fused with their producers, don't fuse them with their consumer.

IanWood1

[compiler] VM translation of Module to bytecode crashes

2

### What happened? I got a segfault in `mlir::iree_compiler::IREE::VM::translateModuleToBytecode`. Here is a [vm-translate-to-bytecode-crash.zip](https://github.com/user-attachments/files/16678033/vm-translate-to-bytecode-crash.zip). ### Steps to reproduce your issue Use `compile.sh` in the ZIP. ### What component(s) does this issue...

sogartar

bug 🐞

[EmitC] Adapt to lvalue type

Changes needed to integrate https://github.com/llvm/llvm-project/pull/91475

simon-camp

[Codegen][GPU] Add pass to reuse shared memory buffers in simple cases

1

This PR adds a new pass that tries to reuse shared memory allocations in functions. This pass only does a very basic analysis, assuming no control flow operations (and is...

Max191

Rename the compiler ROCM target to HIP

1

We rewrote the `rocm` runtime hal impl side to `hip`. The corresponding compiler target backend is still called `rocm`. To be consistent and avoid confusion, let's rename the compiler target...

antiagainst

cleanup 🧹

codegen/hip

onboarding/codegen
