iree issues

IREE Generic Vectorizer to support `i1` pattern in conv.

2

Take this file as input: ``` func.func @conv2d_accumulate_2_32_32_32_times_3_3_64_dtype_i1_i1_i1(%lhs: tensor, %rhs: tensor, %acc: tensor) -> tensor { %result = linalg.conv_2d_nchw_fchw {dilations = dense : tensor, strides = dense : tensor} ins(%lhs,...

lialan

[Codegen] Remove --iree-llvmgpu-enable-prefetch

This flag https://github.com/iree-org/iree/blob/d834aa7357179e0d806f3634d2efe3af2fa45171/compiler/src/iree/compiler/Codegen/LLVMGPU/KernelConfig.cpp#L90 enables software prefetching for kernels using shared memory. Software prefetching is disabled by default, and only enabled by this flag. Over time, prefetching became part of GPU...

Groverkss

codegen

quality of life 😊

codegen/rocm

[Codegen] Remove --iree-hip-waves-per-eu

This flag https://github.com/iree-org/iree/blob/d834aa7357179e0d806f3634d2efe3af2fa45171/compiler/plugins/target/ROCM/ROCMTarget.cpp#L93 sets a waves-per-eu attribute for LLVM compilation on **every dispatch** to give the register allocator a hint. https://github.com/iree-org/iree/pull/17365 introduced a way to specify these LLVM func attributes...

Groverkss

codegen

codegen/llvm

quality of life 😊

[Codegen] Remove --iree-codegen-gpu-native-math-precision flag

This flag https://github.com/iree-org/iree/blob/d834aa7357179e0d806f3634d2efe3af2fa45171/compiler/src/iree/compiler/Codegen/Common/PolynomialApproximationPass.cpp#L17 disables polynomial approximation for most math dialect operations, for hardware that supports these math operations directly. It looks like some backends rely on this flag for performance...

Groverkss

codegen

quality of life 😊

[GPU] Long compilation time/excessive ops generated

1

Some multi-reduction dispatches take a long time to compile. For context, https://github.com/iree-org/iree/issues/18479 identifies numerical issues with the current pipeline and https://github.com/iree-org/iree/pull/18519 should solve this issue. But the compilation time for...

IanWood1

codegen

Add shared pooling for `IREE_HAL_BUFFER_USAGE_CONSTANT` buffers.

2

Via a runtime system to allow for multiple instances of the same program to share constants. The complication with implicit sharing is that we only want two of the same...

benvanik

runtime

performance ⚡

hal/api

HTTP Cache Kubernetes Server Service

Request description: We want to introduce an HTTP cache server to the Kubernetes cluster, as it will help with the build times for several jobs: linux_x64_clang in [ci_linux_x64_clang.yml](https://github.com/iree-org/iree/blob/main/.github/workflows/ci_linux_x64_clang.yml) linux_x64_clang_asan...

saienduri

enhancement ➕

infrastructure

[LLVMCPU] Add fold unit extent to CPU codegen pipeline

5

In the case of broadcast + matmul kernels, the outermost dimension (batch dim) is tiled to 1. We want to fold these into tensor.expand_shape after the distribution.

pashu123

Padding failures after LLVM bump

2

Failures are seen in the following tests: LLVMCPU/test/pipeline_pad_tests.mlir. An example of IR from `pipeline_tile_and_fuse.mlir` is here: https://gist.github.com/nirvedhmeshram/3349f2739dfb529fa4800040bf1c8490 It needs to be verified that the generated IR is what we want and...

nirvedhmeshram

Add pass to generalize pack ops if they are consumed by flow.dispatch.tensor.store ops

7

Pack ops can affect tiling decisions, so it is beneficial to generalize them, for example in the IR below: ``` %5 = linalg.generic {indexing_maps = [affine_map (d0, d1)>], iterator_types =...

nirvedhmeshram
