Han-Chung Wang
It introduces an `inferVectorSizesFromIR(Value val)` method, so every op can infer its input vector sizes through the use-def chain. This is important for fusion cases (e.g., `generic + pack`) because it assumed...
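As a rough illustration (the function name, shapes, and body below are made up for this sketch, not taken from the actual patch), a `generic + pack` fusion case looks something like the IR below: the `tensor.pack` carries no vector sizes of its own, so they have to be inferred from the producing `linalg.generic` through the chain.

```mlir
// Hypothetical fused dispatch: the pack op's vector sizes must be inferred
// from the producing linalg.generic through the use-def chain.
func.func @generic_pack_fusion(%src: tensor<32x128xf32>,
                               %dest: tensor<4x16x8x8xf32>) -> tensor<4x16x8x8xf32> {
  %empty = tensor.empty() : tensor<32x128xf32>
  %0 = linalg.generic {
      indexing_maps = [affine_map<(d0, d1) -> (d0, d1)>,
                       affine_map<(d0, d1) -> (d0, d1)>],
      iterator_types = ["parallel", "parallel"]}
      ins(%src : tensor<32x128xf32>) outs(%empty : tensor<32x128xf32>) {
    ^bb0(%in: f32, %out: f32):
      %1 = arith.addf %in, %in : f32
      linalg.yield %1 : f32
  } -> tensor<32x128xf32>
  // The pack's tile sizes (8x8) and outer dims (4x16) come from the producer's
  // 32x128 shape, which is what the inference walks back to find.
  %packed = tensor.pack %0 inner_dims_pos = [0, 1] inner_tiles = [8, 8]
      into %dest : tensor<32x128xf32> -> tensor<4x16x8x8xf32>
  return %packed : tensor<4x16x8x8xf32>
}
```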
I'm prototyping direct vectorization of the pack op, and found that some transfer_write/read ops (with masks) are not folded away. IR:
```mlir
func.func @main_dispatch_0_generic_32x128xD_f32xbf16_pack() {
  %cst = arith.constant 0.000000e+00 : bf16
  %c32_i64...
```
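For context, a minimal sketch of the kind of pattern involved (the function name, shapes, and values here are invented, not the actual repro): when the mask covers the whole vector, the masked transfer should be foldable to an unmasked one, and the problem reported above is that such masked transfers survive vectorization.

```mlir
// Hypothetical sketch: an all-true mask on a transfer_write that one would
// expect canonicalization to strip, leaving an unmasked transfer_write.
func.func @masked_write_should_fold(%v: vector<4x4xf32>,
                                    %dest: tensor<8x8xf32>) -> tensor<8x8xf32> {
  %c0 = arith.constant 0 : index
  %mask = vector.constant_mask [4, 4] : vector<4x4xi1>
  %w = vector.transfer_write %v, %dest[%c0, %c0], %mask
      {in_bounds = [true, true]} : vector<4x4xf32>, tensor<8x8xf32>
  return %w : tensor<8x8xf32>
}
```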
Coming from https://github.com/openxla/iree/issues/15661#issuecomment-1854762283, we observed that there is a bug in the PolynomialApproximation pass. I landed a [workaround](https://github.com/openxla/iree/commit/a4a6b4bb74df601ccd558ccc658fa599eae559f3), which rewrites f16 approximations to occur with f32 intermediates. File a new issue...
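To illustrate what "f16 approximations with f32 intermediates" means (a sketch only; `math.tanh` stands in for whichever math op is being approximated, and this is not the exact IR produced by the workaround): the f16 operand is extended to f32, the op is approximated in f32, and the result is truncated back to f16.

```mlir
// Hypothetical before/after shape of the workaround: compute the f16 math op
// through f32 so the polynomial approximation runs at f32 precision.
func.func @tanh_f16_via_f32(%arg0: f16) -> f16 {
  %0 = arith.extf %arg0 : f16 to f32
  %1 = math.tanh %0 : f32
  %2 = arith.truncf %1 : f32 to f16
  return %2 : f16
}
```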
We observed that the vectorization of a reverse-like tensor.extract op was wrong in https://github.com/openxla/iree/issues/16544. Input:
```mlir
func.func @foo_dispatch_0_generic_2x1x3_f32() {
  %c1 = arith.constant 1 : index
  %c0 = arith.constant 0 : index...
```
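For readers unfamiliar with the pattern, here is a simplified sketch of what "reverse-like tensor.extract" means (the function name and shapes are made up, not the repro from the issue): the generic reads its input at index `size - 1 - i` via tensor.extract, which vectorizes as a gather whose index computation was the reported source of the bug.

```mlir
// Hypothetical reverse-like access inside a linalg.generic body.
func.func @reverse_like(%src: tensor<4xf32>) -> tensor<4xf32> {
  %c3 = arith.constant 3 : index
  %empty = tensor.empty() : tensor<4xf32>
  %0 = linalg.generic {
      indexing_maps = [affine_map<(d0) -> (d0)>],
      iterator_types = ["parallel"]}
      outs(%empty : tensor<4xf32>) {
    ^bb0(%out: f32):
      %i = linalg.index 0 : index
      // Read the input in reverse order: src[3 - i].
      %rev = arith.subi %c3, %i : index
      %v = tensor.extract %src[%rev] : tensor<4xf32>
      linalg.yield %v : f32
  } -> tensor<4xf32>
  return %0 : tensor<4xf32>
}
```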
Coming from https://github.com/google/iree/issues/8712, I found that traces are not generated when running the test target. It's really inconvenient when debugging the issue. I have a commit which dumps inputs...
## Overview

This is the umbrella issue that collects tasks toward phase 1. In phase 1, we aim to provide a functional data-tiling GPU path with reasonable performance. In...
@Max191 and I looked at enabling the PadAndVectorDistribution pipeline and found that it failed in vector distribution in one of the cases. To repro:

`iree-opt --pass-pipeline='builtin.module(func.func(iree-llvmgpu-vector-distribute{test-layout}, canonicalize, cse))' ~/repro.mlir`

```mlir
func.func @foo()...
```