Han-Chung Wang

Results 73 issues of Han-Chung Wang

This is the IR from MobileNetV3. These two element-wise generic ops can be fused, but currently they are not. We'd like to fuse them into a single generic op,...

codegen

We've been hitting issues with vectorizing table lookups. I had an offline discussion with @MaheshRavishankar. The main issue is that we don't handle the `tensor.extract` op in Linalg vectorization. There...

help wanted
codegen
codegen/llvm

### What happened? I found that there are redundant buffer allocations in LinalgExt ops. This also happens in normal Linalg ops. The main issue is that a constant op is...

bug 🐞
codegen

The pass only handles the Conv2DNhwcHwcfOp case. We should generalize it to handle NCHW cases. Filing this issue to track it.

codegen

As a follow-up to https://github.com/google/iree/issues/8411: the quantized convolution ops are not vectorized. This introduces temporary buffer allocations because of a type mismatch. We landed https://github.com/google/iree/pull/8526 to work around it. Ideally, we'd like to...

codegen
codegen/llvm

The benchmarks are tracked under experimental-flags.

buildkite:benchmark
buildkite:benchmark-x86_64
buildkite:benchmark-riscv

buildkite:benchmark
buildkite:benchmark-x86_64

IREE is about 2x slower than the baseline on c2-standard-16 in single-threaded mode (526 ms vs. 278 ms). Some quantized GEMMs are very slow in this case. ## dispatch_8_matmul_384x128x512 This is a fill...

codegen/llvm