Han-Chung Wang

Results 336 comments of Han-Chung Wang

It looks like it failed in SetEncoding (or related passes). @pashu123 given that you want to get more involved in these tasks, would you like to triage the issue when...

It is not compiled because vector.gather is lowered to a lot of vector ops -- which should be fixed. The other issue is that we are having two generic ops...

I think there are still action items in the issue; the look-up table fusion is scaring me. We should fix that at least. The tile sizes for vector.gather are problematic....

Confirmed that the fusion is not expected. @MaheshRavishankar will fix it. For the gather codegen issue, @pashu123 could you create an input case for the generic op and see what's...

Have you tried tensor constants and tensor.extract op? We are able to vectorize `tensor.extract` using vector.gather/vector.transfer_read ops. To repro: `iree-opt --pass-pipeline="builtin.module(func.func(iree-codegen-generic-vectorization{enable-vector-masking=false use-configured-vector-sizes=false}))" ~/z.mlir` ```mlir func.func @main(%30 : tensor

I see, feel free to reach out if there are any vectorization issues/questions. I'm happy to help. The flags are available with iree-compile. You'll need to add the options like:...

Putting an additional resource here before I forget it. We can review it together when we're discussing details. Here is an upstream method which has very helpful logic. Ideally, we...

Use the issue because it already has some context. E.g., the upstream method that we can use in the work. In the recent LLVMGPUTileAndFuse pipeline ([pipeline_test](https://github.com/iree-org/iree/blob/main/compiler/src/iree/compiler/Codegen/LLVMGPU/test/ROCDL/pipeline_tile_and_fuse.mlir)), we already generate some...

Interesting result... We don't see regressions on other backends because we don't track them in our CI. Perhaps we should check if it regresses sdxl or not. Is @qedawkins the...

Thanks! FYI that this is generated with https://github.com/iree-org/iree/pull/17234 + https://github.com/iree-org/iree/pull/17264 We need to use `llvm::divideCeil` to compute the number of tiles. It is only done in GPUHeuristics.cpp, but not other...
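
To illustrate the `llvm::divideCeil` point above, here is a minimal sketch of why the tile count needs ceiling division rather than plain integer division. It is not the code from GPUHeuristics.cpp: `divideCeil` below is a local stand-in written so the snippet builds without LLVM, and `numTiles` and the sizes 130 and 32 are hypothetical names and values chosen for illustration. It assumes a non-negative dimension size and a positive tile size.

```cpp
#include <cstdint>

// Local stand-in for llvm::divideCeil (assumption: numerator >= 0 and
// denominator > 0), so the sketch compiles without LLVM headers.
constexpr int64_t divideCeil(int64_t numerator, int64_t denominator) {
  return (numerator + denominator - 1) / denominator;
}

// Hypothetical helper: number of tiles needed to cover `dimSize` elements
// with tiles of `tileSize` elements each.
constexpr int64_t numTiles(int64_t dimSize, int64_t tileSize) {
  return divideCeil(dimSize, tileSize);
}

// Plain integer division truncates and drops the partial tile at the end.
static_assert(130 / 32 == 4, "truncating division misses the last tile");
// Ceiling division counts the trailing partial tile as well.
static_assert(numTiles(130, 32) == 5, "4 full tiles plus 1 partial tile");
// When the size divides evenly, both approaches agree.
static_assert(numTiles(128, 32) == 4, "no partial tile in this case");
```

With a dimension of 130 and a tile size of 32, truncating division reports 4 tiles and leaves the last 2 elements uncovered, while the ceiling version reports 5.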