Kunwar Grover comments

Results: 36 comments of Kunwar Grover

[LinalgExt] Remove attention tile and decompose

> Good to have this cleanup, but IIRC @harsh-nod mentioned there are cases where we found regular, non-FA faster, so tileAndDecomposeAttention may still be useful there? In those cases,...

[LinalgExt] Remove attention tile and decompose

Already landed as part of https://github.com/iree-org/iree/commit/dd3f2a392819d121fa5329a1c591be06ae9e887a

[ROCM][Tracker] Wan2.1 Autoencoder3d performance - MI300x

I looked at the attention IR; it's going down the memory-bound attention pipeline. The reason is that our attention/mma heuristics are not the best at checking if the copy from...

[GPU] Support multiple contraction dims in MmaSchedules

> > While VectorDistribute doesn't support multiple dimensions for subgroup dims, can we try to keep the configuration logic similar to TileAndFuse? We plan to soon support that, and it...

[Codegen] TileAndDistributeToWorkgroups for operations with multiple results and related producers

Just to note, there is another route that we can take. The reason there are issues is that we are doing fusion greedily. Instead, we could do any analysis to...

[LinalgExt][Fusion] Fusion of attention + reshape on reduction dim causes lowering error

This is an easy fix: you can just assign batch dimensions to that additional dimension. I can fix it, but I'm happy to let someone else try fixing it.

[LinalgExt][Fusion] Fusion of attention + reshape on reduction dim causes lowering error

Stanley and I talked offline, and this should be a simple codegen change. We can just assign batch dimensions for the layout here: https://github.com/iree-org/iree/blob/main/compiler/src/iree/compiler/Codegen/Dialect/GPU/IR/IREEGPUAttrs.cpp#L1204

[LinalgExt][Fusion] Fusion of attention + reshape on reduction dim causes lowering error

Fixed by https://github.com/iree-org/iree/pull/18868

[LLVMCPU] Add fold unit extent to CPU codegen pipeline

Just to signal, we had some problems with this pass in the GPU pipeline because it drops lowering_config from linalg operations. Maybe it doesn't apply here, but still would like to...

Enable tensor ukernels by default

@jtuyls I added a ci-extra trailer so the torch CI will run on this again, and retriggered the CI. I'd recommend disabling the ukernel flags (if enabled) in the CI...