Nirvedh Meshram
Surprisingly, the error came down to this review comment in the original [PR](https://github.com/iree-org/iree/pull/22523/changes#r2494669095). I am very surprised by this, as it concerns the `overflow` flag on `arith.addi`. When I...
@zjgarvey might be, but this issue is consistent and not intermittent; it's happening in a very specific case of strided convs. @bjacob Thanks, the issue is triaged to optimization after...
Here is what is causing this to fail to bufferize. After `GPUFuseAndHoistParallelLoopsPass` we have the following access:

```
%read_write_input = flow.dispatch.tensor.load ... -> tensor
%workgroup_scf_forall = scf.forall ... shared_outs(%arg2 =...
```
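For readers following along, here is a hypothetical minimal sketch of this shape of IR (all names, shapes, offsets, and sizes are invented, not taken from the failing dispatch): a tensor is loaded from a `readwrite` dispatch binding and then used as the `shared_outs` init of a workgroup-level `scf.forall`, so bufferization has to prove the parallel writes can happen in place over the loaded buffer.

```mlir
// Hypothetical sketch only; shapes, offsets, and binding names are invented.
// Load from a read-write binding...
%input = flow.dispatch.tensor.load %rw_binding, offsets = [0], sizes = [128], strides = [1]
    : !flow.dispatch.tensor<readwrite:tensor<128xf32>> -> tensor<128xf32>
// ...then accumulate into it through the shared_outs of a workgroup forall.
%result = scf.forall (%iv) = (0) to (128) step (32)
    shared_outs(%acc = %input) -> (tensor<128xf32>) {
  %slice = tensor.extract_slice %acc[%iv] [32] [1] : tensor<128xf32> to tensor<32xf32>
  // ... per-workgroup compute on %slice ...
  scf.forall.in_parallel {
    tensor.parallel_insert_slice %slice into %acc[%iv] [32] [1]
        : tensor<32xf32> into tensor<128xf32>
  }
}
flow.dispatch.tensor.store %result, %rw_binding, offsets = [0], sizes = [128], strides = [1]
    : tensor<128xf32> -> !flow.dispatch.tensor<readwrite:tensor<128xf32>>
```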
@MaheshRavishankar WDYT of the two suggestions from @hanhanW above? Based on our previous conversations, we want to support accumulating GEMMs without transforming them into a non-accumulating GEMM + elementwise add. That...
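To make the tradeoff concrete, here is a hedged illustration (all sizes and value names invented) of an accumulating GEMM versus the rewrite we want to avoid, i.e. a zero-initialized GEMM followed by an elementwise add:

```mlir
// Accumulating GEMM: the outs operand %C carries live data, so C += A * B.
%acc = linalg.matmul ins(%A, %B : tensor<64x128xf32>, tensor<128x64xf32>)
                     outs(%C : tensor<64x64xf32>) -> tensor<64x64xf32>

// The rewrite we want to avoid: zero-init GEMM + elementwise add.
%cst0 = arith.constant 0.0 : f32
%empty = tensor.empty() : tensor<64x64xf32>
%zero = linalg.fill ins(%cst0 : f32) outs(%empty : tensor<64x64xf32>) -> tensor<64x64xf32>
%mm = linalg.matmul ins(%A, %B : tensor<64x128xf32>, tensor<128x64xf32>)
                    outs(%zero : tensor<64x64xf32>) -> tensor<64x64xf32>
%sum = linalg.add ins(%mm, %C : tensor<64x64xf32>, tensor<64x64xf32>)
                  outs(%empty : tensor<64x64xf32>) -> tensor<64x64xf32>
```

The first form can update `%C` in place after bufferization; the second materializes an extra intermediate, which is the cost being discussed.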
@yzhang93 @Max191 I wanted to share this issue with you in case you already have plans to make this better. I think we will need to resolve this one to get...
@yzhang93 we would want it to look something like this:

```
scf.for %arg7 = %c0 to %c2048 step %c128 {
  %23 = affine.apply affine_map (d0 + d1 + d2 *...
```
Sharing the smallest repro with which I am able to capture this issue:

```
func.func @run_forward$async_dispatch_27_attention_2x1178x24x64xf16_generic(%12 : tensor, %13 : tensor, %14 : tensor) -> tensor {
  %cst = arith.constant...
```
I think the IR is too dated at this point and there is no `iree_linalg_ext.attention` anymore; we would need to revisit this with fresh IR. @monorimet let me know if...
Closing, as the issue (and the op which had the issue) don't seem to be there anymore.
@krzysz00 we are still relying on the flag: https://github.com/iree-org/iree/blob/main/compiler/plugins/target/ROCM/ROCMTarget.cpp#L672-L678 We would have to drop the flag and check whether the issue is still reproducible.