Kunwar Grover
This patch adds support for lowering attention through the VectorDistribution pipeline. Currently, it has the following limitations, which will be fixed as followups:
- Only 1 subgroup is used
- There...
Depends on: https://github.com/iree-org/iree/pull/17536
This pass is similar to loop-invariant code motion, but looks for loop-invariant subsets instead. This pass is required for post-vectorization cleanups.
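To illustrate the idea, here is a hypothetical Python analogy (not the actual MLIR transformation): a loop that reads and writes the same subset of a buffer on every iteration can have that subset read hoisted before the loop and the write sunk after it.

```python
def before(buffer, n):
    # The subset buffer[0:4] is read and written on every iteration,
    # even though the accessed subset itself never changes.
    for _ in range(n):
        tile = buffer[0:4]                 # loop-invariant subset read
        tile = [x + 1 for x in tile]
        buffer[0:4] = tile                 # loop-invariant subset write
    return buffer

def after(buffer, n):
    # What subset hoisting produces: read the subset once before the
    # loop, iterate on the temporary, and write it back once after.
    tile = buffer[0:4]
    for _ in range(n):
        tile = [x + 1 for x in tile]
    buffer[0:4] = tile
    return buffer
```

Both forms compute the same result; the hoisted form avoids the redundant per-iteration subset traffic, which is the cleanup this pass targets after vectorization.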
Depends on https://github.com/iree-org/iree/pull/17626
This patch adds support for distributing attention to multiple subgroups. Some points to note:
- Due to some issues with layout analysis, we cannot yet do multiple n subgroups. This...
Clipping or clamping is defined as:
```
clip(x, min_value, max_value) = min(max(x, min_value), max_value)
```
Some backends can generate better instructions if it's known we are clamping a value. For...
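The definition above can be sketched directly in Python; the function name `clamp` is just an illustration, not an API from the patch:

```python
def clamp(x, min_value, max_value):
    # clip(x, min_value, max_value) = min(max(x, min_value), max_value):
    # first raise x to at least min_value, then cap it at max_value.
    return min(max(x, min_value), max_value)

print(clamp(5, 0, 10))   # in range, unchanged: 5
print(clamp(-3, 0, 10))  # below range, raised to the floor: 0
print(clamp(42, 0, 10))  # above range, capped at the ceiling: 10
```

Recognizing this min-of-max pattern lets a backend emit a single clamp instruction instead of separate min and max operations.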
### Request description

# Motivation

A pattern we notice in flash attention kernels is:
```
A: tensor
B: tensor
C: tensor
D : tensor = matmul(A, B, C)
E :...
```
This PR teaches attention decomposition to set attributes for attention matmuls by passing attribute dictionaries to the iree_linalg_ext.online_attention operation. This allows us to further control codegen of matmuls (generally the root...
Since https://github.com/iree-org/iree/pull/18748, tensor.pad can be fused in with tiling. This patch combines the parallel and reduction padding passes into a single pass that pads at once, and the pads are...