
[LLVMGPU] VectorDistribution pipeline for attention

Open Groverkss opened this issue 1 year ago • 2 comments

This patch adds support for lowering attention through the VectorDistribution pipeline. It currently has the following limitations, which will be fixed in follow-ups:

  • Only 1 subgroup is used
  • There are 4 shared memory promotions. Based on the intrinsic, this can be reduced to 2 or 3.

Depends on https://github.com/iree-org/iree/pull/17744

Groverkss avatar Jun 21 '24 16:06 Groverkss

Please only review the last commit. Other commits are the patches this patch depends on.

Groverkss avatar Jun 21 '24 16:06 Groverkss

I don't have any comments on this. From skimming through, this looks OK to me to land and iterate.

MaheshRavishankar avatar Jun 27 '24 22:06 MaheshRavishankar

Closing in favour of https://github.com/iree-org/iree/pull/17773

Groverkss avatar Jul 22 '24 07:07 Groverkss