[NPUW] add GroupQueryAttention OP in npuw llm_compiled_model

Open bopeng1234 opened this issue 10 months ago • 1 comments

Add GroupQueryAttention op decomposition logic to the NPUW llm_compiled_model, following the current handling of ScaledDotProductAttentionDecomposition: https://github.com/openvinotoolkit/openvino/blob/master/src/plugins/intel_npu/src/plugin/npuw/llm_compiled_model.cpp#L221
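
For readers unfamiliar with the op: in grouped-query attention, several query heads share one key/value head, so a decomposition generally has to map each query head to its K/V head before ordinary attention can run. The sketch below shows only that head mapping in plain C++. It is not the code from this change and uses no OpenVINO API. Both helper names are hypothetical, and the contiguous grouping (query heads 0..g-1 share K/V head 0, and so on) is an assumption about the layout.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an OpenVINO API): index of the K/V head that a
// given query head reads from, assuming contiguous groups of equal size.
std::size_t kv_head_for_query_head(std::size_t q_head,
                                   std::size_t num_q_heads,
                                   std::size_t num_kv_heads) {
    // Preconditions (checked only in debug builds).
    assert(num_kv_heads > 0 && num_q_heads % num_kv_heads == 0);
    assert(q_head < num_q_heads);
    const std::size_t group_size = num_q_heads / num_kv_heads;
    return q_head / group_size;
}

// Hypothetical helper: repeat each K/V head so the result holds one K/V head
// per query head. After this expansion, plain multi-head attention applies.
template <typename Head>
std::vector<Head> expand_kv_heads(const std::vector<Head>& kv_heads,
                                  std::size_t num_q_heads) {
    std::vector<Head> expanded;
    expanded.reserve(num_q_heads);
    for (std::size_t h = 0; h < num_q_heads; ++h) {
        expanded.push_back(
            kv_heads[kv_head_for_query_head(h, num_q_heads, kv_heads.size())]);
    }
    return expanded;
}
```

With 8 query heads and 2 K/V heads, query heads 0 to 3 read K/V head 0 and query heads 4 to 7 read K/V head 1.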

bopeng1234 avatar Jun 16 '25 09:06 bopeng1234

Hi @smirnov-alexey, could you help review this? Thanks!

bopeng1234 avatar Jun 16 '25 09:06 bopeng1234

@esmirno could you please review?

smirnov-alexey avatar Jun 20 '25 11:06 smirnov-alexey

build_jenkins

smirnov-alexey avatar Jul 01 '25 11:07 smirnov-alexey