onnxruntime
[JS/WebGPU] GroupQueryAttention rewrite
Description
Implement JSEP GroupQueryAttention
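For readers unfamiliar with the operator, here is a minimal sketch of the head-sharing idea behind grouped-query attention: several query heads read from the same key/value head. This is an illustration only, not the code in this PR. The helper name `kvHeadForQueryHead` is hypothetical, and the contiguous grouping (consecutive query heads share one key/value head) is assumed to be the usual convention.

```typescript
// Hypothetical helper, not taken from the PR: returns the key/value head
// that a given query head reads from under grouped-query attention.
// Assumes consecutive query heads are grouped onto one key/value head.
function kvHeadForQueryHead(qHead: number, numHeads: number, kvNumHeads: number): number {
  if (numHeads % kvNumHeads !== 0) {
    throw new Error("numHeads must be a multiple of kvNumHeads");
  }
  if (qHead < 0 || qHead >= numHeads) {
    throw new Error("qHead out of range");
  }
  // Number of query heads that share each key/value head.
  const groupSize = numHeads / kvNumHeads;
  return Math.floor(qHead / groupSize);
}

// Example: 8 query heads sharing 2 key/value heads.
const mapping = Array.from({ length: 8 }, (_, h) => kvHeadForQueryHead(h, 8, 2));
console.log(mapping); // [0, 0, 0, 0, 1, 1, 1, 1]
```

With `kvNumHeads` equal to `numHeads` this reduces to ordinary multi-head attention, and with `kvNumHeads` equal to 1 it reduces to multi-query attention.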
Motivation and Context
Required to enable certain LLMs to run on WebGPU.