gpt-fast

Update sdpa function with enable_gqa=True

Open. jainapurva opened this issue 1 year ago • 1 comment

For the Llama model, pass enable_gqa=True in the call to the sdpa function (scaled_dot_product_attention) so that it uses PyTorch's built-in grouped-query attention support.

jainapurva, Jul 13 '24 03:07