gpt-fast icon indicating copy to clipboard operation
gpt-fast copied to clipboard

Naming: n_local_heads -> n_kv_heads

Open ad8e opened this issue 1 year ago • 0 comments

n_local_heads refers to TP sharding, rather than GQA.

ad8e avatar Apr 23 '24 18:04 ad8e