Eayne
When using a continuous KV cache, gpt_attention uses only the first past_key_value instead of past_key_value[selected_indexed]. This produces incorrect results whenever the continuous KV cache holds non-zero values.
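To make the failure mode concrete, here is a toy sketch in plain Python. It is not the real gpt_attention code: `toy_attention_sum`, `selected_index`, and the list-based caches are hypothetical stand-ins, and the "attention" is just a sum. It only shows why reading the first cache entry goes unnoticed with zero-filled caches but gives wrong results otherwise.

```python
# Toy illustration only: plain lists stand in for KV cache tensors.
# `toy_attention_sum` and `selected_index` are hypothetical names,
# not part of any real library API.

def toy_attention_sum(past_key_value, selected_index, new_value, use_selected=True):
    """Combine one cache entry with a new value.

    use_selected=False mimics the reported behavior: entry 0 is always read,
    regardless of which entry was selected.
    """
    cache = past_key_value[selected_index] if use_selected else past_key_value[0]
    return sum(cache) + new_value


zero_caches = [[0.0, 0.0], [0.0, 0.0]]
filled_caches = [[1.0, 2.0], [3.0, 4.0]]

# With all-zero caches both paths agree, so the problem stays hidden.
same = (
    toy_attention_sum(zero_caches, 1, 0.5, use_selected=False)
    == toy_attention_sum(zero_caches, 1, 0.5)
)

# With non-zero caches the two paths diverge.
buggy = toy_attention_sum(filled_caches, 1, 0.5, use_selected=False)
expected = toy_attention_sum(filled_caches, 1, 0.5)
```

In this sketch the zero-filled case gives identical values on both paths, while the filled case differs because entry 0 and entry 1 hold different data.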
Community want to contribute