
Why is compression applied only at decode?

Open · CSEEduanyu opened this issue 1 year ago • 1 comment

@leeyeehoo @ctlllll @WendyH1108

CSEEduanyu · May 22 '24 03:05

I tried computing the first generated token from only the pruned KV entries, and the performance was extremely poor. I believe that's why SnapKV runs the prefill attention over the full KV cache and uses the compressed cache only for the decode steps that follow.
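To make the ordering concrete, here is a minimal single-head sketch in plain Python. It is not the repository's code: `select_kv_after_prefill`, `window`, and `budget` are hypothetical names, and as far as I understand the real method also pools the scores and works per head, which is left out here. The assumption is that prefill has already run with full causal attention, and only afterwards are the KV entries to keep for decoding chosen, based on how much attention the last `window` queries paid to earlier positions.

```python
import math


def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    total = sum(es)
    return [e / total for e in es]


def attention_weights(q, keys):
    # Scaled dot-product attention weights of one query over a list of keys.
    scale = 1.0 / math.sqrt(len(q))
    return softmax([scale * sum(a * b for a, b in zip(q, k)) for k in keys])


def select_kv_after_prefill(queries, keys, window, budget):
    """Return sorted indices of the KV entries to keep for decoding.

    Hypothetical helper. Prefill is assumed to have used the full KV cache;
    this runs once afterwards. The last `window` positions are always kept,
    and the remaining budget goes to the earlier positions that received the
    most attention from the queries inside that window.
    """
    if not 0 < window <= budget:
        raise ValueError("need 0 < window <= budget")
    n = len(keys)
    if n <= budget:
        return list(range(n))  # nothing to prune
    prefix = n - window
    votes = [0.0] * prefix
    for t in range(prefix, n):
        w = attention_weights(queries[t], keys[: t + 1])  # causal: keys 0..t
        for i in range(prefix):
            votes[i] += w[i]
    top = sorted(range(prefix), key=lambda i: votes[i], reverse=True)
    return sorted(top[: budget - window]) + list(range(prefix, n))
```

In this sketch the first output token still comes from the full-attention prefill pass; only the later decode steps would read the entries at the returned indices.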

xinhaoH · Jul 29 '25 11:07