SnapKV

Confusion in Selecting Indices

Open Lueci4er opened this issue 8 months ago • 3 comments

Thank you for your excellent work. I have encountered some confusion while trying to reproduce the results:

In snapkv_utils.py (snapkv_utils.py#L62-L65), the following snippet selects the indices and gathers the corresponding keys and values:

            indices = attn_cache.topk(self.max_capacity_prompt - self.window_size, dim=-1).indices
            indices = indices.unsqueeze(-1).expand(-1, -1, -1, head_dim)
            k_past_compress = key_states[:, :, :-self.window_size, :].gather(dim = 2, index = indices)
            v_past_compress = value_states[:, :, :-self.window_size, :].gather(dim = 2, index = indices)

The indices here come from topk over the scores in attn_cache, so they are ordered by descending score rather than by their original position in the sequence.
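To make the ordering concrete, here is a minimal sketch with toy values. The tensor names and shapes (`scores`, `keys`, `k`) are made up for illustration and are not taken from snapkv_utils.py. It shows the order in which `topk` returns indices, what `gather` then produces, and how the indices could be sorted if positional order were wanted.

```python
import torch

# Hypothetical toy scores for 8 prefix positions: (batch=1, heads=1, seq=8).
scores = torch.tensor([[[0.1, 0.6, 0.3, 0.9, 0.2, 0.7, 0.05, 0.8]]])
k = 4
head_dim = 2

# Toy keys: every row stores its own position, so the order is easy to read.
keys = torch.arange(8.0).view(1, 1, 8, 1).repeat(1, 1, 1, head_dim)

# topk returns indices ordered by descending score, not by position.
top_idx = scores.topk(k, dim=-1).indices
print(top_idx.tolist())  # [[[3, 7, 5, 1]]]

# gather follows the index order, so the compressed keys are score-ordered.
gathered = keys.gather(dim=2, index=top_idx.unsqueeze(-1).expand(-1, -1, -1, head_dim))
print(gathered[0, 0, :, 0].tolist())  # [3.0, 7.0, 5.0, 1.0]

# Sorting the indices keeps the same set but restores positional order.
sorted_idx = top_idx.sort(dim=-1).values
print(sorted_idx.tolist())  # [[[1, 3, 5, 7]]]
```

Sorting changes only the order of the selected slots, not which positions are kept.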

My concern is whether this approach could lead to incorrect ordering in the KV cache when performing the gather operation for k_past_compress and v_past_compress. Specifically, could this potentially disrupt the alignment between keys and values during subsequent inference steps?

I would appreciate any insights or clarifications on this matter. Thank you!

Lueci4er, Apr 25 '25 05:04

Hi, I had the same confusion about this part while writing code today. May I ask if you’ve resolved your question?

wrnwwrnw, Jun 19 '25 06:06

> Hi, I had the same confusion about this part while writing code today. May I ask if you’ve resolved your question?

I guess it probably won't affect performance much, since the selection is applied only once, during the prefilling phase. During the generation phase, it should not interfere with the current decoding step, thanks to the causal mask in autoregressive models.

However, if the selection were applied multiple times during generation, the reordering might influence the effectiveness of the pooling mechanism in SnapKV.
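The key/value alignment question can also be checked numerically. The sketch below is not from the SnapKV code: all tensors are random toy data, `attend` is a hypothetical helper, and it assumes the cached keys already carry their positional encoding (for example, rotary embeddings applied before caching). Under that assumption, reordering cache slots with the same indices for keys and values leaves the attention output for a new query unchanged, while reordering only the keys does not.

```python
import torch

torch.manual_seed(0)

bsz, heads, seq, head_dim = 1, 2, 6, 4
# Toy cache; keys are assumed to already include positional information.
keys = torch.randn(bsz, heads, seq, head_dim)
values = torch.randn(bsz, heads, seq, head_dim)
query = torch.randn(bsz, heads, 1, head_dim)  # a single decoding-step query


def attend(q, k, v):
    """Plain scaled dot-product attention for one query (no mask needed)."""
    weights = torch.softmax(q @ k.transpose(-1, -2) / head_dim ** 0.5, dim=-1)
    return weights @ v


# A fixed, arbitrary reordering of the cache slots, one per head.
perm = torch.tensor([[[3, 0, 5, 1, 4, 2], [5, 4, 3, 2, 1, 0]]])
idx = perm.unsqueeze(-1).expand(-1, -1, -1, head_dim)

# Same indices for keys and values, as in the quoted gather calls.
k_shuffled = keys.gather(dim=2, index=idx)
v_shuffled = values.gather(dim=2, index=idx)

out_ref = attend(query, keys, values)
out_same_perm = attend(query, k_shuffled, v_shuffled)
out_keys_only = attend(query, k_shuffled, values)  # pairing deliberately broken

same_perm_matches = torch.allclose(out_ref, out_same_perm, atol=1e-6)
keys_only_matches = torch.allclose(out_ref, out_keys_only, atol=1e-6)
print(same_perm_matches, keys_only_matches)
```

This covers only the single-query decoding case. It says nothing about code that relies on cache slot order for other purposes.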

Lueci4er, Jun 19 '25 06:06

Thank you for your explanation

wrnwwrnw, Jun 19 '25 06:06