Zhang Kehao
Also for line 75
In cache_utils.py, I noticed that `keys_to_keep = self.key_cache[layer_idx][:, :, -self.window_length + self.num_sink_tokens + key_states.shape[-2] :]` might go wrong when `-self.window_length + self.num_sink_tokens + key_states.shape[-2] >= 0`...
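To make the concern concrete, here is a minimal sketch that uses a plain Python list as a stand-in for the sequence dimension of the key cache. The helper `tail_start` and all the sizes are hypothetical and chosen only for illustration; nothing here is taken from cache_utils.py beyond the slice expression itself. It assumes the slice is meant to keep only the most recent cached positions.

```python
# Hypothetical helper: reproduces the start index of the slice quoted above.
def tail_start(window_length, num_sink_tokens, num_new_tokens):
    return -window_length + num_sink_tokens + num_new_tokens

# Stand-in for the sequence dimension of self.key_cache[layer_idx] (10 cached positions).
cached_positions = list(range(10))

# One new token: the start index is -5, so the slice keeps the last 5 positions.
start_negative = tail_start(window_length=10, num_sink_tokens=4, num_new_tokens=1)
kept_negative = cached_positions[start_negative:]

# Six new tokens: the start index is 0, so the slice keeps the whole cache
# instead of an empty tail.
start_zero = tail_start(window_length=10, num_sink_tokens=4, num_new_tokens=6)
kept_zero = cached_positions[start_zero:]

# Eight new tokens: the start index is 2, which Python counts from the front,
# so the slice drops only the first 2 positions.
start_positive = tail_start(window_length=10, num_sink_tokens=4, num_new_tokens=8)
kept_positive = cached_positions[start_positive:]

print(start_negative, kept_negative)
print(start_zero, kept_zero)
print(start_positive, kept_positive)
```

The same indexing rule applies to tensor slices: once the start index is no longer negative it is measured from the beginning of the dimension, so the slice can keep far more entries than a "last N positions" slice would.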