
Fix bug in masking when kv cache is used.

Open martinzwm opened this issue 4 months ago • 3 comments

Thank you for creating this project, I learned a lot from it!

There seems to be a small bug during masking when kv cache is enabled:

  • Without the KV cache, mask_bool = self.mask.bool()[:num_tokens, :num_tokens] yields the intended result.
  • With the KV cache, num_tokens is set to 1, so mask_bool becomes a tensor of shape (1, 1). However, we want mask_bool to be a tensor of shape (1, num_tokens_K), i.e., one row for the new query and one column per key.
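
To make the shape mismatch concrete, here is a minimal standalone sketch. It is not the project's code: `context_length`, `num_cached`, and the final slicing expression are assumptions chosen for illustration, and `mask` is assumed to be an upper-triangular causal mask where True marks positions to hide.

```python
import torch

context_length = 6  # hypothetical maximum sequence length
# Assumed causal mask: entries above the diagonal are 1 (positions to hide).
mask = torch.triu(torch.ones(context_length, context_length), diagonal=1)

# Without KV cache: all 4 tokens are processed at once.
num_tokens = 4
full = mask.bool()[:num_tokens, :num_tokens]      # shape (4, 4)

# With KV cache: 3 keys are already cached, 1 new token arrives.
num_cached = 3                                    # hypothetical cache length
num_tokens = 1
num_tokens_K = num_cached + num_tokens            # total number of keys

# The slicing from the issue now yields a (1, 1) mask.
buggy = mask.bool()[:num_tokens, :num_tokens]

# One possible slicing (an assumption, not necessarily the proposed fix):
# rows for the new query positions, columns for every key.
wanted = mask.bool()[num_cached:num_tokens_K, :num_tokens_K]

print(tuple(buggy.shape), tuple(wanted.shape))
```

In this sketch, `wanted` matches the last row of the uncached mask, which is the row the new query would have had if the whole sequence were processed without a cache.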

The following changes address this bug.

martinzwm, Jun 22 '25 21:06