pytorch-llama

Error in rotary matrix multiplication formula of slide 25

Open sanzgadea opened this issue 1 year ago • 0 comments

First of all, thank you for the great resources and YouTube videos.

I wanted to point out a typo on slide 25 of the Llama notes, which covers the computationally efficient realization of rotary matrix multiplication. In the third vector, the last two entries are swapped: the slide shows $-x_{d-1}$ followed by $x_d$, whereas it should be $-x_d$ followed by $x_{d-1}$.

This is what is currently in the slide:

$$
R_{\theta, m}^d x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ \vdots \\ x_{d-1} \\ x_d \end{bmatrix} \otimes \begin{bmatrix} \cos (m\theta_1) \\ \cos (m\theta_1) \\ \cos (m\theta_2) \\ \cos (m\theta_2) \\ \vdots \\ \cos (m\theta_{d/2}) \\ \cos (m\theta_{d/2}) \end{bmatrix} + \begin{bmatrix} -x_2 \\ x_1 \\ -x_4 \\ x_3 \\ \vdots \\ \mathbf{-x_{d-1}} \\ \mathbf{x_{d}} \end{bmatrix} \otimes \begin{bmatrix} \sin (m\theta_1) \\ \sin (m\theta_1) \\ \sin (m\theta_2) \\ \sin (m\theta_2) \\ \vdots \\ \sin (m\theta_{d/2}) \\ \sin (m\theta_{d/2}) \end{bmatrix}
$$

And this is how it should be:

$$
R_{\theta, m}^d x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ \vdots \\ x_{d-1} \\ x_d \end{bmatrix} \otimes \begin{bmatrix} \cos (m\theta_1) \\ \cos (m\theta_1) \\ \cos (m\theta_2) \\ \cos (m\theta_2) \\ \vdots \\ \cos (m\theta_{d/2}) \\ \cos (m\theta_{d/2}) \end{bmatrix} + \begin{bmatrix} -x_2 \\ x_1 \\ -x_4 \\ x_3 \\ \vdots \\ \mathbf{-x_{d}} \\ \mathbf{x_{d-1}} \end{bmatrix} \otimes \begin{bmatrix} \sin (m\theta_1) \\ \sin (m\theta_1) \\ \sin (m\theta_2) \\ \sin (m\theta_2) \\ \vdots \\ \sin (m\theta_{d/2}) \\ \sin (m\theta_{d/2}) \end{bmatrix}
$$
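For anyone who wants to check this numerically, here is a minimal sketch in plain Python. It assumes the usual RoFormer conventions, which are not spelled out in this issue: each pair $(x_{2i-1}, x_{2i})$ is rotated by the 2×2 block $\begin{bmatrix} \cos m\theta_i & -\sin m\theta_i \\ \sin m\theta_i & \cos m\theta_i \end{bmatrix}$, and $\theta_i = 10000^{-2(i-1)/d}$. The function names and the test vector are made up for illustration. It compares the element-wise formula against the explicit pairwise rotation, once with the corrected third vector and once with the one shown on the slide.

```python
import math


def rotary_matrix_apply(x, m, thetas):
    """Reference: rotate each consecutive pair (x[2i], x[2i+1]) by m * theta_i."""
    out = []
    for i, theta in enumerate(thetas):
        a, b = x[2 * i], x[2 * i + 1]
        c, s = math.cos(m * theta), math.sin(m * theta)
        out += [a * c - b * s, a * s + b * c]
    return out


def rotary_elementwise(x, m, thetas, third_vector):
    """Element-wise form: x * cos + third_vector * sin, each angle repeated twice."""
    cos = [math.cos(m * t) for t in thetas for _ in range(2)]
    sin = [math.sin(m * t) for t in thetas for _ in range(2)]
    return [xi * c + vi * s for xi, c, vi, s in zip(x, cos, third_vector, sin)]


def third_vector_corrected(x):
    """(-x_2, x_1, -x_4, x_3, ..., -x_d, x_{d-1})"""
    out = []
    for i in range(0, len(x), 2):
        out += [-x[i + 1], x[i]]
    return out


def third_vector_slide(x):
    """Same as above, but with the last two entries as on the slide: (-x_{d-1}, x_d)."""
    out = third_vector_corrected(x)
    out[-2:] = [-x[-2], x[-1]]
    return out


def close(u, v, tol=1e-12):
    return len(u) == len(v) and all(abs(a - b) <= tol for a, b in zip(u, v))


# Illustrative values only (assumptions, not taken from the slides).
d = 6
m = 3
x = [0.5, -1.0, 2.0, 3.0, -0.25, 1.5]
thetas = [10000 ** (-2 * i / d) for i in range(d // 2)]

reference = rotary_matrix_apply(x, m, thetas)
corrected = rotary_elementwise(x, m, thetas, third_vector_corrected(x))
slide = rotary_elementwise(x, m, thetas, third_vector_slide(x))

print(close(reference, corrected))  # True
print(close(reference, slide))      # False
```

Under those assumptions, only the version with $-x_d$, $x_{d-1}$ in the last two slots reproduces the rotation; the slide's version differs in the final pair.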

I imagine the error was already in the original paper and the screenshot on the slide was taken from it. I have checked the RoFormer paper (page 7, equation 34), and the authors have since fixed the error there.

sanzgadea, Apr 23 '24 09:04