pytorch-llama
Error in rotary matrix multiplication formula of slide 25
First of all, thank you for the great resources and YouTube videos.
I wanted to point out that slide 25 of the Llama notes, on the computationally efficient realization of rotary matrix multiplication, has a typo in the third vector: its last two entries are written with the indices $x_{d-1}$ and $x_d$ swapped.
This is what is currently in the slide:
$$
R_{\theta, m}^d x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ \vdots \\ x_{d-1} \\ x_d \end{bmatrix} \otimes \begin{bmatrix} \cos(m\theta_1) \\ \cos(m\theta_1) \\ \cos(m\theta_2) \\ \cos(m\theta_2) \\ \vdots \\ \cos(m\theta_{d/2}) \\ \cos(m\theta_{d/2}) \end{bmatrix} + \begin{bmatrix} -x_2 \\ x_1 \\ -x_4 \\ x_3 \\ \vdots \\ \mathbf{-x_{d-1}} \\ \mathbf{x_{d}} \end{bmatrix} \otimes \begin{bmatrix} \sin(m\theta_1) \\ \sin(m\theta_1) \\ \sin(m\theta_2) \\ \sin(m\theta_2) \\ \vdots \\ \sin(m\theta_{d/2}) \\ \sin(m\theta_{d/2}) \end{bmatrix}
$$
And this is how it should be:
$$
R_{\theta, m}^d x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ \vdots \\ x_{d-1} \\ x_d \end{bmatrix} \otimes \begin{bmatrix} \cos(m\theta_1) \\ \cos(m\theta_1) \\ \cos(m\theta_2) \\ \cos(m\theta_2) \\ \vdots \\ \cos(m\theta_{d/2}) \\ \cos(m\theta_{d/2}) \end{bmatrix} + \begin{bmatrix} -x_2 \\ x_1 \\ -x_4 \\ x_3 \\ \vdots \\ \mathbf{-x_{d}} \\ \mathbf{x_{d-1}} \end{bmatrix} \otimes \begin{bmatrix} \sin(m\theta_1) \\ \sin(m\theta_1) \\ \sin(m\theta_2) \\ \sin(m\theta_2) \\ \vdots \\ \sin(m\theta_{d/2}) \\ \sin(m\theta_{d/2}) \end{bmatrix}
$$
I imagine the error was already in the original paper and the screenshot comes from there. I have checked the RoFormer paper (page 7, equation 34), and the authors have since fixed it.
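For anyone who wants to check this numerically, here is a minimal sketch in plain Python (not code from the repository). It compares both element-wise forms against a direct application of the 2x2 rotation blocks. The function names are hypothetical, and the frequencies are assumed to follow the RoFormer convention $\theta_i = 10000^{-2(i-1)/d}$.

```python
import math

def apply_rotation_blocks(x, m, thetas):
    """Reference: rotate each consecutive pair of x by the angle m * theta_i."""
    out = []
    for i, theta in enumerate(thetas):
        a, b = x[2 * i], x[2 * i + 1]
        c, s = math.cos(m * theta), math.sin(m * theta)
        out += [a * c - b * s, a * s + b * c]
    return out

def third_vector_corrected(x):
    """[-x_2, x_1, -x_4, x_3, ..., -x_d, x_{d-1}], the corrected form."""
    out = []
    for i in range(0, len(x), 2):
        out += [-x[i + 1], x[i]]
    return out

def third_vector_slide(x):
    """Same as the corrected vector, but ending in [-x_{d-1}, x_d] as on the slide."""
    out = third_vector_corrected(x)
    out[-2:] = [-x[-2], x[-1]]
    return out

def apply_elementwise(x, m, thetas, third):
    """x * cos + third * sin, with each angle repeated for both entries of a pair."""
    cos = [math.cos(m * t) for t in thetas for _ in range(2)]
    sin = [math.sin(m * t) for t in thetas for _ in range(2)]
    return [xi * c + ti * s for xi, c, ti, s in zip(x, cos, third, sin)]

def all_close(u, v):
    return all(math.isclose(a, b, abs_tol=1e-12) for a, b in zip(u, v))

d, m = 4, 3  # small illustrative dimension and position
x = [1.0, 2.0, 3.0, 4.0]
thetas = [10000 ** (-2 * i / d) for i in range(d // 2)]  # assumed RoFormer frequencies

reference = apply_rotation_blocks(x, m, thetas)
corrected = apply_elementwise(x, m, thetas, third_vector_corrected(x))
slide = apply_elementwise(x, m, thetas, third_vector_slide(x))

print("corrected matches reference:", all_close(reference, corrected))
print("slide matches reference:", all_close(reference, slide))
```

With these inputs, only the corrected form should agree with the block rotation; the slide's version differs in the last two entries.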