annotated_deep_learning_paper_implementations Fix RoPE inner product equation & add note on the difference in implementation


Open thanhtcptit opened this issue 7 months ago • 0 comments

Hi, thank you for your work. I noticed an error in the RoPE inner product equation. In addition, this implementation pairs features differently from the original paper when rotating the feature subspaces, which I think is worth noting to avoid confusion. Ref: https://github.com/pytorch/torchtune/blob/main/torchtune/modules/position_embeddings.py#L117
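To make the pairing difference concrete, here is a minimal pure-Python sketch. I am assuming the two conventions in question are adjacent pairs `(x[2i], x[2i+1])` versus split halves `(x[i], x[i + d/2])`. The function names and the sample vectors are mine for illustration and do not come from either code base.

```python
import math


def rope_angles(pos, d, base=10000.0):
    # One rotation angle per 2-D subspace: pos * base^(-2i/d), i = 0 .. d/2 - 1
    return [pos * base ** (-2.0 * i / d) for i in range(d // 2)]


def rope_interleaved(x, pos, base=10000.0):
    # Assumed "adjacent pairs" convention: rotate (x[2i], x[2i+1]) together.
    d = len(x)
    out = [0.0] * d
    for i, angle in enumerate(rope_angles(pos, d, base)):
        c, s = math.cos(angle), math.sin(angle)
        x1, x2 = x[2 * i], x[2 * i + 1]
        out[2 * i] = x1 * c - x2 * s
        out[2 * i + 1] = x1 * s + x2 * c
    return out


def rope_half_split(x, pos, base=10000.0):
    # Assumed "split halves" convention: rotate (x[i], x[i + d/2]) together.
    d = len(x)
    h = d // 2
    out = [0.0] * d
    for i, angle in enumerate(rope_angles(pos, d, base)):
        c, s = math.cos(angle), math.sin(angle)
        x1, x2 = x[i], x[i + h]
        out[i] = x1 * c - x2 * s
        out[i + h] = x1 * s + x2 * c
    return out


def evens_then_odds(x):
    # Fixed permutation that maps the adjacent-pair layout to the split-half layout.
    return x[0::2] + x[1::2]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Hypothetical query/key vectors with d = 4
q = [0.1, 0.2, 0.3, 0.4]
k = [0.5, -0.6, 0.7, 0.8]

# In both conventions the inner product depends only on the offset m - n (here 2).
same_offset_interleaved = (
    dot(rope_interleaved(q, 3), rope_interleaved(k, 1)),
    dot(rope_interleaved(q, 7), rope_interleaved(k, 5)),
)
same_offset_half_split = (
    dot(rope_half_split(q, 3), rope_half_split(k, 1)),
    dot(rope_half_split(q, 7), rope_half_split(k, 5)),
)
```

Under these assumptions both variants keep the relative-position property, but they give different outputs for the same input vector. They agree only after permuting the feature dimensions with `evens_then_odds`, so weights or cached embeddings built for one convention cannot be used with the other without that permutation.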

Cheers,

Jul 18 '24 05:07 thanhtcptit
