rotary-embedding-torch
RoPE-Mixed: Improvement over Axial for n-D
Hi @lucidrains,
The authors of the paper linked below propose RoPE-Mixed as an improvement over axial RoPE for n-D inputs. Some of their comparisons against axial RoPE look good, but others don't convince me. I'd like to get your thoughts on this. If it makes sense, could we integrate it into the repo?
https://arxiv.org/abs/2403.13298