
Implementation of Rotary Embeddings, from the RoFormer paper, in PyTorch

18 rotary-embedding-torch issues, sorted by recently updated

Hi, @lucidrains! There was promising research published this month (vs. RoPE-mixed (#25) in March): the so-called LieRE positional encodings generalize the kv-vector rotation to any number of dimensions...
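For context, the core idea as I read the LieRE paper is to replace RoPE's fixed 2×2 block rotations with a learned rotation obtained by exponentiating a position-weighted sum of skew-symmetric generators, one generator per spatial axis. A minimal sketch of that idea; the class name, shapes, and initialization below are all hypothetical, not from the paper's code:

```python
import torch
import torch.nn as nn

class LieRERotation(nn.Module):
    """Hypothetical minimal LieRE-style rotation: one learned skew-symmetric
    generator per spatial axis, combined via the matrix exponential."""

    def __init__(self, head_dim: int, num_axes: int):
        super().__init__()
        self.raw = nn.Parameter(torch.randn(num_axes, head_dim, head_dim) * 0.02)

    def forward(self, pos: torch.Tensor) -> torch.Tensor:
        # pos: (..., num_axes) continuous coordinates, for any number of spatial dims
        skew = self.raw - self.raw.transpose(-1, -2)        # skew-symmetric generators
        gen = torch.einsum('...a,aij->...ij', pos, skew)    # position-weighted sum
        return torch.linalg.matrix_exp(gen)                 # rotation matrices in SO(head_dim)

# queries/keys would then be rotated per position, e.g. q' = R @ q
```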

Hi @lucidrains, These folks talk about improving axial RoPE performance. Some of their comparisons to axial RoPE look nice, but others don't convince me. I wanted to get your thoughts on this...

My conclusions about changing the positional encoding are that NOPE and ALiBi do not work well for encoder-only models because, compared to decoder-only models, they do not understand position at all (they...

Hi @lucidrains, Thanks for creating this wonderful package as well as `x-transformers`. I wanted to understand why rotary embeddings seem to be slower for me than absolute positional embeddings. I'm...
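A minimal timing harness for reproducing this, using the library's `rotate_queries_or_keys` entry point (shapes are illustrative). Some overhead over absolute embeddings is expected: rotary recomputes a per-position rotation of the queries and keys on every forward pass, whereas a learned absolute embedding is a single table lookup plus an add.

```python
import time
import torch
from rotary_embedding_torch import RotaryEmbedding

rotary_emb = RotaryEmbedding(dim = 32)

# mock queries: (batch, heads, seq_len, head_dim)
q = torch.randn(1, 8, 1024, 64)

start = time.perf_counter()
for _ in range(100):
    q_rot = rotary_emb.rotate_queries_or_keys(q)
print(f"100 rotary applications: {time.perf_counter() - start:.3f}s")
```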

`torch.compile` doesn't play nicely with AMP autocasting, and occasionally there are issues when exporting to ONNX or other formats. Would explicit casting to float and back be preferable? This appears...
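One way the suggested explicit round-trip could look; this is a sketch, with `apply_rope` standing in for whatever rotation function is being traced:

```python
import torch

def rotate_in_fp32(apply_rope, t: torch.Tensor) -> torch.Tensor:
    # Disable autocast and do the rotation in float32, then cast back, so
    # torch.compile and ONNX export see explicit, deterministic dtype handling.
    orig_dtype = t.dtype
    with torch.autocast(device_type = t.device.type, enabled = False):
        out = apply_rope(t.float())
    return out.to(orig_dtype)
```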

Hi! I'm running an enc-dec transformer with RoPE in the first self-attention layer of the encoder and decoder. I'm noticing that in the eval stage of my model, it hangs...

Hi @lucidrains, We have trained a 3D ViT masked autoencoder using axial RoPE for an image size of 512x512x512 (3D scientific images, sampled from much larger volumes). Now I want...
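For reference, the axial setup being described presumably follows the library's n-dimensional pattern, with `get_axial_freqs` taking one size per spatial axis. A small sketch with placeholder sizes (not the full 512³ volume):

```python
import torch
from rotary_embedding_torch import RotaryEmbedding, apply_rotary_emb

pos_emb = RotaryEmbedding(dim = 16, freqs_for = 'pixel', max_freq = 256)

# mock queries for a small 3D patch grid: (batch, depth, height, width, head_dim)
# head_dim must be dim * num_axes = 16 * 3 = 48 here
q = torch.randn(1, 8, 8, 8, 48)

# one frequency axis per spatial dimension
freqs = pos_emb.get_axial_freqs(8, 8, 8)

q = apply_rotary_emb(freqs, q)
```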

@lucidrains, it would be really helpful to have an implementation of YaRN [(Peng _et al._)](https://openreview.net/forum?id=wHBfxhZu1u) in this repository as well.
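For anyone wanting to experiment before an implementation lands, a rough sketch of YaRN's NTK-by-parts frequency interpolation: dimensions whose wavelength exceeds the original context are interpolated by the scale factor, high-frequency dimensions are left untouched, and a ramp blends between them. The hyperparameters `alpha`, `beta` and the context/scale values below are illustrative, following the paper's LLaMA settings:

```python
import math
import torch

def yarn_inv_freqs(dim, base = 10000.0, scale = 8.0, orig_ctx = 2048,
                   alpha = 1.0, beta = 32.0):
    # standard RoPE inverse frequencies: theta_d = base^(-2d / dim)
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
    # how many full wavelengths of each dimension fit into the original context
    ratio = orig_ctx / (2 * math.pi / inv_freq)
    # ramp: 0 -> fully interpolate (long wavelengths), 1 -> keep original (short)
    gamma = ((ratio - alpha) / (beta - alpha)).clamp(0.0, 1.0)
    return inv_freq / scale * (1 - gamma) + inv_freq * gamma

# the paper additionally scales attention, e.g. by multiplying the cos/sin
# embeddings by 0.1 * math.log(scale) + 1.0
```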