mistral-inference
Q: Why are rotary embeddings applied only to queries and keys?
In the codebase, rotary embeddings are applied only to queries and keys, not to values. Can someone point me to the reasons or papers behind this design? Thank you in advance!
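To make the question concrete, here is a minimal pure-Python sketch of the behaviour I am asking about. It is not the mistral-inference code: `rotate`, `score`, the `theta` default, and the toy vectors are my own hypothetical names and values. It rotates consecutive pairs of a query and a key by position-dependent angles and checks that the resulting dot product depends only on the offset between the two positions. My understanding is that this is where the positional information enters the attention weights, while the values are just averaged with those weights and left unrotated.

```python
import cmath
import math


def rotate(vec, pos, theta=10000.0):
    """Rotate consecutive pairs of `vec` by angles proportional to `pos`.

    Hypothetical helper for illustration; `theta` is an assumed base.
    """
    d = len(vec)
    out = []
    for j in range(d // 2):
        freq = theta ** (-2.0 * j / d)  # one frequency per pair
        z = complex(vec[2 * j], vec[2 * j + 1]) * cmath.exp(1j * pos * freq)
        out.extend([z.real, z.imag])
    return out


def score(q, k, q_pos, k_pos):
    """Dot product of a rotated query and a rotated key (values untouched)."""
    q_rot = rotate(q, q_pos)
    k_rot = rotate(k, k_pos)
    return sum(a * b for a, b in zip(q_rot, k_rot))


# Toy vectors, made up for this example.
q = [0.3, -1.2, 0.8, 0.5]
k = [1.1, 0.4, -0.7, 0.9]

same_offset_a = score(q, k, 3, 1)   # offset 2
same_offset_b = score(q, k, 10, 8)  # offset 2, different absolute positions
other_offset = score(q, k, 3, 2)    # offset 1

print(math.isclose(same_offset_a, same_offset_b, abs_tol=1e-9))
print(math.isclose(same_offset_a, other_offset, abs_tol=1e-9))
```

If that reading is right, rotating the values as well would not change the attention weights at all, which is why I am curious whether there is a paper that argues the point explicitly.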