aphrodite-engine
Add RoPE scaling arguments to engine
Currently, we auto-scale using the --max-model-len argument. It may be more appropriate to have specific options for the scaling factor, etc.
There are some long-context models, e.g. for story writing, that it'd be nice to use with a static RoPE scaling factor. +1 on this!
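To make the request concrete, here is a minimal sketch of the difference between a factor derived from the requested context length and a user-supplied static factor. It is not the engine's actual code: the function names, the `native_max_len` parameter, and the choice of linear scaling (positions divided by the factor) are all assumptions for illustration.

```python
from typing import List, Optional


def resolve_rope_factor(
    max_model_len: int,
    native_max_len: int,
    explicit_factor: Optional[float] = None,
) -> float:
    """Pick a RoPE scaling factor (hypothetical helper, not an engine API).

    If the user supplies a static factor, use it as-is. Otherwise derive one
    from the ratio of the requested context length to the model's native
    context length, never going below 1.0 (no scaling).
    """
    if explicit_factor is not None:
        if explicit_factor < 1.0:
            raise ValueError("RoPE scaling factor must be >= 1.0")
        return float(explicit_factor)
    return max(1.0, max_model_len / native_max_len)


def linear_scaled_angles(
    position: int, head_dim: int, factor: float, base: float = 10000.0
) -> List[float]:
    """Rotary angles for one position under linear scaling.

    Linear scaling divides the position index by the factor, so a model
    trained on N positions sees indices in its native range even when the
    real sequence is factor * N tokens long.
    """
    scaled_pos = position / factor
    return [scaled_pos * base ** (-2 * i / head_dim) for i in range(head_dim // 2)]


# Derived from the requested length: 8192 requested vs. 4096 native.
auto_factor = resolve_rope_factor(max_model_len=8192, native_max_len=4096)

# Static factor chosen by the user, independent of the requested length.
static_factor = resolve_rope_factor(
    max_model_len=4096, native_max_len=4096, explicit_factor=4.0
)

angles = linear_scaled_angles(position=8, head_dim=4, factor=auto_factor)
```

With a separate option, the static factor stays fixed even if `--max-model-len` changes, which is the behaviour asked for here.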
Hi, I'm getting an error related to RoPE scaling factors with larger-context models like Microsoft Phi 3 medium in exl2 format.
I think it is related to this. Maybe not much needs to be done here beyond implementing this code. I will try to test that it doesn't break anything else. Here is the vLLM PR for this feature: https://github.com/vllm-project/vllm/pull/4638
vLLM has implemented scaled rotary embeddings like this: https://github.com/vllm-project/vllm/pull/4298
Added in v0.6.0.