Add RoPE scaling arguments to engine

Open AlpinDale opened this issue 1 year ago • 4 comments

Currently, we auto-scale using the --max-model-len argument. It may be more appropriate to have specific options for the scaling factor, etc.
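As a rough illustration of the difference between the two approaches, here is a minimal sketch. It is not the engine's actual code: the function names and the `native_max_len` parameter are hypothetical, and it assumes that "auto-scaling" means deriving the factor from the ratio of the requested context length to the model's native context length.

```python
from typing import Optional


def infer_rope_scaling_factor(max_model_len: int, native_max_len: int) -> float:
    """Derive a RoPE scaling factor from the requested context length.

    Assumption: the factor is the ratio of requested to native length,
    and it is never below 1.0 (no scaling when the request fits natively).
    """
    if max_model_len <= 0 or native_max_len <= 0:
        raise ValueError("context lengths must be positive")
    return max(1.0, max_model_len / native_max_len)


def resolve_rope_scaling_factor(
    max_model_len: int,
    native_max_len: int,
    explicit_factor: Optional[float] = None,
) -> float:
    """Prefer an explicit user-supplied factor; otherwise fall back to inference."""
    if explicit_factor is not None:
        if explicit_factor < 1.0:
            raise ValueError("scaling factor must be >= 1.0")
        return explicit_factor
    return infer_rope_scaling_factor(max_model_len, native_max_len)
```

With a dedicated option, a user could request a factor that differs from the inferred ratio, which the length-based inference alone cannot express.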

AlpinDale avatar Jan 26 '24 20:01 AlpinDale

There are some models for long context tasks like storywriting that it'd be nice to use with a static RoPE scaling factor. +1 on this!
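For context, a static factor in the linear (position interpolation) variant of RoPE scaling simply divides each position index by the factor before computing the rotary angles. The sketch below assumes that variant and the common base of 10000; the function name and parameters are illustrative and not taken from the engine.

```python
from typing import List


def rope_angles(
    position: int,
    head_dim: int,
    base: float = 10000.0,
    factor: float = 1.0,
) -> List[float]:
    """Return the rotary angles for one position under linear RoPE scaling.

    Assumption: linear scaling, where the position is divided by a fixed
    factor so that longer sequences map into the position range seen in training.
    """
    if head_dim % 2 != 0:
        raise ValueError("head_dim must be even")
    # One inverse frequency per pair of dimensions.
    inv_freq = [base ** (-2 * i / head_dim) for i in range(head_dim // 2)]
    scaled_position = position / factor
    return [scaled_position * f for f in inv_freq]
```

Under these assumptions, position 8 with a factor of 2.0 produces the same angles as position 4 with no scaling, which is why a model trained on a shorter context can address a longer one.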

jagilley avatar Feb 24 '24 00:02 jagilley

Hi, I'm getting an error related to RoPE scaling factors when loading larger-context models such as Microsoft Phi 3 medium in exl2 format.

sparsh35 avatar May 24 '24 02:05 sparsh35

I think it is related to this. Maybe not much needs to be done here beyond implementing this code. I will try to test that it doesn't break anything else. Here is the vLLM pull request for this feature: https://github.com/vllm-project/vllm/pull/4638

sparsh35 avatar May 24 '24 02:05 sparsh35

https://github.com/vllm-project/vllm/pull/4298 vLLM has implemented scaled rotary embeddings like this.

sparsh35 avatar May 24 '24 02:05 sparsh35

Added in v0.6.0.

AlpinDale avatar Sep 03 '24 13:09 AlpinDale