[Tracking] RoPE scaling support
Overview
This issue tracks support in MLC LLM for RoPE scaling, an important configurable parameter adopted by many recent models.
Action Items
- [ ] Support linear RoPE scaling in the current Llama modeling https://github.com/mlc-ai/mlc-llm/blob/e7d2ce6c0482076849e29e91e9b1e1e61c1ee277/mlc_llm/relax_model/llama.py#L212-L248 (see the sketch after this list). Report an error for dynamic RoPE scaling for now, as it is not as straightforward to support.
- [ ] Confirm that linear RoPE scaling works for long-context models (e.g., chinese-alpaca-2-7b-16k).
- [ ] Support linear RoPE scaling in the new nn.Module-based Llama model and confirm it works: https://github.com/mlc-ai/mlc-llm/blob/e7d2ce6c0482076849e29e91e9b1e1e61c1ee277/python/mlc_chat/compiler/model/llama/llama_model.py#L85-L99
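For context, linear RoPE scaling amounts to a one-line change to vanilla rotary embeddings: position indices are divided by the scaling factor before the cos/sin tables are computed, stretching the trained context window. Below is a minimal PyTorch sketch mirroring the HuggingFace reference linked under "Links to Related Issues and PRs"; the `rope_cache` name and signature are purely illustrative and not MLC LLM's actual API.

```python
import torch

def rope_cache(seq_len: int, head_dim: int, base: float = 10000.0,
               linear_scale: float = 1.0) -> tuple[torch.Tensor, torch.Tensor]:
    """Precompute RoPE cos/sin tables with optional linear position scaling."""
    # Standard rotary frequencies: one per pair of head dimensions.
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
    # Linear scaling: divide positions by the factor -- the only change
    # relative to vanilla RoPE.
    t = torch.arange(seq_len).float() / linear_scale
    freqs = torch.outer(t, inv_freq)          # (seq_len, head_dim // 2)
    emb = torch.cat((freqs, freqs), dim=-1)   # (seq_len, head_dim)
    return emb.cos(), emb.sin()

# With linear_scale=4.0, a model trained on a 4k context can address ~16k
# positions, at some cost in positional resolution unless fine-tuned.
cos, sin = rope_cache(seq_len=16384, head_dim=128, linear_scale=4.0)
```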
Links to Related Issues and PRs
- Reference implementation (HuggingFace Transformers): https://github.com/huggingface/transformers/blob/bd50402b56980ff17e957342ef69bd9b0dd45a7b/src/transformers/models/llama/modeling_llama.py#L152-L168
- https://github.com/mlc-ai/mlc-llm/issues/1327#issuecomment-1825941514
Hope somebody much smarter than me can help tackle this =)
RoPE scaling support has now been added for Llama 3.1.