[Tracking] RoPE scaling support
Overview
This issue tracks support in MLC LLM for RoPE scaling, an important configurable parameter adopted by many recent models.
Action Items
- [ ] Support linear RoPE scaling in the current Llama modeling https://github.com/mlc-ai/mlc-llm/blob/e7d2ce6c0482076849e29e91e9b1e1e61c1ee277/mlc_llm/relax_model/llama.py#L212-L248 (see the sketch after this list). Report an error for dynamic RoPE scaling for now, as it is not as straightforward to support.
- [ ] Confirm that linear RoPE scaling works for long-context models (e.g., chinese-alpaca-2-7b-16k).
- [ ] Support linear RoPE scaling in the new nn.Module-based Llama model and confirm it works: https://github.com/mlc-ai/mlc-llm/blob/e7d2ce6c0482076849e29e91e9b1e1e61c1ee277/python/mlc_chat/compiler/model/llama/llama_model.py#L85-L99
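For context, linear RoPE scaling amounts to a one-line change to vanilla rotary embeddings: position indices are divided by the scaling factor before the cos/sin tables are computed, stretching the trained context window. Below is a minimal PyTorch sketch mirroring the HuggingFace reference linked under "Links to Related Issues and PRs"; the `rope_cache` name and signature are purely illustrative and not MLC LLM's actual API.

```python
import torch

def rope_cache(seq_len: int, head_dim: int, base: float = 10000.0,
               linear_scale: float = 1.0) -> tuple[torch.Tensor, torch.Tensor]:
    """Precompute RoPE cos/sin tables with optional linear position scaling."""
    # Standard rotary frequencies: one per pair of head dimensions.
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
    # Linear scaling: divide positions by the factor -- the only change
    # relative to vanilla RoPE.
    t = torch.arange(seq_len).float() / linear_scale
    freqs = torch.outer(t, inv_freq)          # (seq_len, head_dim // 2)
    emb = torch.cat((freqs, freqs), dim=-1)   # (seq_len, head_dim)
    return emb.cos(), emb.sin()

# With linear_scale=4.0, a model trained on a 4k context can address ~16k
# positions, at some cost in positional resolution unless fine-tuned.
cos, sin = rope_cache(seq_len=16384, head_dim=128, linear_scale=4.0)
```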
Links to Related Issues and PRs
- Reference implementation (HuggingFace Transformers): https://github.com/huggingface/transformers/blob/bd50402b56980ff17e957342ef69bd9b0dd45a7b/src/transformers/models/llama/modeling_llama.py#L152-L168
- https://github.com/mlc-ai/mlc-llm/issues/1327#issuecomment-1825941514
Hope somebody much smarter than me can help tackle this =)
RoPE scaling support has now been added for Llama 3.1.