lmdeploy
[Bug] Llama 3.1 Support
Checklist
- [x] 1. I have searched related issues but cannot get the expected help.
- [ ] 2. The bug has not been fixed in the latest version.
- [ ] 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.
Describe the bug
I am running into errors when serving the latest Llama 3.1 AWQ model with the latest Docker image. I believe support may need to be added for this model?
Reproduction
```shell
docker run --runtime nvidia --gpus '"device=2"' \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HUGGING_FACE_HUB_TOKEN=TOKEN" \
  -p 23333:23333 --ipc=host \
  openmmlab/lmdeploy:latest \
  lmdeploy serve api_server hugging-quants/Meta-Llama-3.1-70B-Instruct-AWQ-INT4 \
  --backend turbomind --model-format awq
```
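As a quick sanity check before serving, one could verify whether the installed lmdeploy build recognizes the Llama 3.1 architecture at all (this assumes the `lmdeploy list` subcommand is available in this image version; the grep pattern is illustrative):

```shell
# List the model names the installed lmdeploy recognizes and filter for llama;
# if no llama3.1 entry appears, the image predates Llama 3.1 support.
docker run --rm openmmlab/lmdeploy:latest lmdeploy list | grep -i llama
```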
Environment
Latest Docker image pulled (openmmlab/lmdeploy:latest)
Error traceback
No response