[Feature] Add a parameter to the lmdeploy PyTorch engine for choosing the precision used to load model weights
Motivation
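Hardware: T4, 16 GB
Requirement:
Add a load-precision parameter to the PyTorch engine, similar to vLLM's dtype option, [--dtype {auto,half,float16,bfloat16,float,float32}], so that users can explicitly choose the load/inference precision according to what their hardware supports.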
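To make the request concrete, here is a minimal sketch of how such an option could be parsed and resolved. It is not lmdeploy's or vLLM's implementation: `resolve_dtype`, `build_parser`, and the "fall back to float16 when bfloat16 is unsupported" policy for `auto` are all assumptions made for illustration. Only the list of choices comes from the option quoted above.

```python
import argparse

# Choices copied from the dtype option quoted in the request.
DTYPE_CHOICES = ["auto", "half", "float16", "bfloat16", "float", "float32"]
# Short aliases mapped to canonical torch dtype names.
_ALIASES = {"half": "float16", "float": "float32"}


def resolve_dtype(requested: str, config_dtype: str, bf16_supported: bool) -> str:
    """Return the canonical dtype name to load weights with (hypothetical helper)."""
    if requested == "auto":
        chosen = _ALIASES.get(config_dtype, config_dtype)
        # Assumed policy: silently fall back to float16 when the checkpoint
        # asks for bfloat16 but the device cannot run it.
        if chosen == "bfloat16" and not bf16_supported:
            return "float16"
        return chosen
    chosen = _ALIASES.get(requested, requested)
    # An explicit request is never overridden; fail loudly instead.
    if chosen == "bfloat16" and not bf16_supported:
        raise ValueError("bfloat16 was requested but the device does not support it")
    return chosen


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing a hypothetical --dtype flag."""
    parser = argparse.ArgumentParser(description="hypothetical dtype option")
    parser.add_argument("--dtype", choices=DTYPE_CHOICES, default="auto")
    return parser
```

With a flag like this, the case reported here (a checkpoint whose config asks for bfloat16, on a device without bfloat16 support) would resolve to float16 under `auto`, or the user could pass `--dtype half` explicitly.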
Command
lmdeploy serve api_server Qwen/Qwen1.5-1.8B-Chat --server-port 23333 --cache-max-entry-count 0.5
Error
2024-04-06 01:12:36,466 - lmdeploy - ERROR - AssertionError: bf16 is not supported on your device
2024-04-06 01:12:36,466 - lmdeploy - ERROR - <Model> test failed!
Your device does not support torch.bfloat16. Try edit torch_dtype in config.json.
Note that this might have negative effect!
Related resources
No response
Additional context
No response
Implementing this will take some time. If you need it urgently, you can edit config.json as a workaround for now.
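As a rough sketch of that workaround, the snippet below rewrites the `torch_dtype` field named in the error message. It assumes the file is the model's Hugging Face style `config.json` and that `float16` is an acceptable replacement on the device. The helper name `set_torch_dtype` is made up for this example, and, as the error message warns, changing the dtype might affect output quality.

```python
import json
from pathlib import Path


def set_torch_dtype(config_path, dtype="float16"):
    """Overwrite torch_dtype in a model config.json and return the old value."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    previous = config.get("torch_dtype")
    config["torch_dtype"] = dtype
    # Write the file back, keeping every other key unchanged.
    path.write_text(
        json.dumps(config, indent=2, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )
    return previous
```

Calling `set_torch_dtype("/path/to/model/config.json")` would switch the model to float16; the path here is a placeholder. Keeping the returned previous value makes it easy to restore the original setting later.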
OK
Which config.json should be modified? I couldn't find it.
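The thread does not say which file is meant. The error message points at `torch_dtype`, which is a field of the model's own `config.json`, so the sketch below looks for that file. It assumes the model was downloaded by repo id through `huggingface_hub` into its default cache layout (`models--<org>--<name>/snapshots/<revision>/`). If a local model directory was passed instead, the file would simply sit in that directory. `default_hub_cache` and `find_cached_configs` are hypothetical helper names.

```python
import os
from pathlib import Path


def default_hub_cache() -> Path:
    """Best-effort guess at the Hugging Face hub cache directory."""
    if os.environ.get("HF_HUB_CACHE"):
        return Path(os.environ["HF_HUB_CACHE"])
    if os.environ.get("HF_HOME"):
        return Path(os.environ["HF_HOME"]) / "hub"
    return Path.home() / ".cache" / "huggingface" / "hub"


def find_cached_configs(repo_id, cache_root=None):
    """List config.json files cached for repo_id, one per downloaded revision."""
    root = Path(cache_root) if cache_root else default_hub_cache()
    repo_dir = root / ("models--" + repo_id.replace("/", "--"))
    if not repo_dir.is_dir():
        return []
    return sorted(repo_dir.glob("snapshots/*/config.json"))
```

For the command in this issue, `find_cached_configs("Qwen/Qwen1.5-1.8B-Chat")` would list candidate files if the assumptions above hold. An empty list suggests the model was fetched some other way or stored elsewhere.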