
[Feature] Add a parameter to the lmdeploy PyTorch engine for the precision used when loading weights.

Open hello-gary-2022 opened this issue 1 year ago • 3 comments

Motivation

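Hardware: T4, 16 GB

Request:

Add a load-precision parameter to the PyTorch engine, similar to vLLM's dtype option, [--dtype {auto,half,float16,bfloat16,float,float32}], so that users can choose the load/inference precision themselves according to what their hardware supports.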
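To make the request concrete, here is a minimal sketch of how such an option might be resolved to a dtype name. It is not lmdeploy's or vLLM's code: `resolve_dtype`, its arguments, and the fallback rule for `auto` are hypothetical. In a real engine the `bf16_supported` flag could come from something like `torch.cuda.is_bf16_supported()`.

```python
def resolve_dtype(requested, checkpoint_dtype, bf16_supported):
    """Map a --dtype style option to a concrete dtype name (hypothetical helper)."""
    # Short aliases, taken from the option list quoted in the request.
    aliases = {"half": "float16", "float": "float32"}
    requested = aliases.get(requested, requested)

    if requested == "auto":
        # Assumption: float32 checkpoints are loaded in float16,
        # everything else keeps the dtype stored in the checkpoint config.
        chosen = "float16" if checkpoint_dtype == "float32" else checkpoint_dtype
        # Assumption: "auto" quietly falls back when the device lacks bf16.
        if chosen == "bfloat16" and not bf16_supported:
            chosen = "float16"
        return chosen

    if requested not in {"float16", "bfloat16", "float32"}:
        raise ValueError(f"unknown dtype option: {requested!r}")
    if requested == "bfloat16" and not bf16_supported:
        # An explicit request is not silently changed.
        raise ValueError("bfloat16 was requested but the device does not support it")
    return requested
```

With a rule like this, a bfloat16 checkpoint on a device without bf16 support would load as float16 under `auto`, and passing `half` would force float16 regardless of what the checkpoint declares.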

Command

lmdeploy serve api_server Qwen/Qwen1.5-1.8B-Chat --server-port 23333 --cache-max-entry-count 0.5

Error

2024-04-06 01:12:36,466 - lmdeploy - ERROR - AssertionError: bf16 is not supported on your device
2024-04-06 01:12:36,466 - lmdeploy - ERROR - <Model> test failed! Your device does not support torch.bfloat16. Try edit torch_dtype in config.json. Note that this might have negative effect!

Related resources

No response

Additional context

No response

hello-gary-2022 avatar Apr 06 '24 08:04 hello-gary-2022

Implementing this will take some time. If you need it urgently, you can edit config.json for now.

grimoire avatar Apr 07 '24 05:04 grimoire

OK

hello-gary-2022 avatar Apr 08 '24 02:04 hello-gary-2022

Which config.json should be modified? I couldn't find it.

George-TQL avatar Jul 23 '24 07:07 George-TQL
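
The workaround mentioned in the thread and in the error message is to change `torch_dtype` in config.json. The thread never says which file that is. The sketch below assumes it is the `config.json` inside the model's own directory (for a model fetched by name from Hugging Face, probably a snapshot folder under the local Hugging Face cache). The helper `set_torch_dtype` is hypothetical and not part of lmdeploy. As the error message warns, changing the dtype may affect output quality.

```python
import json
import tempfile
from pathlib import Path


def set_torch_dtype(model_dir, dtype="float16"):
    """Set torch_dtype in <model_dir>/config.json and return the previous value."""
    config_path = Path(model_dir) / "config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))
    previous = config.get("torch_dtype")
    config["torch_dtype"] = dtype
    # Write the file back, leaving every other field untouched.
    config_path.write_text(
        json.dumps(config, indent=2, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )
    return previous


if __name__ == "__main__":
    # Demo on a throwaway directory; point model_dir at a real local
    # copy of the model to apply the change.
    with tempfile.TemporaryDirectory() as tmp:
        demo = Path(tmp) / "config.json"
        demo.write_text(json.dumps({"torch_dtype": "bfloat16"}), encoding="utf-8")
        old = set_torch_dtype(tmp, "float16")
        print(f"torch_dtype: {old} -> float16")
```

After editing a local copy of the model directory, passing that directory path to `lmdeploy serve api_server` in place of the model name should make the engine read the modified config.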