lmdeploy [Feature] Add a parameter to the lmdeploy PyTorch engine for choosing the weight loading precision.

Motivation

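Hardware: T4, 16 GB

Request:

Add a loading-precision parameter to the PyTorch engine, similar to vLLM's dtype option, [--dtype {auto,half,float16,bfloat16,float,float32}], so that users can choose the loading/inference precision themselves according to what their hardware supports.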
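As a rough illustration of the request, the sketch below shows how such an option could be parsed and resolved. It is not lmdeploy's or vLLM's actual implementation: `resolve_dtype`, `build_parser`, and the fallback from bfloat16 to float16 under `auto` are all assumptions made for this example; only the list of choices comes from the request above.

```python
import argparse

# Choices copied from the vLLM-style option quoted in the request.
DTYPE_CHOICES = ["auto", "half", "float16", "bfloat16", "float", "float32"]
# Short names mapped to their canonical dtype names.
_ALIASES = {"half": "float16", "float": "float32"}


def resolve_dtype(requested: str, config_dtype: str, bf16_supported: bool) -> str:
    """Hypothetical helper: decide which dtype to load the weights in.

    requested      -- value of the (hypothetical) --dtype option
    config_dtype   -- torch_dtype recorded in the model's config.json
    bf16_supported -- whether the device can run bfloat16 (False on a T4)
    """
    if requested not in DTYPE_CHOICES:
        raise ValueError(f"unknown dtype: {requested!r}")
    if requested == "auto":
        # Follow the checkpoint, but fall back to float16 when the device
        # cannot run bfloat16 (assumed policy, not a documented one).
        chosen = _ALIASES.get(config_dtype, config_dtype)
        if chosen == "bfloat16" and not bf16_supported:
            chosen = "float16"
        return chosen
    chosen = _ALIASES.get(requested, requested)
    if chosen == "bfloat16" and not bf16_supported:
        # An explicit request that the device cannot honor is an error.
        raise ValueError("bfloat16 was requested but is not supported on this device")
    return chosen


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical CLI fragment exposing the option."""
    parser = argparse.ArgumentParser(description="dtype option sketch")
    parser.add_argument("--dtype", default="auto", choices=DTYPE_CHOICES)
    return parser
```

With a policy like this, a T4 user could either pass `--dtype float16` explicitly or rely on `auto` to avoid the bf16 assertion.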

Command

lmdeploy serve api_server Qwen/Qwen1.5-1.8B-Chat --server-port 23333 --cache-max-entry-count 0.5

Error

2024-04-06 01:12:36,466 - lmdeploy - ERROR - AssertionError: bf16 is not supported on your device
2024-04-06 01:12:36,466 - lmdeploy - ERROR - <Model> test failed! Your device does not support torch.bfloat16. Try edit torch_dtype in config.json. Note that this might have negative effect!

Related resources

No response

Additional context

No response

Apr 06 '24 08:04 hello-gary-2022

This will take some time to implement. If you need it urgently, you can edit config.json for now.
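A minimal sketch of that workaround, assuming the model has been downloaded to a local directory and that the file in question is the Hugging Face style config.json inside it (the one carrying the `torch_dtype` field named in the error message). The helper name `set_torch_dtype` and the directory path are hypothetical.

```python
import json
from pathlib import Path
from typing import Optional


def set_torch_dtype(model_dir: str, dtype: str = "float16") -> Optional[str]:
    """Rewrite torch_dtype in <model_dir>/config.json and return the old value.

    model_dir is assumed to be a local copy of the model; the helper itself
    is hypothetical and not part of lmdeploy.
    """
    config_path = Path(model_dir) / "config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))
    previous = config.get("torch_dtype")
    config["torch_dtype"] = dtype
    # Write the file back, leaving every other field unchanged.
    config_path.write_text(
        json.dumps(config, indent=2, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )
    return previous
```

After something like `set_torch_dtype("/path/to/Qwen1.5-1.8B-Chat")`, the serve command would presumably need to point at that local directory instead of the hub ID so the edited file is the one that gets read. As the error message warns, changing the dtype this way may affect output quality.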

Apr 07 '24 05:04 grimoire

OK

Apr 08 '24 02:04 hello-gary-2022

Which config.json should be modified? I couldn't find it.

Jul 23 '24 07:07 George-TQL