
[Bug] Quantizing the InternVL3-8B-hf model with the lmdeploy script fails with an error

Status: Open · the-nine-nation opened this issue 4 months ago • 2 comments

Checklist

  • [x] 1. I have searched related issues but cannot get the expected help.
  • [x] 2. The bug has not been fixed in the latest version.
  • [x] 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.

Describe the bug

After fine-tuning this model with LoRA and merging the adapter weights, AWQ quantization with lmdeploy fails with the following error:

2025-08-08 06:20:51,227 - lmdeploy - INFO - builder.py:65 - matching vision model: InternVL3VisionModel
Traceback (most recent call last):
  File "/home/lzy/miniforge3/envs/lmdeploy/bin/lmdeploy", line 8, in <module>
    sys.exit(run())
  File "/home/lzy/miniforge3/envs/lmdeploy/lib/python3.10/site-packages/lmdeploy/cli/entrypoint.py", line 39, in run
    args.run(args)
  File "/home/lzy/miniforge3/envs/lmdeploy/lib/python3.10/site-packages/lmdeploy/cli/lite.py", line 111, in auto_awq
    auto_awq(**kwargs)
  File "/home/lzy/miniforge3/envs/lmdeploy/lib/python3.10/site-packages/lmdeploy/lite/apis/auto_awq.py", line 86, in auto_awq
    vl_model, model, tokenizer, work_dir = calibrate(model,
  File "/home/lzy/miniforge3/envs/lmdeploy/lib/python3.10/site-packages/lmdeploy/lite/apis/calibrate.py", line 253, in calibrate
    vl_model = load_vl_model(model, backend=None, with_llm=True).vl_model
  File "/home/lzy/miniforge3/envs/lmdeploy/lib/python3.10/site-packages/lmdeploy/vl/model/builder.py", line 71, in load_vl_model
    model.build_model()
  File "/home/lzy/miniforge3/envs/lmdeploy/lib/python3.10/site-packages/lmdeploy/vl/model/internvl3_hf.py", line 75, in build_model
    load_checkpoint_and_dispatch(model=model,
  File "/home/lzy/miniforge3/envs/lmdeploy/lib/python3.10/site-packages/accelerate/big_modeling.py", line 642, in load_checkpoint_and_dispatch
    return dispatch_model(
  File "/home/lzy/miniforge3/envs/lmdeploy/lib/python3.10/site-packages/accelerate/big_modeling.py", line 502, in dispatch_model
    model.to(device)
  File "/home/lzy/miniforge3/envs/lmdeploy/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3851, in to
    return super().to(*args, **kwargs)
  File "/home/lzy/miniforge3/envs/lmdeploy/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1343, in to
    return self._apply(convert)
  File "/home/lzy/miniforge3/envs/lmdeploy/lib/python3.10/site-packages/torch/nn/modules/module.py", line 903, in _apply
    module._apply(fn)
  File "/home/lzy/miniforge3/envs/lmdeploy/lib/python3.10/site-packages/torch/nn/modules/module.py", line 903, in _apply
    module._apply(fn)
  File "/home/lzy/miniforge3/envs/lmdeploy/lib/python3.10/site-packages/torch/nn/modules/module.py", line 930, in _apply
    param_applied = fn(param)
  File "/home/lzy/miniforge3/envs/lmdeploy/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1336, in convert
    raise NotImplementedError(
NotImplementedError: Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() when moving module from meta to a different device.
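
For context, the NotImplementedError at the bottom comes from PyTorch's handling of meta tensors: parameters created on the meta device carry only shape and dtype metadata, with no storage behind them, so they cannot be copied with .to(). A minimal sketch that reproduces the same failure mode (independent of lmdeploy; the module and sizes are just for illustration):

import torch
import torch.nn as nn

# Parameters created on the "meta" device have no underlying storage,
# so there is nothing to copy when the module is moved to a real device.
with torch.device("meta"):
    layer = nn.Linear(16, 16)

try:
    layer.to("cpu")  # same failure mode as dispatch_model() in the traceback
except NotImplementedError as err:
    print(err)  # Cannot copy out of meta tensor; no data! ...

# to_empty() allocates uninitialized storage on the target device instead;
# real weights then still have to be loaded into it, which is what
# load_checkpoint_and_dispatch is supposed to do for every parameter.
layer = layer.to_empty(device="cpu")

So the error suggests that some parameters were never materialized from the meta device before the model was moved to the GPU, possibly because names in the merged checkpoint do not match what load_checkpoint_and_dispatch expects.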

Reproduction

lmdeploy lite auto_awq InternVL3-8B-hf-sft --work-dir . --dtype bfloat16
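
For completeness, the InternVL3-8B-hf-sft checkpoint quantized above was produced by merging LoRA adapters into the base model. A minimal sketch of that merge step with peft; the Auto class, adapter path, and output directory are assumptions for illustration and may differ from the actual training setup:

import torch
from transformers import AutoModelForImageTextToText, AutoProcessor
from peft import PeftModel

# Load the HF-format base model (assumed class; check the model card).
base = AutoModelForImageTextToText.from_pretrained(
    "OpenGVLab/InternVL3-8B-hf", torch_dtype=torch.bfloat16)

# Attach the trained LoRA adapters (hypothetical path) and fold them
# into the base weights, then save a plain merged checkpoint.
model = PeftModel.from_pretrained(base, "path/to/lora-adapter")
merged = model.merge_and_unload()
merged.save_pretrained("InternVL3-8B-hf-sft")

processor = AutoProcessor.from_pretrained("OpenGVLab/InternVL3-8B-hf")
processor.save_pretrained("InternVL3-8B-hf-sft")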

Environment

lmdeploy                  0.9.2
GPU: 4090

Error traceback


the-nine-nation · Aug 08 '25 06:08

Same issue here, identical error. The non-HF InternVL3-8B can be quantized directly without any problem.

zzb213213 · Aug 19 '25 05:08

I have worked around the problem for now: first convert the HF model to the custom (non-HF) format with the official script, then quantize that (the transformers and datasets package versions have to be changed for this to work). However, the post-quantization inference quality is absurdly bad: the output format is learned correctly, but the predicted values look as if they were filled in at random. Does the calibration dataset used during quantization need to be changed?
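
For reference, lmdeploy lite auto_awq exposes the calibration settings on the command line, so the calibration dataset and sample count can be changed without touching the code. The invocation below is only a sketch of my understanding of those flags (the model path is a placeholder); check lmdeploy lite auto_awq --help for the authoritative option names and defaults. As far as I know the calibration data is plain text fed to the language-model part only.

lmdeploy lite auto_awq ./InternVL3-8B-sft-custom \
    --calib-dataset ptb \
    --calib-samples 128 \
    --calib-seqlen 2048 \
    --w-bits 4 \
    --w-group-size 128 \
    --work-dir ./internvl3-8b-sft-awq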

the-nine-nation · Aug 26 '25 01:08