[Bug] Severe accuracy drop after AWQ W4A16 quantization of InternVL2-2B: how should I debug it?
Checklist
- [X] 1. I have searched related issues but cannot get the expected help.
- [X] 2. The bug has not been fixed in the latest version.
- [X] 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.
Describe the bug
I quantized my SFT-tuned InternVL2-2B with LMDeploy AWQ, and the accuracy dropped severely. How should I debug this? Is there a troubleshooting SOP or documentation?
Reproduction
lmdeploy lite auto_awq \
  $HF_MODEL \
  --calib-samples 128 \
  --calib-seqlen 1024 \
  --w-bits 4 \
  --w-group-size 128 \
  --batch-size 128 \
  --search-scale False \
  --work-dir $WORK_DIR
Environment
A800-80G
CUDA 11.8
Error traceback
No response
Try lmdeploy lite auto_awq $HF_MODEL --work-dir $WORK_DIR (i.e., the default settings). How large is the drop? To debug, you can run the AWQ model with the PyTorch engine and compare its outputs layer by layer against the original model to locate where the error is introduced.
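As a starting point for the layer-by-layer comparison suggested above, here is a minimal sketch. It assumes both checkpoints load through transformers `AutoModel` with `trust_remote_code=True` (the AWQ checkpoint may need additional quantization support to load this way), and that InternVL2's LLM layers live at `model.language_model.model.layers`; run `print(model)` to confirm the module path for your checkpoint. The paths are placeholders.

```python
# Hypothetical layer-wise probe: feed the same text input through the FP16
# and AWQ checkpoints and report per-layer output divergence.
import torch
from transformers import AutoModel, AutoTokenizer

FP16_DIR = "/path/to/internvl2-2b"     # original SFT checkpoint (placeholder)
AWQ_DIR = "/path/to/internvl2-2b-awq"  # $WORK_DIR from the quantization step

def capture_layer_outputs(model, input_ids):
    """Register forward hooks on every decoder layer and collect outputs."""
    outputs, hooks = {}, []
    # NOTE: assumed module path for InternVL2; verify with print(model).
    for i, layer in enumerate(model.language_model.model.layers):
        def hook(module, args, output, idx=i):
            hidden = output[0] if isinstance(output, tuple) else output
            outputs[idx] = hidden.detach().float().cpu()
        hooks.append(layer.register_forward_hook(hook))
    with torch.no_grad():
        # Text-only forward through the LLM is enough to localize LLM-side
        # quantization error; the vision tower is skipped here.
        model.language_model(input_ids=input_ids)
    for h in hooks:
        h.remove()
    return outputs

tokenizer = AutoTokenizer.from_pretrained(FP16_DIR, trust_remote_code=True)
input_ids = tokenizer("Describe the image.", return_tensors="pt").input_ids.cuda()

fp16 = AutoModel.from_pretrained(FP16_DIR, torch_dtype=torch.float16,
                                 trust_remote_code=True).cuda().eval()
ref = capture_layer_outputs(fp16, input_ids)
del fp16
torch.cuda.empty_cache()

awq = AutoModel.from_pretrained(AWQ_DIR, torch_dtype=torch.float16,
                                trust_remote_code=True).cuda().eval()
quant = capture_layer_outputs(awq, input_ids)

# A layer where the error jumps sharply is a good quantization suspect.
for idx in sorted(ref):
    err = (ref[idx] - quant[idx]).abs().max().item()
    print(f"layer {idx:2d}  max abs diff = {err:.4f}")
```

To run the AWQ model on the PyTorch engine as suggested, the usual entry point is `pipeline(AWQ_DIR, backend_config=PytorchEngineConfig())` from the lmdeploy Python API.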
The drop is severe.
You can run a GSM8K evaluation before and after quantization to see how many percentage points the model actually drops.
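A rough before/after check along these lines could look like the sketch below. It assumes lmdeploy's Python `pipeline` API and the Hugging Face datasets copy of GSM8K; the prompt wording, sample count, and numeric answer extraction are simplified placeholders rather than a standard harness (OpenCompass is the more rigorous route).

```python
# Rough GSM8K accuracy probe before/after quantization (hypothetical paths).
import re
from datasets import load_dataset
from lmdeploy import pipeline, GenerationConfig

def gsm8k_accuracy(model_path: str, n: int = 200) -> float:
    ds = load_dataset("gsm8k", "main", split=f"test[:{n}]")
    pipe = pipeline(model_path)
    prompts = [
        q + "\nThink step by step, then give the final number after '####'."
        for q in ds["question"]
    ]
    responses = pipe(prompts, gen_config=GenerationConfig(max_new_tokens=512))
    correct = 0
    for resp, ans in zip(responses, ds["answer"]):
        gold = ans.split("####")[-1].strip().replace(",", "")
        nums = re.findall(r"-?\d[\d,]*\.?\d*", resp.text)
        if nums and nums[-1].replace(",", "") == gold:
            correct += 1
    return correct / len(ds)

# Compare the original SFT checkpoint against the AWQ work dir.
# Run the two calls in separate processes if GPU memory is tight.
print("fp16:", gsm8k_accuracy("/path/to/internvl2-2b"))
print("awq :", gsm8k_accuracy("/path/to/internvl2-2b-awq"))
```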
This issue is marked as stale because it has been marked as invalid or awaiting response for 7 days without any further response. It will be closed in 5 days if the stale label is not removed or if there is no further response.
This issue is closed because it has been stale for 5 days. Please open a new issue if you have similar issues or you have any new updates now.