[Bug] Severe accuracy drop after AWQ W4A16 quantization of InternVL2-2B: how should I debug it?
Checklist
- [X] 1. I have searched related issues but cannot get the expected help.
- [X] 2. The bug has not been fixed in the latest version.
- [X] 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.
Describe the bug
I quantized my SFT-tuned InternVL2-2B with LMDeploy AWQ, and the accuracy dropped severely. How should I debug this? Is there a troubleshooting SOP or documentation?
Reproduction
lmdeploy lite auto_awq \
  $HF_MODEL \
  --calib-samples 128 \
  --calib-seqlen 1024 \
  --w-bits 4 \
  --w-group-size 128 \
  --batch-size 128 \
  --search-scale False \
  --work-dir $WORK_DIR
Environment
A800-80G
CUDA 11.8
Error traceback
No response
Try lmdeploy lite auto_awq $HF_MODEL --work-dir $WORK_DIR (i.e., the default settings). How large is the drop? To debug, you can run the AWQ model with the PyTorch engine and compare its outputs layer by layer against the original model to locate where the error is introduced.
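As a starting point for the layer-by-layer comparison suggested above, here is a minimal sketch. It assumes both checkpoints load through transformers `AutoModel` with `trust_remote_code=True` (the AWQ checkpoint may need additional quantization support to load this way), and that InternVL2's LLM layers live at `model.language_model.model.layers`; run `print(model)` to confirm the module path for your checkpoint. The paths are placeholders.

```python
# Hypothetical layer-wise probe: feed the same text input through the FP16
# and AWQ checkpoints and report per-layer output divergence.
import torch
from transformers import AutoModel, AutoTokenizer

FP16_DIR = "/path/to/internvl2-2b"     # original SFT checkpoint (placeholder)
AWQ_DIR = "/path/to/internvl2-2b-awq"  # $WORK_DIR from the quantization step

def capture_layer_outputs(model, input_ids):
    """Register forward hooks on every decoder layer and collect outputs."""
    outputs, hooks = {}, []
    # NOTE: assumed module path for InternVL2; verify with print(model).
    for i, layer in enumerate(model.language_model.model.layers):
        def hook(module, args, output, idx=i):
            hidden = output[0] if isinstance(output, tuple) else output
            outputs[idx] = hidden.detach().float().cpu()
        hooks.append(layer.register_forward_hook(hook))
    with torch.no_grad():
        # Text-only forward through the LLM is enough to localize LLM-side
        # quantization error; the vision tower is skipped here.
        model.language_model(input_ids=input_ids)
    for h in hooks:
        h.remove()
    return outputs

tokenizer = AutoTokenizer.from_pretrained(FP16_DIR, trust_remote_code=True)
input_ids = tokenizer("Describe the image.", return_tensors="pt").input_ids.cuda()

fp16 = AutoModel.from_pretrained(FP16_DIR, torch_dtype=torch.float16,
                                 trust_remote_code=True).cuda().eval()
ref = capture_layer_outputs(fp16, input_ids)
del fp16
torch.cuda.empty_cache()

awq = AutoModel.from_pretrained(AWQ_DIR, torch_dtype=torch.float16,
                                trust_remote_code=True).cuda().eval()
quant = capture_layer_outputs(awq, input_ids)

# A layer where the error jumps sharply is a good quantization suspect.
for idx in sorted(ref):
    err = (ref[idx] - quant[idx]).abs().max().item()
    print(f"layer {idx:2d}  max abs diff = {err:.4f}")
```

To run the AWQ model on the PyTorch engine as suggested, the usual entry point is `pipeline(AWQ_DIR, backend_config=PytorchEngineConfig())` from the lmdeploy Python API.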
The drop is severe.
You can run a GSM8K evaluation before and after quantization to see how many percentage points the model actually drops.
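A rough before/after check along these lines could look like the sketch below. It assumes lmdeploy's Python `pipeline` API and the Hugging Face datasets copy of GSM8K; the prompt wording, sample count, and numeric answer extraction are simplified placeholders rather than a standard harness (OpenCompass is the more rigorous route).

```python
# Rough GSM8K accuracy probe before/after quantization (hypothetical paths).
import re
from datasets import load_dataset
from lmdeploy import pipeline, GenerationConfig

def gsm8k_accuracy(model_path: str, n: int = 200) -> float:
    ds = load_dataset("gsm8k", "main", split=f"test[:{n}]")
    pipe = pipeline(model_path)
    prompts = [
        q + "\nThink step by step, then give the final number after '####'."
        for q in ds["question"]
    ]
    responses = pipe(prompts, gen_config=GenerationConfig(max_new_tokens=512))
    correct = 0
    for resp, ans in zip(responses, ds["answer"]):
        gold = ans.split("####")[-1].strip().replace(",", "")
        nums = re.findall(r"-?\d[\d,]*\.?\d*", resp.text)
        if nums and nums[-1].replace(",", "") == gold:
            correct += 1
    return correct / len(ds)

# Compare the original SFT checkpoint against the AWQ work dir.
# Run the two calls in separate processes if GPU memory is tight.
print("fp16:", gsm8k_accuracy("/path/to/internvl2-2b"))
print("awq :", gsm8k_accuracy("/path/to/internvl2-2b-awq"))
```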
This issue is marked as stale because it has been marked as invalid or awaiting response for 7 days without any further response. It will be closed in 5 days if the stale label is not removed or if there is no further response.
This issue is closed because it has been stale for 5 days. Please open a new issue if you have similar issues or you have any new updates now.