
[Bug] Severe accuracy drop after AWQ W4A16 quantization of InternVL2-2B — how should I debug this?

Open Howe-Young opened this issue 1 year ago • 2 comments

Checklist

  • [X] 1. I have searched related issues but cannot get the expected help.
  • [X] 2. The bug has not been fixed in the latest version.
  • [X] 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.

Describe the bug

After quantizing my SFT-tuned InternVL2-2B with LMDeploy's AWQ, the accuracy dropped severely. How should I debug this? Is there a debugging SOP or any documentation for it?

Reproduction

lmdeploy lite auto_awq \
    $HF_MODEL \
    --calib-samples 128 \
    --calib-seqlen 1024 \
    --w-bits 4 \
    --w-group-size 128 \
    --batch-size 128 \
    --search-scale False \
    --work-dir $WORK_DIR

Environment

A800-80G
CUDA11.8

Error traceback

No response

Howe-Young avatar Aug 08 '24 06:08 Howe-Young

Try `lmdeploy lite auto_awq $HF_MODEL --work-dir $WORK_DIR`. How large is the accuracy drop? To debug, you can run the AWQ model with the PyTorch engine and compare its outputs layer by layer against the original model to locate where the error is introduced.

AllentDan avatar Aug 15 '24 04:08 AllentDan
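The layer-by-layer comparison suggested above can be sketched as follows: feed the same prompt to the FP16 and AWQ models, capture each layer's output (e.g. via forward hooks), then find the first layer whose error explodes. This is a minimal sketch with fake activations; the layer names, dict-of-activations representation, and tolerance are illustrative assumptions, not LMDeploy's actual debugging API.

```python
def first_divergent_layer(fp16_outs, awq_outs, tol=1e-1):
    """Return (layer_name, max_abs_err) for the first layer whose
    elementwise error exceeds `tol`, or None if all layers are close.
    `fp16_outs` / `awq_outs`: dicts mapping layer name -> flat list of floats,
    in forward order (e.g. collected with register_forward_hook)."""
    for name, ref in fp16_outs.items():
        err = max(abs(a - b) for a, b in zip(ref, awq_outs[name]))
        if err > tol:
            return name, err
    return None

# Fake activations: quantization noise is small in layers.0 but blows up
# in layers.1, so layers.1 is flagged as the first divergent layer.
fp16 = {"layers.0": [0.10, 0.20], "layers.1": [1.00, 2.00]}
awq  = {"layers.0": [0.11, 0.19], "layers.1": [1.00, 2.50]}
print(first_divergent_layer(fp16, awq))  # -> ('layers.1', 0.5)
```

In practice a relative error (or cosine similarity per layer) is often more informative than absolute error, since activation magnitudes vary widely across layers.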

The accuracy drop is severe.

You can run a GSM8K eval before and after quantization to see exactly how many percentage points it drops.

zhyncs avatar Aug 15 '24 07:08 zhyncs
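Measuring the before/after drop can be sketched like this: score each model's completions against the GSM8K gold answers and compare the accuracies. The answer-extraction rule below (last number in the completion vs. the value after `####` in the reference) is a common GSM8K convention, not LMDeploy's or any specific harness's exact eval code.

```python
import re

def gsm8k_accuracy(predictions, references):
    """Fraction of predictions whose last number matches the gold answer.
    GSM8K references end in '#### <answer>'."""
    correct = 0
    for pred, ref in zip(predictions, references):
        gold = ref.split("####")[-1].strip()
        nums = re.findall(r"-?\d+(?:\.\d+)?", pred)
        if nums and nums[-1] == gold:
            correct += 1
    return correct / len(predictions)

# Toy example: one question, answered correctly by FP16 but not by AWQ.
fp16_acc = gsm8k_accuracy(["The answer is 42."], ["... #### 42"])
awq_acc  = gsm8k_accuracy(["The answer is 41."], ["... #### 42"])
print(f"drop: {(fp16_acc - awq_acc) * 100:.1f} points")  # -> drop: 100.0 points
```

With a real eval you would run the full test split through both engines; a drop of a few points is typical for W4A16, while a double-digit drop usually points to a calibration or quantization bug.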

This issue is marked as stale because it has been marked as invalid or awaiting response for 7 days without any further response. It will be closed in 5 days if the stale label is not removed or if there is no further response.

github-actions[bot] avatar Aug 23 '24 02:08 github-actions[bot]

This issue is closed because it has been stale for 5 days. Please open a new issue if you have similar issues or you have any new updates now.

github-actions[bot] avatar Aug 28 '24 02:08 github-actions[bot]