ms-swift
torch._C._LinAlgError: linalg.cholesky still fails even after raising quant_n_samples to 2048
Describe the bug
What the bug is and how to reproduce it, ideally with screenshots
CUDA_VISIBLE_DEVICES=0 swift export --ckpt_dir 'output/qwen2-7b-instruct/v2-20240731-102444/checkpoint-30' --merge_lora true --quant_bits 4 --dataset alpaca-zh alpaca-en --quant_method gptq --quant_n_samples=2048
Running this command to quantize the fine-tuned model fails with the following error:
Traceback (most recent call last):
  File "/home/ubuntu/miniconda3/envs/qwen2_finetune/lib/python3.9/site-packages/swift/cli/export.py", line 5, in <module>
    export_main()
  File "/home/ubuntu/miniconda3/envs/qwen2_finetune/lib/python3.9/site-packages/swift/utils/run_utils.py", line 27, in x_main
    result = llm_x(args, **kwargs)
  File "/home/ubuntu/miniconda3/envs/qwen2_finetune/lib/python3.9/site-packages/swift/llm/export.py", line 192, in llm_export
    gptq_quantizer = gptq_model_quantize(model, template.tokenizer)
  File "/home/ubuntu/miniconda3/envs/qwen2_finetune/lib/python3.9/site-packages/swift/llm/export.py", line 91, in gptq_model_quantize
    gptq_quantizer.quantize_model(model, tokenizer)
  File "/home/ubuntu/miniconda3/envs/qwen2_finetune/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/home/ubuntu/miniconda3/envs/qwen2_finetune/lib/python3.9/site-packages/optimum/gptq/quantizer.py", line 518, in quantize_model
    scale, zero, g_idx = gptq[name].fasterquant(
  File "/home/ubuntu/miniconda3/envs/qwen2_finetune/lib/python3.9/site-packages/auto_gptq/quantization/gptq.py", line 116, in fasterquant
    H = torch.linalg.cholesky(H)
torch._C._LinAlgError: linalg.cholesky: The factorization could not be completed because the input is not positive-definite (the leading minor of order 1 is not positive-definite).
Similar issues suggested upgrading optimum and increasing quant_n_samples; I tried both and the error persists. How can this be resolved?
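For reference, here is a minimal, self-contained sketch (my own illustration, not the swift/auto_gptq code) of the failure mode: torch.linalg.cholesky raises exactly this error whenever the Hessian-like matrix H built from calibration activations is not strictly positive-definite, and a diagonal damping term (exposed as damp_percent in auto_gptq/optimum, if I understand correctly) is what normally keeps it factorizable:

```python
import torch

# Minimal sketch (illustration only, NOT the library code): GPTQ factorizes a
# Hessian-like matrix H = X^T X built from calibration activations. If H is
# rank-deficient (e.g., an input channel the calibration data never activates)
# or contains non-finite values, Cholesky fails with the error above.
torch.manual_seed(0)
X = torch.zeros(8, 4)
X[:, 1:] = torch.randn(8, 3)   # channel 0 is always zero -> H[0, 0] == 0
H = X.T @ X                    # positive *semi*-definite, not positive-definite

try:
    torch.linalg.cholesky(H)
except torch.linalg.LinAlgError as e:
    print("cholesky failed:", e)

# GPTQ-style quantizers add a damping term proportional to mean(diag(H)) to the
# diagonal before factorizing; a larger damping (higher damp_percent) or
# calibration data that activates every channel makes H strictly
# positive-definite, and the factorization succeeds.
damp = 0.1 * torch.mean(torch.diag(H))
idx = torch.arange(H.shape[0])
H[idx, idx] += damp
torch.linalg.cholesky(H)
print("cholesky succeeded after damping")
```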
Your hardware and system info
Write your system info (CUDA version, OS, GPU model, torch version, etc.) here
Driver Version: 550.54.14 CUDA Version: 12.4
GPU: H100
torch==2.4.0
python==3.9
transformers==4.42.4
auto_gptq==0.7.1
optimum==1.21.3
Additional context
Add any other context about the problem here