TensorRT-LLM UnboundLocalError: local variable 'groupwise_qweight_safetensors' referenced before assignment

Open Minami-su opened this issue 1 year ago • 1 comments

python build.py --hf_model_dir Qwen-7B-Chat \
                --quant_ckpt_path ./qwen_7b_4bit_gs128_awq.pt \
                --dtype float16 \
                --remove_input_padding \
                --use_gpt_attention_plugin float16 \
                --enable_context_fmha \
                --use_gemm_plugin float16 \
                --use_weight_only \
                --weight_only_precision int4_awq \
                --per_group \
                --world_size 1 \
                --tp_size 1 \
                --output_dir ./tmp/Qwen/7B/trt_engines/int4-awq/1-gpu

[TensorRT-LLM] TensorRT-LLM version: 0.9.0.dev2024020600
[02/08/2024-14:19:14] [TRT-LLM] [I] Serially build TensorRT engines.
[02/08/2024-14:19:14] [TRT] [I] [MemUsageChange] Init CUDA: CPU +14, GPU +0, now: CPU 119, GPU 1256 (MiB)
[02/08/2024-14:19:16] [TRT] [I] [MemUsageChange] Init builder kernel library: CPU +1800, GPU +312, now: CPU 2055, GPU 1568 (MiB)
[02/08/2024-14:19:16] [TRT-LLM] [W] Invalid timing cache, using freshly created one
[02/08/2024-14:19:16] [TRT-LLM] [I] Loading weights from groupwise AWQ Qwen safetensors...
Loading weights...: 100%|███████████████████████████████████████████████████████████████| 32/32 [00:45<00:00,  1.43s/it]
[02/08/2024-14:20:07] [TRT-LLM] [I] Weights loaded. Total time: 00:00:51
Traceback (most recent call last):
  File "/app/tensorrt_llm/examples/qwen/build.py", line 705, in <module>
    build(0, args)
  File "/app/tensorrt_llm/examples/qwen/build.py", line 675, in build
    engine = build_rank_engine(builder, builder_config, engine_name,
  File "/app/tensorrt_llm/examples/qwen/build.py", line 498, in build_rank_engine
    load_from_awq_qwen(tensorrt_llm_qwen=tensorrt_llm_qwen,
  File "/app/tensorrt_llm/examples/qwen/weight.py", line 1035, in load_from_awq_qwen
    del groupwise_qweight_safetensors
UnboundLocalError: local variable 'groupwise_qweight_safetensors' referenced before assignment

Feb 08 '24 14:02 Minami-su

It seems there's a bug here (I assume you're using the main branch).

A quick workaround (WAR) is to comment out line 1035 of weight.py, the statement del groupwise_qweight_safetensors. The root cause is that you're using a .pt format checkpoint, so groupwise_qweight_safetensors is never assigned before the del statement tries to delete it.

Feb 08 '24 14:02 nv-guomingz
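
The failure mode described in the reply above can be reproduced in isolation. The following is only a minimal sketch, not the actual code of load_from_awq_qwen: the function names, the file-extension check, and the placeholder dictionary are all hypothetical and exist purely to illustrate why a del on a name that was bound in only one branch raises UnboundLocalError, plus one possible way to guard against it.

```python
def load_weights(ckpt_path: str) -> str:
    """Hypothetical stand-in for a loader with two checkpoint formats."""
    if ckpt_path.endswith(".safetensors"):
        # The name is bound only on this branch (assumed structure).
        groupwise_qweight_safetensors = {"placeholder": 0}
        source = "safetensors"
    else:
        source = "pt"
    # On the .pt path the name was never bound, so this raises
    # UnboundLocalError.
    del groupwise_qweight_safetensors
    return source


def load_weights_guarded(ckpt_path: str) -> str:
    """Same sketch, with the name bound up front so del is always valid."""
    groupwise_qweight_safetensors = None
    if ckpt_path.endswith(".safetensors"):
        groupwise_qweight_safetensors = {"placeholder": 0}
        source = "safetensors"
    else:
        source = "pt"
    del groupwise_qweight_safetensors
    return source


if __name__ == "__main__":
    try:
        load_weights("qwen_7b_4bit_gs128_awq.pt")
    except UnboundLocalError as exc:
        print("reproduced:", exc)
    print(load_weights_guarded("qwen_7b_4bit_gs128_awq.pt"))
```

Commenting out the del, as suggested above, avoids the error just as well. Binding the name to None first is simply one alternative that keeps the cleanup in place for the safetensors path.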
