trt-llm-rag-windows

AttributeError: 'WeightOnlyGroupwiseQuantLinear' object has no attribute 'prequant_scaling_factor'

Open drewzeee opened this issue 1 year ago • 1 comments

Receiving the above error when attempting to build the TRT engine.

Using a 3090 with driver 546.33, CUDA 12.3 and tensorrt_llm-0.7.1

Traceback (most recent call last):
  File "Z:\oracle\model\TensorRT-LLM\examples\llama\build.py", line 983, in <module>
    build(0, args)
  File "Z:\oracle\model\TensorRT-LLM\examples\llama\build.py", line 927, in build
    engine = build_rank_engine(builder, builder_config, engine_name,
  File "Z:\oracle\model\TensorRT-LLM\examples\llama\build.py", line 727, in build_rank_engine
    load_from_awq_llama(tensorrt_llm_llama=tensorrt_llm_llama,
  File "C:\Users\andrew\anaconda3\envs\test\lib\site-packages\tensorrt_llm\models\llama\weight.py", line 1564, in load_from_awq_llama
    process_and_assign_qkv_weight(prefix + awq_key_list[3],
  File "C:\Users\andrew\anaconda3\envs\test\lib\site-packages\tensorrt_llm\models\llama\weight.py", line 1511, in process_and_assign_qkv_weight
    mOp.prequant_scaling_factor.value = qkv_pre_quant_scale.to(
  File "C:\Users\andrew\anaconda3\envs\test\lib\site-packages\tensorrt_llm\module.py", line 51, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'WeightOnlyGroupwiseQuantLinear' object has no attribute 'prequant_scaling_factor'

drewzeee, Jan 12 '24 20:01

Hello, can you please share the exact build.py command, including all arguments, that you used to generate the engine?

kedarpotdar-nv, Feb 13 '24 23:02

We just released an updated version, 0.3. Please use that branch and follow the README at https://github.com/NVIDIA/ChatRTX/blob/release/0.3/README.md to set up the application.

anujj, May 23 '24 09:05