nv-guomingz

27 comments by nv-guomingz

Our latest main branch no longer contains build.py under the examples/llama path. Are you using a legacy version of the code base? Please refer to the [new workflow doc](https://github.com/NVIDIA/TensorRT-LLM/blob/main/docs/source/new_workflow.md) for details on our latest code.
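For reference, the new workflow replaces the per-model build.py with a two-step convert-then-build flow; the paths and flag values below are illustrative placeholders, not taken from this thread:

```shell
# Step 1: convert the Hugging Face checkpoint into a TensorRT-LLM checkpoint
# (convert_checkpoint.py lives under examples/llama in the repo)
python examples/llama/convert_checkpoint.py \
    --model_dir ./Llama-2-7b/ \
    --output_dir ./tllm_checkpoint/ \
    --dtype float16

# Step 2: build the engine from the converted checkpoint
trtllm-build \
    --checkpoint_dir ./tllm_checkpoint/ \
    --output_dir ./engine_out/
```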

Please try the main branch if possible, since our upcoming release will also use the new build workflow.

Could you please run the ls command under your ./Llama-2-7b/ path?
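For example (the file list in the comments below is the typical layout of a Hugging Face Llama-2-7b download, shown only as an illustration of what a healthy checkpoint directory looks like):

```shell
ls ./Llama-2-7b/
# a valid Hugging Face checkpoint directory typically contains, e.g.:
#   config.json  generation_config.json
#   tokenizer.model  tokenizer_config.json
#   model weight shards: *.safetensors (or pytorch_model-*.bin)
```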

> g that version, but now when I try to use multiple GPUs, I encounter this specific issue with the same m

May I know the full command that...

Just want to double-confirm: you're using 2 x NVIDIA Tesla V100 (16 GB vRAM) to run Llama with tp_size 8? The tensor-parallel size must match the number of GPU ranks you launch, so tp_size 8 cannot run on 2 GPUs.
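For context, a sketch of a setup where the tensor-parallel degree matches the available GPU count (flag names are from the main-branch convert script; paths are placeholders):

```shell
# convert with tp_size equal to the number of GPUs actually available
# (2 x V100 in this case, so tp_size 2 rather than 8)
python examples/llama/convert_checkpoint.py \
    --model_dir ./Llama-2-7b/ \
    --output_dir ./tllm_checkpoint_tp2/ \
    --dtype float16 \
    --tp_size 2
```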

As the error message says: `Assertion failed: Unsupported data type`. Pre-SM80 GPUs (anything older than Ampere) do not support bfloat16, so this is expected behavior rather than a bug.
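On pre-SM80 GPUs such as the V100 (SM 70), a sketch of the usual workaround is to convert and build with float16 instead of bfloat16 (paths below are placeholders):

```shell
# request float16 weights at conversion time;
# bfloat16 requires SM 80 (Ampere) or newer
python examples/llama/convert_checkpoint.py \
    --model_dir ./Llama-2-7b/ \
    --output_dir ./tllm_checkpoint/ \
    --dtype float16
```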

Another workaround is to disable the GPT attention plugin via --gpt_attention_plugin=disable.
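A sketch of that option on the trtllm-build command line (other flags omitted; directory names are placeholders):

```shell
trtllm-build \
    --checkpoint_dir ./tllm_checkpoint/ \
    --gpt_attention_plugin=disable \
    --output_dir ./engine_out/
```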

This option applies when building the engine via the trtllm-build interface; if it is not supported by the NeMo script, we don't have any recommendation for enabling model conversion...

It seems there's a bug here (I assume you're using the main branch). A quick workaround is to comment out line 1035, where `del groupwise_qweight_safetensors` is called; the root cause is that you're using a pt format file...