TensorRT-LLM
When will Qwen1.5 be supported?
python build.py --hf_model_dir /app/model/Qwen1.5-14B-Chat \
    --dtype float16 \
    --remove_input_padding \
    --use_gemm_plugin float16 \
    --use_gpt_attention_plugin float16 \
    --use_inflight_batching \
    --max_batch_size 2 \
    --max_input_len 2048 \
    --max_output_len 2048 \
    --output_dir /app/model/trt_engines/fp16/1-gpu
Traceback (most recent call last):
File "/app/tensorrt_llm/examples/qwen/build.py", line 609, in <module>
AttributeError: 'Qwen2Config' object has no attribute 'layer_norm_epsilon'
+1
+1, waiting for this.
@mogoxx I think the crashing stack
File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/configuration_auto.py", line 1039, in from_pretrained
config_class = CONFIG_MAPPING[config_dict["model_type"]]
File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/configuration_auto.py", line 734, in getitem
raise KeyError(key)
is from the transformers lib. Could you try to find out whether the latest transformers lib supports Qwen2?
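A quick way to check whether the installed transformers build knows the qwen2 model type (a minimal sketch; the KeyError above means CONFIG_MAPPING has no "qwen2" entry, which as far as I know was added around transformers 4.37):

from transformers import AutoConfig

# If this raises KeyError: 'qwen2', the installed transformers
# predates Qwen2 support and needs an upgrade.
cfg = AutoConfig.from_pretrained("Qwen/Qwen1.5-14B-Chat")
print(type(cfg).__name__)  # expect: Qwen2Config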
@litaotju The stack trace ends in transformers but begins in convert_checkpoint.py, so the problem might be in either place. In my case, transformers==4.39.1 and TensorRT-LLM version 0.9.0.dev2024031900 give the same error:
File "/workspace/TensorRT-LLM/examples/qwen/convert_checkpoint.py", line 1421, in <module>
main()
File "/workspace/TensorRT-LLM/examples/qwen/convert_checkpoint.py", line 1216, in main
args.rms_norm_eps = hf_config.layer_norm_epsilon
File "/usr/local/lib/python3.10/dist-packages/transformers/configuration_utils.py", line 263, in __getattribute__
return super().__getattribute__(key)
AttributeError: 'Qwen2Config' object has no attribute 'layer_norm_epsilon'
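For reference, a minimal workaround sketch (not an official fix): Qwen2Config names this value rms_norm_eps rather than layer_norm_epsilon, so the convert script could fall back between the two attribute names. The 1e-6 default below is my assumption:

from transformers import AutoConfig

hf_config = AutoConfig.from_pretrained("/app/model/Qwen1.5-14B-Chat")

# Qwen2Config exposes `rms_norm_eps`; older Qwen configs used
# `layer_norm_epsilon`. Try both before falling back to a default.
rms_norm_eps = getattr(hf_config, "rms_norm_eps",
                       getattr(hf_config, "layer_norm_epsilon", 1e-6))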
+1
https://github.com/Tlntin/Qwen-TensorRT-LLM It seems to implement Qwen2.
For TRTLLM-0.9.0, you can refer to https://github.com/Franc-Z/QWen1.5_TensorRT-LLM
@Franc-Z your repo is not publicly accessible
@bao21987 the repo was public yesterday...
@bao21987 Confirmed, you can access it now.
Can you give me access to it? Thanks.
For TRTLLM-0.9.0, you can refer to https://github.com/Franc-Z/QWen1.5_TensorRT-LLM
I also need access.
It's open again now, @pfk-beta @ArlanCooper @deutschthomas
https://github.com/Tlntin/Qwen-TensorRT-LLM It seems to implement Qwen2.
From the document, it actually seems to be Qwen1.5; ref https://github.com/NVIDIA/TensorRT-LLM/blob/main/examples/qwen/README.md
TensorRT-LLM 0.10.0 supports Qwen2, and I have tested qwen2-7b with examples/qwen (convert checkpoint and build engine).
ref https://nvidia.github.io/TensorRT-LLM/release-notes.html#id4
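For anyone landing here later, a sketch of that examples/qwen workflow on 0.10.0 (the paths and fp16 settings are placeholders of mine; see the README linked above for the exact flags):

python convert_checkpoint.py --model_dir /app/model/Qwen1.5-14B-Chat \
    --output_dir ./qwen_ckpt_fp16 \
    --dtype float16
trtllm-build --checkpoint_dir ./qwen_ckpt_fp16 \
    --output_dir ./qwen_engine_fp16 \
    --gemm_plugin float16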
Closing this ticket since Qwen1.5 is now supported.