TensorRT-LLM
When will Qwen1.5 be supported?
python build.py --hf_model_dir /app/model/Qwen1.5-14B-Chat \
    --dtype float16 \
    --remove_input_padding \
    --use_gemm_plugin float16 \
    --use_gpt_attention_plugin float16 \
    --use_inflight_batching \
    --max_batch_size 2 \
    --max_input_len 2048 \
    --max_output_len 2048 \
    --output_dir /app/model/trt_engines/fp16/1-gpu
Traceback (most recent call last):
File "/app/tensorrt_llm/examples/qwen/build.py", line 609, in <module>
AttributeError: 'Qwen2Config' object has no attribute 'layer_norm_epsilon'
+1
+1, waiting for this.
@mogoxx I think the crashing stack
File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/configuration_auto.py", line 1039, in from_pretrained
config_class = CONFIG_MAPPING[config_dict["model_type"]]
File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/configuration_auto.py", line 734, in getitem
raise KeyError(key)
is from the transformers lib. Could you try to find out whether the latest transformers lib supports Qwen2?
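A quick way to check whether the installed transformers build knows the qwen2 model type (a minimal sketch; the KeyError above means CONFIG_MAPPING has no "qwen2" entry, which as far as I know was added around transformers 4.37):

from transformers import AutoConfig

# If this raises KeyError: 'qwen2', the installed transformers
# predates Qwen2 support and needs an upgrade.
cfg = AutoConfig.from_pretrained("Qwen/Qwen1.5-14B-Chat")
print(type(cfg).__name__)  # expect: Qwen2Config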
@litaotju The stack trace ends in transformers but begins in convert_checkpoint.py, so the problem might be in either place. In my case, transformers==4.39.1 and TensorRT-LLM version 0.9.0.dev2024031900 give the same error:
File "/workspace/TensorRT-LLM/examples/qwen/convert_checkpoint.py", line 1421, in <module>
main()
File "/workspace/TensorRT-LLM/examples/qwen/convert_checkpoint.py", line 1216, in main
args.rms_norm_eps = hf_config.layer_norm_epsilon
File "/usr/local/lib/python3.10/dist-packages/transformers/configuration_utils.py", line 263, in __getattribute__
return super().__getattribute__(key)
AttributeError: 'Qwen2Config' object has no attribute 'layer_norm_epsilon'
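For reference, a minimal workaround sketch (not an official fix): Qwen2Config names this value rms_norm_eps rather than layer_norm_epsilon, so the convert script could fall back between the two attribute names. The 1e-6 default below is my assumption:

from transformers import AutoConfig

hf_config = AutoConfig.from_pretrained("/app/model/Qwen1.5-14B-Chat")

# Qwen2Config exposes `rms_norm_eps`; older Qwen configs used
# `layer_norm_epsilon`. Try both before falling back to a default.
rms_norm_eps = getattr(hf_config, "rms_norm_eps",
                       getattr(hf_config, "layer_norm_epsilon", 1e-6))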
+1
https://github.com/Tlntin/Qwen-TensorRT-LLM It seems to implement Qwen2.
For TRTLLM-0.9.0, you can refer to https://github.com/Franc-Z/QWen1.5_TensorRT-LLM
@Franc-Z your repo is not publicly accessible
@bao21987 the repo was public yesterday...
@bao21987 Confirmed, you can access it now.
Can you give me access to it? Thanks.
For TRTLLM-0.9.0, you can refer to https://github.com/Franc-Z/QWen1.5_TensorRT-LLM
I also need access.
It's open again now, @pfk-beta @ArlanCooper @deutschthomas
https://github.com/Tlntin/Qwen-TensorRT-LLM It seems to implement Qwen2.
From the document, it actually seems to be Qwen1.5; ref https://github.com/NVIDIA/TensorRT-LLM/blob/main/examples/qwen/README.md
TensorRT-LLM 0.10.0 supports Qwen2, and I have tested qwen2-7b with examples/qwen (convert checkpoint and build engine).
ref https://nvidia.github.io/TensorRT-LLM/release-notes.html#id4
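For anyone landing here later, a sketch of that examples/qwen workflow on 0.10.0 (the paths and fp16 settings are placeholders of mine; see the README linked above for the exact flags):

python convert_checkpoint.py --model_dir /app/model/Qwen1.5-14B-Chat \
    --output_dir ./qwen_ckpt_fp16 \
    --dtype float16
trtllm-build --checkpoint_dir ./qwen_ckpt_fp16 \
    --output_dir ./qwen_engine_fp16 \
    --gemm_plugin float16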
Closing this ticket since Qwen1.5 is now supported.