TensorRT-LLM

How to deploy this model (Qwen/Qwen2-1.5B-Instruct-AWQ · Hugging Face) on TensorRT-LLM?

Open shaoyanguo opened this issue 1 year ago • 0 comments

I would like to deploy Qwen/Qwen2-1.5B-Instruct-AWQ on TensorRT-LLM, but it seems TensorRT-LLM only supports fp16 Transformer models. I referred to https://github.com/NVIDIA/TensorRT-LLM/blob/main/tensorrt_llm/models/llama/convert.py#L1714, but the inference results are wrong. Can you help me? Thank you!
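
For reference, a minimal sketch of the INT4-AWQ path documented in the TensorRT-LLM quantization examples, which quantizes the original fp16 Qwen2-1.5B-Instruct checkpoint with `examples/quantization/quantize.py` rather than converting the pre-quantized AutoAWQ weights from Hugging Face. The directory paths and calibration size below are placeholders, not values confirmed in this issue:

```bash
# Assumption: start from the original fp16 checkpoint (Qwen/Qwen2-1.5B-Instruct),
# not the pre-quantized AutoAWQ checkpoint, and let TensorRT-LLM apply AWQ itself.
python examples/quantization/quantize.py \
    --model_dir ./Qwen2-1.5B-Instruct \
    --dtype float16 \
    --qformat int4_awq \
    --awq_block_size 128 \
    --calib_size 32 \
    --output_dir ./qwen2_checkpoint_int4_awq

# Build a TensorRT engine from the quantized TensorRT-LLM checkpoint.
trtllm-build \
    --checkpoint_dir ./qwen2_checkpoint_int4_awq \
    --output_dir ./qwen2_engine_int4_awq \
    --gemm_plugin float16

# Smoke-test the engine with the example runner.
python examples/run.py \
    --engine_dir ./qwen2_engine_int4_awq \
    --tokenizer_dir ./Qwen2-1.5B-Instruct \
    --input_text "Hello" \
    --max_output_len 64
```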

shaoyanguo · Aug 19 '24 09:08