shaoyanguo
TensorRT-LLM only supports fp16 transformer models. Thank you! I followed https://github.com/NVIDIA/TensorRT-LLM/blob/main/tensorrt_llm/models/llama/convert.py#L1714, but the inference result is wrong. Can you help me?
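One common cause of wrong outputs after an fp16 conversion (not confirmed to be the issue in this thread, just a hypothesis worth checking) is that fp16 has a much narrower dynamic range than fp32: values above roughly 65504 overflow to infinity when the weights are cast. A minimal NumPy sketch illustrating the effect:

```python
import numpy as np

# fp32 weights; the last value exceeds the fp16 maximum (~65504)
fp32_vals = np.array([1.0, 3.14159, 70000.0], dtype=np.float32)

# Casting to fp16 silently overflows out-of-range values to inf,
# which then propagates NaN/inf through inference and corrupts results.
fp16_vals = fp32_vals.astype(np.float16)

print(fp16_vals)
print("overflowed:", np.isinf(fp16_vals).any())
```

If any converted weights or intermediate activations hit this limit, the engine will produce garbage; checking the converted checkpoint for inf/NaN values is a quick first diagnostic.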