opencompass
[Feature] Add support for TensorRT-LLM inference engine
Describe the feature
Hi guys,
TensorRT-LLM was released last week. It is maintained by NVIDIA and offers high inference performance. Link: https://github.com/NVIDIA/TensorRT-LLM
Will it be implemented via API calls, or integrated directly into the inference pipeline, like the existing HuggingFace inference method? Which approach is better?
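To make the two options concrete, here is a minimal sketch of what they could look like behind a common interface. Everything here is hypothetical (the class names, the endpoint, the `engine_path` argument, and the stubbed `generate` bodies are placeholders, not real TensorRT-LLM or OpenCompass APIs); a real integration would fill in the HTTP call or the in-process TensorRT-LLM runtime respectively.

```python
from abc import ABC, abstractmethod
from typing import List


class InferenceBackend(ABC):
    """Common interface both integration styles would need to satisfy."""

    @abstractmethod
    def generate(self, prompts: List[str]) -> List[str]:
        ...


class APIBackend(InferenceBackend):
    """Option 1: call a standalone TensorRT-LLM serving endpoint over HTTP.

    The endpoint URL and response handling here are placeholders.
    """

    def __init__(self, endpoint: str):
        self.endpoint = endpoint

    def generate(self, prompts: List[str]) -> List[str]:
        # A real implementation would POST each prompt to self.endpoint
        # (e.g. a server fronting TensorRT-LLM) and parse the completions.
        return [f"[api:{self.endpoint}] {p}" for p in prompts]


class InProcessBackend(InferenceBackend):
    """Option 2: run the engine in-process, mirroring how the HuggingFace
    path loads models directly. `engine_path` is illustrative only.
    """

    def __init__(self, engine_path: str):
        self.engine_path = engine_path

    def generate(self, prompts: List[str]) -> List[str]:
        # A real implementation would invoke the TensorRT-LLM runtime here.
        return [f"[local:{self.engine_path}] {p}" for p in prompts]


def run_eval(backend: InferenceBackend, prompts: List[str]) -> List[str]:
    # Evaluation code stays the same regardless of which backend is chosen.
    return backend.generate(prompts)
```

The trade-off sketched here: the API route keeps the heavy CUDA/TensorRT dependencies out of the evaluation process, while the in-process route avoids a serving layer and matches the existing HuggingFace code path more closely.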
Thanks
Will you implement it?
- [ ] I would like to implement this feature and create a PR!