Dev Goel

Results 12 comments of Dev Goel

you can use https://github.com/npuichigo/openai_trtllm , it is a wrapper to create openai compatible api for tensorRT-LLM

Batch prompts inference How to use this in openai compatible deployment