TensorRT-LLM icon indicating copy to clipboard operation
TensorRT-LLM copied to clipboard

Provide an interface similar to OpenAI API

Open Pevernow opened this issue 2 years ago • 11 comments

Could you please provide a simple interface similar to OpenAI API?

Pevernow avatar Nov 09 '23 11:11 Pevernow

@Pevernow Can you elaborate more about your request? Thanks June

juney-nvidia avatar Nov 09 '23 14:11 juney-nvidia

image The most basic thing is just a simple chat/completions function similar to OpenAI.

The purpose is to facilitate access to existing applications that use the chatgpt API.

Currently, many open-source LLM projects have implemented this feature, such as the famous oobabooga/text generation webui

Pevernow avatar Nov 10 '23 13:11 Pevernow

Users want something like this https://github.com/lm-sys/FastChat/blob/main/docs/openai_api.md, so they can switch their apps from OpenAI models to TRT-LLM models easily without code change.

merrymercy avatar Nov 11 '23 09:11 merrymercy

@juney-nvidia Please take a look here, thank you

Pevernow avatar Nov 12 '23 04:11 Pevernow

Right now Python API still have a lot of issues to be fixed, I encapsulated one OpenAI API, but met #283, so you still need use C++ runtime, which means you need Triton. Spent weeks on TRT-LLM, it difficult to develop on python runtime.

@juney-nvidia What's the position of TRT-LLM's Python runime? I mean, python is easier than C++, and batch manager doesn't open source right now. Most developors may not use Triton since they won't meet that large commercial demand.

gesanqiu avatar Nov 15 '23 07:11 gesanqiu

mark

chrjxj avatar Nov 27 '23 09:11 chrjxj

Sorry for replying late due to being trapped by other things.

Users want something like this https://github.com/lm-sys/FastChat/blob/main/docs/openai_api.md, so they can switch their apps from OpenAI models to TRT-LLM models easily without code change.

@Pevernow @merrymercy well received, will discuss with prod about this. @ncomly-nvidia for vis.

What's the position of TRT-LLM's Python runime? I mean, python is easier than C++, and batch manager doesn't open source right now. Most developors may not use Triton since they won't meet that large commercial demand.

@gesanqiu we have already released the Python binding of C++ runtime, including batch manager, does this fulfill your requirement here?

juney-nvidia avatar Dec 11 '23 14:12 juney-nvidia

vote up, monitoring

binarycrayon avatar Feb 14 '24 18:02 binarycrayon

More info about openai chat completion API spec here https://github.com/openai/openai-openapi/tree/master

binarycrayon avatar Feb 14 '24 18:02 binarycrayon

i need it too

whk6688 avatar Mar 12 '24 03:03 whk6688

+1 for OpenAI API support

LMarino1 avatar May 21 '24 20:05 LMarino1

+1 for OpenAI API support

Mary-Sam avatar Jun 08 '24 16:06 Mary-Sam

+1 for OpenAI API support, it's been 9 months since the pr requested :(

mynameiskeen avatar Jul 10 '24 02:07 mynameiskeen

+1 OpenApi support

palindsay avatar Jul 20 '24 05:07 palindsay

+1

nstl-zyb avatar Jul 30 '24 08:07 nstl-zyb

+1 for openai api support。

wertyac avatar Aug 01 '24 07:08 wertyac