llama.cpp icon indicating copy to clipboard operation
llama.cpp copied to clipboard

Add an API example using server.cpp similar to OAI.

Open jwj7140 opened this issue 1 year ago • 0 comments

adding an API example that provides responses similar to OpenAI's chat completion and completion. This example is about 30% faster than existing similar examples because they are based on llama-cpp-python, which is slightly slower than llama.cpp. This example must be used with server.cpp.

Run this code, and write like this in python: openai.api_base = "http://***.***.***.***:8081" Then almost all OpenAI api code is compatible with llama.cpp.

jwj7140 avatar Jun 26 '23 17:06 jwj7140