llama.cpp
Add an API example using server.cpp similar to OAI.
This adds an API example that returns responses compatible with OpenAI's chat completion and completion endpoints. It is roughly 30% faster than existing similar examples, because those are based on llama-cpp-python, which is slightly slower than llama.cpp itself. This example must be used together with server.cpp.
Run this code, then point your Python client at it like this:
openai.api_base = "http://***.***.***.***:8081"
After that, almost all OpenAI API client code works against llama.cpp.
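As a sketch of what a client request looks like without depending on the openai package, the same endpoint can be reached with only the standard library. The host/port and the `/v1/chat/completions` path below are assumptions mirroring OpenAI's API shape; the example server's exact routes may differ.

```python
import json
import urllib.request

# Hypothetical address of the llama.cpp API server; replace with your own.
API_BASE = "http://localhost:8081"

def build_chat_request(prompt: str) -> urllib.request.Request:
    # Same JSON shape that OpenAI's chat-completion endpoint expects.
    payload = {
        "model": "llama",  # the local server serves whatever model it loaded
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_BASE + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

# Build (but do not send) a request; urllib.request.urlopen(req) would send it.
req = build_chat_request("Hello!")
```

Because the request body and path match OpenAI's wire format, existing OpenAI client code usually only needs its base URL changed.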