llamafile
Fork `openai` Python package to support llama.cpp specific features
To apply a grammar to a chat completion, it looks like the llamafile server expects a `grammar` argument: https://github.com/Mozilla-Ocho/llamafile/blob/main/llama.cpp/server/server.cpp#L2551
```cpp
if (body.count("grammar") != 0) {
    llama_params["grammar"] = json_value(body, "grammar", json::object());
}
```
However, `grammar` is not a supported argument in the OpenAI API, i.e., I can't do something like this:
```python
oai_client.chat.completions.create(
    model=model,
    messages=messages,
    stream=stream,
    temperature=0.1,
    grammar=grammar,
)
```
Would this ever be a supported feature in the future?
If you're using the Python client library that's published by OpenAI, then it's not going to support features OpenAI doesn't have. However, the llamafile server does support grammar. For example, here's how to pass a grammar through the OpenAI-compatible API using curl:
```sh
curl -s http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
  "model": "gpt-3.5-turbo",
  "grammar": "root ::= [a-z]+ (\" \" [a-z]+)+",
  "messages": [
    {
      "role": "system",
      "content": "You are a poetic assistant, skilled in explaining complex programming concepts with creative flair."
    },
    {
      "role": "user",
      "content": "Compose a poem that explains the concept of recursion in programming."
    }
  ]
}' | python3 -c '
import json
import sys
json.dump(json.load(sys.stdin), sys.stdout, indent=2)
print()
'
```
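The same request can be made from Python without the OpenAI client at all, using only the standard library. This is a minimal sketch: the endpoint URL and the grammar string are taken from the curl example above, and `chat_completion` is a hypothetical helper name, not part of any library.

```python
import json
import urllib.request

def build_payload(messages, grammar, model="gpt-3.5-turbo", temperature=0.1):
    """Assemble the JSON request body, including the llama.cpp-specific
    "grammar" field that the llamafile server reads from the raw body."""
    return {
        "model": model,
        "temperature": temperature,
        "grammar": grammar,  # passed through to llama.cpp by the server
        "messages": messages,
    }

def chat_completion(payload, url="http://localhost:8080/v1/chat/completions"):
    """POST the payload to a local llamafile server and return the parsed
    JSON response (requires a server running on localhost:8080)."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

payload = build_payload(
    [{"role": "user", "content": "Compose a poem that explains recursion."}],
    grammar='root ::= [a-z]+ (" " [a-z]+)+',
)
```

Because the non-standard field is just another key in the JSON body, anything that can send raw HTTP can use it; the Python client is only in the way because its method signature rejects unknown keyword arguments.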
> then it's not going to support features OpenAI doesn't have

I assume you mean Llamafile?
No, I meant OpenAI. As far as I know, OpenAI hasn't devoted any engineering resources to supporting llamafile- and llama.cpp-specific features in their Python client library.
You can run `pip install git+ssh://git@github.com/dgarage/openai-python@202503-llama-cpp-grammar` to get grammar support, but this fork doesn't play well with endpoints that aren't llama.cpp-compatible at the moment.
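Depending on your client version, a fork may not be needed: v1.x of the official `openai` package accepts an `extra_body` keyword on `create()` calls that merges extra keys into the JSON request body. A hedged sketch, assuming that pass-through behaves as described and that a llamafile server is listening on localhost:8080 (`grammar_chat` is a hypothetical helper, not a library function):

```python
def grammar_chat(prompt, grammar, base_url="http://localhost:8080/v1"):
    """Request a grammar-constrained completion from a llamafile server.

    Assumes openai>=1.0: keys in `extra_body` are merged into the JSON
    request body, so the server sees "grammar" just as in the curl example.
    """
    from openai import OpenAI  # imported here so the sketch stays optional

    client = OpenAI(base_url=base_url, api_key="sk-no-key-required")
    completion = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.1,
        extra_body={"grammar": grammar},  # non-standard field, passed through
    )
    return completion.choices[0].message.content
```

If `extra_body` works for your setup, it avoids pinning a fork while keeping the rest of the official client's behavior intact.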