
Fork `openai` Python package to support llama.cpp specific features

Open tybalex opened this issue 1 year ago • 3 comments

To apply grammar to chat completion, it looks like the llamafile server is expecting the argument grammar: https://github.com/Mozilla-Ocho/llamafile/blob/main/llama.cpp/server/server.cpp#L2551

if (body.count("grammar") != 0) {
    llama_params["grammar"] = json_value(body, "grammar", json::object());
}

However, grammar is not a supported argument in the OpenAI API, i.e., I can't do something like this:

oai_client.chat.completions.create(
    model=model,
    messages=messages,
    stream=stream,
    temperature=0.1,
    grammar=grammar,
)

Would this ever be a supported feature in the future?

tybalex avatar Jan 17 '24 00:01 tybalex

If you're using the Python client library published by OpenAI, then it's not going to support features OpenAI doesn't have. However, the llamafile server does support grammar. For example, here's how to pass a grammar through the OpenAI-compatible API with curl.

curl -s http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
  "model": "gpt-3.5-turbo",
  "grammar": "root ::= [a-z]+ (\" \" [a-z]+)+",
  "messages": [
    {
      "role": "system",
      "content": "You are a poetic assistant, skilled in explaining complex programming concepts with creative flair."
    },
    {
      "role": "user",
      "content": "Compose a poem that explains the concept of recursion in programming."
    }
  ]
}' | python3 -c '
import json
import sys
json.dump(json.load(sys.stdin), sys.stdout, indent=2)
print()
'
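The same request can be sketched in Python using only the standard library, with no OpenAI client involved (the URL and grammar string mirror the curl example above; adjust the host and port to wherever your llamafile server is listening):

```python
import json
import urllib.request

# Default llamafile server endpoint (assumption: server runs on localhost:8080).
LLAMAFILE_URL = "http://localhost:8080/v1/chat/completions"


def build_payload(grammar: str, messages: list) -> dict:
    """Assemble the chat-completion body, including the llama.cpp-specific
    "grammar" field that the llamafile server reads (see server.cpp above)."""
    return {
        "model": "gpt-3.5-turbo",
        "grammar": grammar,
        "messages": messages,
    }


def chat(grammar: str, messages: list) -> dict:
    """POST the payload to a running llamafile server and return parsed JSON."""
    req = urllib.request.Request(
        LLAMAFILE_URL,
        data=json.dumps(build_payload(grammar, messages)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

With a server running, `chat('root ::= [a-z]+ (" " [a-z]+)+', [{"role": "user", "content": "Compose a poem about recursion."}])` returns the same JSON the curl pipeline prints.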

jart avatar Jan 20 '24 07:01 jart

then it's not going to support features OpenAI doesn't have

I assume you mean Llamafile?

flatsiedatsie avatar Jan 24 '24 10:01 flatsiedatsie

No, I meant OpenAI. As far as I know, OpenAI hasn't devoted any engineering resources to adding support, in their Python client library, for features that are specific to llamafile and llama.cpp.

jart avatar Jan 24 '24 19:01 jart

You can do pip install git+ssh://[email protected]/dgarage/openai-python@202503-llama-cpp-grammar to get grammar support, but this doesn't play well with non-llama.cpp-compatible endpoints at the moment.
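Alternatively, recent versions of the stock openai client (1.x) accept an extra_body argument on create calls that merges arbitrary fields into the request JSON, which may avoid forking entirely. A minimal sketch, assuming a llamafile server on localhost:8080 (the base_url and api_key values are placeholders):

```python
# Sketch: the openai>=1.0 client merges extra_body into the request body,
# so the llamafile server would see "grammar" even though the client's
# typed API doesn't know about it.
request_kwargs = {
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Compose a poem about recursion."}],
    "extra_body": {"grammar": 'root ::= [a-z]+ (" " [a-z]+)+'},
}

# With a live server and the openai package installed:
#   from openai import OpenAI
#   client = OpenAI(base_url="http://localhost:8080/v1",
#                   api_key="sk-no-key-required")
#   resp = client.chat.completions.create(**request_kwargs)
```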

kallewoof avatar Aug 12 '25 07:08 kallewoof