Explore converting existing llama.cpp GBNF 'structured output' config to use the SOTA llguidance format (which now also powers OpenAI API JSON Schema, custom tools/etc)
Interesting aside: llama.cpp now has built-in support for the Rust-based version of guidance (llguidance), which is also what the OpenAI API uses for its JSON Schema structured outputs:
- https://github.com/guidance-ai/llguidance
Low-level Guidance (llguidance)
Integration was merged into llama.cpp on 2025-02-01 (b4613):
- https://github.com/ggml-org/llama.cpp/pull/10224
- https://github.com/ggml-org/llama.cpp/blob/master/docs/llguidance.md
LLGuidance supports JSON Schemas and arbitrary context-free grammars (CFGs) written in a variant of Lark syntax. It is very fast and has excellent JSON Schema coverage but requires the Rust compiler, which complicates the llama.cpp build process.
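Per the llama.cpp docs linked above, support is opt-in at build time via a CMake flag (a sketch based on `docs/llguidance.md` at the time of writing; verify the flag name against your checkout):

```
# Build llama.cpp with LLGuidance support (requires the Rust toolchain)
cmake -B build -DLLAMA_LLGUIDANCE=ON
cmake --build build --config Release
```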
There are no new command-line arguments or modifications to `common_params`. When enabled, grammars starting with `%llguidance` are passed to LLGuidance instead of the current llama.cpp grammars. Additionally, JSON Schema requests (e.g., using the `-j` argument in `llama-cli`) are also passed to LLGuidance. For your existing GBNF grammars, you can use the `gbnf_to_lark.py` script to convert them to LLGuidance's Lark-like format.
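As an illustration, a `llama-server` completion request might carry such a grammar. This is a hedged sketch: the `%llguidance` prefix routing is from the quote above, `grammar` is the standard llama.cpp completion parameter, and the exact directive spelling (`%llguidance {}`) is taken from the llguidance docs, so double-check against your build:

```python
# Sketch: a llama-server /completion payload whose grammar is routed to
# LLGuidance via the "%llguidance" prefix (assumed per the docs quoted above).
import json

# A minimal Lark-style grammar in llguidance's syntax (illustrative only).
lark_grammar = '%llguidance {}\nstart: "yes" | "no"\n'

payload = {
    "prompt": "Is the sky blue? Answer yes or no: ",
    "n_predict": 4,
    # Grammars starting with "%llguidance" go to LLGuidance instead of the
    # native GBNF engine (when llama.cpp is built with support enabled).
    "grammar": lark_grammar,
}

# The payload would be POSTed as JSON to the server's completion endpoint.
print(json.dumps(payload)[:40])
```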
- https://github.com/guidance-ai/llguidance/blob/main/docs/syntax.md
LLGuidance supports a variant of the syntax used by the Python Lark parsing toolkit. We also provide a `gbnf_to_lark.py` script to convert from the GBNF format used in llama.cpp. These make it easier to get started with a new grammar and provide familiar syntax; however, neither is a drop-in replacement for Lark or GBNF.
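For a flavor of the difference, a tiny GBNF rule and a hand-written Lark-style equivalent might look like this (an illustrative sketch; the actual output of `gbnf_to_lark.py` may differ):

```
GBNF (llama.cpp):
    root ::= "yes" | "no"

Lark-like (llguidance):
    start: "yes" | "no"
```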
Originally posted by @0xdevalias in https://github.com/jehna/humanify/issues/6#issuecomment-3218524933
See also:
- https://github.com/jehna/humanify/blob/7beba2d32433e58bb77d0e1b0eda01c470fec3e2/src/plugins/local-llm-rename/gbnf.ts#L1-L73
- https://github.com/jehna/humanify/blob/7beba2d32433e58bb77d0e1b0eda01c470fec3e2/src/plugins/openai/openai-rename.ts#L62-L79
- https://platform.openai.com/docs/guides/function-calling#custom-tools
- https://platform.openai.com/docs/guides/function-calling#context-free-grammars
Context-free grammars
A context-free grammar (CFG) is a set of rules that define how to produce valid text in a given format. For custom tools, you can provide a CFG that will constrain the model's text input for a custom tool.
You can provide a custom CFG using the `grammar` parameter when configuring a custom tool. Currently, we support two CFG syntaxes when defining grammars: `lark` and `regex`.
Grammars are specified using a variation of Lark. Model sampling is constrained using LLGuidance.
We recommend using the Lark IDE to experiment with custom grammars.
- https://guidance-ai.github.io/llguidance/llg-go-brrr
LLGuidance: Making Structured Outputs Go Brrr
- https://github.com/guidance-ai/guidance-ts
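Tying this back to the OpenAI side: a custom tool constrained by a Lark CFG is declared roughly like this. This is a sketch only; the `type: "custom"` / `format: {type: "grammar", syntax: "lark", ...}` field names follow the OpenAI function-calling docs linked above and should be double-checked against the current API reference, and the tool name here is hypothetical:

```python
# Sketch: declaring an OpenAI custom tool whose output is constrained by a
# Lark CFG (field names per the OpenAI docs linked above; verify before use).
yes_no_grammar = 'start: "yes" | "no"'

tool = {
    "type": "custom",
    "name": "answer_yes_no",  # hypothetical tool name
    "description": "Answer strictly with yes or no.",
    "format": {
        "type": "grammar",
        "syntax": "lark",  # per the docs, "regex" is the other supported syntax
        "definition": yes_no_grammar,
    },
}

# This dict would go in the `tools` list of a Responses API call, e.g.
# client.responses.create(model=..., input=..., tools=[tool]).
print(tool["format"]["syntax"])
```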