YAML Grammar
The LinkedIn Engineering Team recently wrote about their experience implementing tool use with LLMs. They explain that they structure every tool call in YAML rather than JSON, because YAML requires fewer tokens:
Since the parameters to the call have to match the input schema, we ask the LLM to output them in a structured manner. Most LLMs are trained on YAML and JSON for structured output. We picked YAML because it is less verbose, and hence consumes fewer tokens than JSON.
Lower token usage seems like a strong reason to support structured YAML generation.
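As a rough illustration of the verbosity difference, here is a minimal sketch that renders the same tool call as JSON and as block-style YAML and compares character counts. The tool name, the parameters, and the `to_simple_yaml` helper are hypothetical and only cover nested dicts of plain scalars. Character count is only a proxy; real token counts depend on the tokenizer.

```python
import json

# Hypothetical tool call, invented for illustration only.
call = {
    "tool": "search_jobs",
    "parameters": {"query": "data engineer", "location": "Berlin", "limit": 5},
}


def to_simple_yaml(value, indent=0):
    """Render nested dicts of plain scalars as block-style YAML.

    Illustrative subset only: no lists, and no quoting of strings
    that a real YAML emitter would need to quote.
    """
    pad = "  " * indent
    lines = []
    for key, item in value.items():
        if isinstance(item, dict):
            lines.append(f"{pad}{key}:")
            lines.append(to_simple_yaml(item, indent + 1))
        else:
            lines.append(f"{pad}{key}: {item}")
    return "\n".join(lines)


json_text = json.dumps(call)
yaml_text = to_simple_yaml(call)

print(yaml_text)
print("JSON chars:", len(json_text))
print("YAML chars:", len(yaml_text))
```

The saving comes mostly from dropping braces, quotes, and commas, so it will vary with how deeply nested the call is.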
I found the grammar for YAML here and would love for Outlines to implement this.
Thoughts?
This should be possible with CFG-structured generation. There's a Lark grammar for YAML here, but it might need to be adapted to fit any parser restrictions the current CFG-structured generation may have (e.g. LALR(1) only).
In the meantime, we can leave this open for people to try and report any issues or necessary changes.
Duplicate of #923