
support for json (or other?) grammar?

Open kurtbuilds opened this issue 1 year ago • 2 comments

llama.cpp now supports grammars:

https://til.simonwillison.net/llms/llama-cpp-python-grammars

Is that something that will come to candle?

It sounds like the approach taken in this Python library would be straightforward:

https://github.com/1rgs/jsonformer/blob/main/jsonformer/main.py

Basically, since you know the JSON schema, you emit the structural tokens (braces, keys, commas) directly from control flow, and only sample the LLM for typed values, constraining the logits so the output stays valid.
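To make the idea concrete, here is a minimal sketch of that constrained-sampling step for a single boolean value. The toy vocabulary, `mask_logits`, and the greedy decode loop are illustrative assumptions, not jsonformer's or candle's actual API; the point is just that masking logits to grammar-compatible tokens forces a well-typed value.

```python
# Hedged sketch of schema-driven constrained sampling: structural JSON
# tokens come from control flow, and the model is only sampled for typed
# values, with logits masked to tokens that keep the value well-typed.
# Vocabulary and helpers below are illustrative, not a real tokenizer.
import math

def mask_logits(logits, allowed_ids):
    """Set every logit outside allowed_ids to -inf so argmax/softmax
    can only pick a token that keeps the output valid."""
    masked = [-math.inf] * len(logits)
    for i in allowed_ids:
        masked[i] = logits[i]
    return masked

def allowed_boolean_tokens(vocab, prefix):
    """Token ids whose text extends `prefix` toward 'true' or 'false'."""
    targets = ("true", "false")
    out = []
    for i, tok in enumerate(vocab):
        cand = prefix + tok
        if any(t.startswith(cand) for t in targets):
            out.append(i)
    return out

def decode_boolean(vocab, logits_fn):
    """Greedily decode a boolean value under the constraint.
    `logits_fn(prefix)` stands in for a forward pass of the LLM."""
    prefix = ""
    while prefix not in ("true", "false"):
        allowed = allowed_boolean_tokens(vocab, prefix)
        logits = mask_logits(logits_fn(prefix), allowed)
        best = max(range(len(logits)), key=lambda i: logits[i])
        prefix += vocab[best]
    return prefix
```

For example, with a toy vocabulary `["tr", "ue", "fal", "se", "{", "}"]`, even if the raw logits favor `"{"`, the mask leaves only tokens that can still spell `true` or `false`, so the decoded value is always a legal JSON boolean.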

I started to work on this approach in a demo codebase... I'll report back on any progress.

Curious to hear from others about how feasible the approach is.

kurtbuilds avatar Mar 27 '24 01:03 kurtbuilds

👋 I wrote an implementation of constrained sampling with candle here that might be useful as a reference. Here are a few things I found important:

  • Parsing must be incremental if you want reasonable speeds on longer sequences (this makes an FSM a good choice)
  • You can accelerate text generation by eagerly sampling the grammar and feeding the required next tokens into the LLM in one batch instead of one token at a time

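The first bullet can be sketched as follows: instead of re-validating the entire generated text after every token, keep the automaton state for the sequence and advance it only over the new token's characters. The tiny DFA below (signed integers, `-?[0-9]+`) is a hypothetical stand-in for a compiled grammar, not the linked implementation.

```python
# Hedged sketch of incremental FSM-based filtering. The per-step cost is
# O(len(token) * vocab), independent of how long the sequence already is,
# because the DFA state is carried forward instead of re-parsing.

# state -> {char: next_state}; the DFA accepts -?[0-9]+
DFA = {
    0: {"-": 1, **{d: 2 for d in "0123456789"}},
    1: {d: 2 for d in "0123456789"},
    2: {d: 2 for d in "0123456789"},
}
ACCEPTING = {2}

def advance(state, text):
    """Advance the DFA over `text`; return the new state, or None on reject."""
    for ch in text:
        state = DFA.get(state, {}).get(ch)
        if state is None:
            return None
    return state

def allowed_tokens(state, vocab):
    """Token ids that keep the sequence inside the grammar, checked by
    advancing only the cached state rather than re-parsing the prefix."""
    return [i for i, tok in enumerate(vocab) if advance(state, tok) is not None]
```

The second bullet then falls out naturally: whenever `allowed_tokens` returns exactly one id, the grammar has already decided the next token, so forced tokens can be appended without sampling and fed to the model as one batch.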
ealmloff avatar Mar 29 '24 13:03 ealmloff

@lucasavila00 It would be great if you could implement your model grammar work via BNF in Candle

andrewlimmer avatar Sep 17 '24 07:09 andrewlimmer