lmql
[Question] Support constraining via context-free grammar for DSLs
Hi @lbeurerkellner,
Do you have any plans to "natively" integrate token constraints into the lmql language, perhaps through ANTLR/Lark/EBNF grammar notation? This is a feature currently supported by guidance (https://github.com/guidance-ai/guidance?tab=readme-ov-file#context-free-grammars) and outlined in examples from other projects like outlines (https://github.com/outlines-dev/outlines?tab=readme-ov-file#using-context-free-grammars-to-guide-generation).
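For readers unfamiliar with the idea, constraining tokens with a grammar can be sketched roughly as below. This is only a toy illustration, not how lmql, guidance, or outlines implement it: the grammar, the rule names, and the helpers `expand`, `allowed_tokens`, and `is_complete` are all hypothetical. It assumes a tiny non-recursive grammar that can be enumerated in full, whereas real implementations would use an incremental parser.

```python
import itertools

# Hypothetical toy grammar: each rule maps to a list of alternatives, and each
# alternative is a sequence of rule names or literal strings (terminals).
GRAMMAR = {
    "step": [["tap"], ["toggle"]],
    "tap": [["Tap the ", "name", " button"]],
    "toggle": [["action", " the ", "name", " option"]],
    "action": [["Activate"], ["Deactivate"]],
    "name": [["settings"], ["bass"]],
}


def expand(symbol):
    """Return every string derivable from `symbol` (grammar must be non-recursive)."""
    if symbol not in GRAMMAR:
        return [symbol]  # anything that is not a rule name is a literal
    results = []
    for alternative in GRAMMAR[symbol]:
        parts = [expand(s) for s in alternative]
        results.extend("".join(combo) for combo in itertools.product(*parts))
    return results


# Enumerate the whole (finite) language once.
SENTENCES = expand("step")


def allowed_tokens(prefix, vocabulary):
    """Keep only tokens that leave `prefix + token` a prefix of some valid sentence."""
    return [
        token
        for token in vocabulary
        if any(sentence.startswith(prefix + token) for sentence in SENTENCES)
    ]


def is_complete(text):
    """True once the generated text is a full sentence of the grammar."""
    return text in SENTENCES
```

In a real decoder, the list returned by `allowed_tokens` would be used to mask the model's logits at each step, so that only grammar-conforming continuations can be sampled.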
We have some plans, but there is no concrete ETA currently. I will keep the issue to track the status.
Do you have any concrete use cases in mind?
Yes, I do. I was recently playing around with a DSL for HMI testing in the automotive domain. Using guidance, my script looks like the following:
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
import guidance
from guidance import models, system, user, assistant, gen, select, one_or_more

checkpoint = "HuggingFaceH4/zephyr-7b-beta"
lm = models.TransformersChat(checkpoint, device_map="auto", torch_dtype=torch.bfloat16)

@guidance(stateless=True)
def enter_teststep(lm):
    return lm + "Enter " + gen(stop="into", max_tokens=20) + " into the " + gen(max_tokens=3) + "."

@guidance(stateless=True)
def tap_teststep(lm):
    return lm + "Tap the " + gen(stop="button", max_tokens=3) + " button"

@guidance(stateless=True)
def modification_teststep(lm):
    return lm + select(["Activate", "Deactivate"]) + " the " + gen(stop="option", max_tokens=3) + " option"

@guidance(stateless=True)
def slider_modification_teststep(lm):
    return lm + select(["Increase", "Decrease"]) + " the " + select(["vertical", "horizontal"]) + gen(stop="slider", max_tokens=3) + " slider by " + one_or_more(select(['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'])) + gen(max_tokens=2)

@guidance(stateless=True)
def selection_teststep(lm):
    return lm + "Select " + gen(stop="from", max_tokens=5) + " from the list"

@guidance(stateless=True)
def teststep(lm):
    return lm + select([enter_teststep(), tap_teststep(), modification_teststep(), slider_modification_teststep(), selection_teststep()])

system_msg = "You are an expert in software testing within the automotive infotainment domain. Additionally, your understanding of the ANTLR grammar is enormous."
prompt = """Antlr Grammar for Test Case Description DSL in the Infotainment Domain:
grammar TestCase;
testcase: 'Testcase:' ID
'Preconditions:' teststeps*
'Actions:' teststeps+
'Postconditions:' teststeps*;
teststeps: enterStep
| tapStep
| modificationStep
| sliderModificationStep
| selectStep;
enterStep: 'Enter the' OBJECTNAME 'into the' TARGETNAME;
tapStep: 'Tap the' OBJECTNAME 'button';
modificationStep: (Activate | Deactivate) 'the' OBJECTNAME 'option';
sliderModificationStep: (Increase | Decrease) 'the' ORIENTATION OBJECTNAME 'slider by' NUMBER UNITS;
selectStep: 'Select' OBJECTNAME 'from the list';
OBJECTNAME: ID;
TARGETNAME: ID;
ORIENTATION: 'vertical' | 'horizontal';
UNITS: ID;
NUMBER: DIGIT+;
Activate: 'Activate';
Deactivate: 'Deactivate';
Increase: 'Increase';
Decrease: 'Decrease';
ID: [a-zA-Z]+;
DIGIT: [0-9];
User Test Case:
Testcase: Check bass slider from -10 to 10
Preconditions:
Tap the settings button
Tap the tone settings button
Predict the next logical test step using the grammar rules.
The next logical test step is the following action:
-
"""

with system():
    llm = lm + system_msg
with user():
    llm += prompt
with assistant():
    llm += teststep()

resulting in the following response: