Progressive execution of choice over a large number of options
Hi! I am not sure I understand how outlines.choice currently works, or whether this feature is technically feasible to implement, but here is my use case:
I have a constrained dataset of 1000 exact answers that I want my LLM to choose from in response to any user prompt.
If it's possible, I want my LLM to generate each next token with a different list of positive logit_bias values, corresponding to the shrinking list of remaining options after each token choice.
Example: let's say I have a dataset of only 4 answers: Good, Bad, Very Good, Very Bad. The user prompt is: How are you?
In the first sampling step, I want the sampler to choose one of 3 tokens corresponding to: good, bad, very. Let's say the first token was "very"; then the next token can only be either good or bad, and then generation is complete.
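The shrinking-option logic described above can be sketched in plain Python, independent of any library. This is a simplification that treats whole words as tokens (a real tokenizer splits differently); `allowed_next_tokens` is a hypothetical helper, not an outlines API:

```python
# Dataset of exact answers (lowercased to match the example tokens).
answers = ["good", "bad", "very good", "very bad"]

def allowed_next_tokens(generated: str) -> set[str]:
    """Return the set of next words that keep at least one answer reachable."""
    # Keep only answers that still match the text generated so far.
    remaining = [a for a in answers if a.startswith(generated)]
    tokens = set()
    for a in remaining:
        rest = a[len(generated):].lstrip()
        if rest:
            # The next allowed "token" is the next whole word (simplified).
            tokens.add(rest.split()[0])
    return tokens

print(allowed_next_tokens(""))       # {'good', 'bad', 'very'}
print(allowed_next_tokens("very "))  # {'good', 'bad'}
```

After "very" is chosen, the option list shrinks to just the two completions, exactly as described.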
That's an interesting use case. You could build the string-based FSM really easily by hand, and you could probably write a regex for it but that could end up being quite complicated.
We could always write a function that takes a string-based FSM, turns it into a character-based FSM that is then compiled into the token-based FSM that is used when generating text.
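To illustrate the first step of that pipeline: a list of answer strings is essentially a prefix trie over characters, which is a character-based FSM. A minimal sketch (this is plain Python for illustration, not the outlines compilation API):

```python
# Hedged sketch: build a character-level FSM (a trie) from answer strings.
# Compiling this down to a token-based FSM is the part outlines would handle.
def build_char_fsm(answers):
    root = {}
    for answer in answers:
        node = root
        for ch in answer:
            node = node.setdefault(ch, {})
        node[None] = True  # marker for an accepting (complete-answer) state
    return root

fsm = build_char_fsm(["Good", "Bad", "Very Good", "Very Bad"])
# Allowed first characters from the start state:
print(sorted(k for k in fsm if k is not None))  # ['B', 'G', 'V']
```

Each dict node is a state and each character key is a transition, so the set of legal next characters at any point is just the keys of the current node.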
The following test script demonstrates that a dynamically generated regexp actually works:
import outlines
import time
model = outlines.models.transformers("mistralai/Mistral-7B-Instruct-v0.2")
prompt = """
Respond User Question.
Q: How are you?
"""
n = 10
moods = ["very", "slightly"]
states = ["good", "bad"]
regex_pattern = "|".join(
    f"My Mood is {mood} {state} ({i} points of {n})"
    for mood in moods
    for state in states
    for i in range(1, n)
)
generator = outlines.generate.regex(
    model,
    regex_pattern,
)
start_time = time.time() # Start time
answer = generator(prompt)
end_time = time.time() # End time
print(answer)
print(f"Time taken for generation: {end_time - start_time} seconds")
outputs:
My Mood is very good 1 points of 10
Time taken for generation: 49.05198383331299 seconds
but if I increase n to 500, generation time is much longer.
Now let's say I want to optimize it by leveraging my knowledge that a "very bad" mood can only belong to the range of 1-5.
So once the sampler has generated the "very bad" tokens, I want to adjust my regexp by removing all options other than those mentioning 1-5 points.
You say it can be done by constructing a custom FSM? Could it have a handler which accepts the current generation text and returns a new list of options for the next token? That would be ideal, but again I'm not sure if it's possible. @rlouf
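One alternative to adjusting the regexp mid-generation is to bake the "very bad implies 1-5 points" knowledge into the pattern up front, so the FSM never contains the invalid branches. A sketch, where `point_range` is a hypothetical mapping introduced here for illustration (note that the unescaped parentheses act as a regex group rather than literal characters, matching the output shown earlier in the thread):

```python
moods = ["very", "slightly"]
states = ["good", "bad"]
n = 10

# Hypothetical constraint table: "very bad" is restricted to 1-5 points;
# every other combination keeps the default range.
point_range = {("very", "bad"): range(1, 6)}

regex_pattern = "|".join(
    f"My Mood is {mood} {state} ({i} points of {n})"
    for mood in moods
    for state in states
    for i in point_range.get((mood, state), range(1, n))
)
```

This keeps the single-regex workflow from the test script above, at the cost of a larger pattern; it does not answer whether a dynamic handler is possible, but it covers the case where the restrictions are known before generation starts.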