guidance
Generation of `select` fails on OpenAI chat mode, depending on possible options.
The bug
Generation of `select` fails in OpenAI chat mode, depending on the possible options.
It seems to be related to common prefixes shared by two options.
To Reproduce
This works:
import guidance

llm = guidance.llms.OpenAI('gpt-3.5-turbo')
experts = guidance('''
{{#assistant~}}
{{#select 'name'}}John{{or}}Jane Doe{{/select}}
{{~/assistant}}
''', llm=llm)
x = experts()
x.variables()
But this doesn't:
llm = guidance.llms.OpenAI('gpt-3.5-turbo')
experts = guidance('''
{{#assistant~}}
{{#select 'name'}}John{{or}}John Doe{{/select}}
{{~/assistant}}
''', llm=llm)
x = experts()
x.variables()
It fails with the error: `When calling OpenAI chat models you must generate only directly inside the assistant role! The OpenAI API does not currently support partial assistant prompting.`
This also fails, even though one answer is not a complete prefix of the other:
# this fails
program = guidance(
"""
{{#assistant~}}
{{#select 'help'}}No I need support{{or}}No I am not ok{{/select}}
{{~/assistant}}
""",
llm=llm
)
result = program()
System info (please complete the following information):
- OS: Ubuntu 22.04 LTS
- Guidance Version: 0.0.62 (from git main branch at d6b855a)
The problem is indeed related to prefixes.
When running the John/John Doe example, guidance extracts the common prefix `\n <|im_start|>assistant\nJohn` and asks OpenAI for the next most likely tokens, either `{{` or ` Doe`. Because the prompt doesn't end with `<|im_start|>assistant` (it has `\nJohn` after it), the call fails. However, it also seems weird that one of the next tokens to be predicted is `{{` (the start of the closing assistant tag) and not `<|` for `<|im_end|>`.
@jprafael Did you find a solution for this?
Thank you!
I think you may want to test this new syntax format:
{{select 'name' options=["John", "John Doe"]}}
The issue is present regardless of which syntax is used.
@jprafael Did you find a solution for this?
I did not find a good solution for this. The issue is that, to support `select`, guidance does the following behind the scenes:
- Find the longest common prefix of all the possible options.
- Ask the LLM for the probabilities of the next characters, considering only the characters that appear in the allowed options.
- Pick the option(s) whose next character has the greatest probability (greedy).
- If more than one option shares the same next character, proceed recursively.
This means that for the example above, the prefix is `John` and the possible next tokens should be EOS (end of sequence) or ` ` (space). However, OpenAI's API doesn't allow you to pass `John` as the start of the `assistant` message, which results in the error.
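The steps above can be sketched roughly as follows. This is not guidance's actual implementation: `toy_tokenize`, `shared_token_prefix` and `next_token_candidates` are hypothetical names, and the tokenizer is a crude stand-in (a word plus any whitespace before it) for the model's real one. It only illustrates why some option sets leave a non-empty prefix that would have to be sent as a partial assistant message.

```python
import re


def toy_tokenize(text):
    # Hypothetical stand-in for a real tokenizer: each token is a word
    # together with any whitespace that precedes it.
    return re.findall(r"\s*\S+", text)


def shared_token_prefix(options):
    # Walk the token lists in lockstep and stop at the first position
    # where the options disagree (or where the shortest option ends).
    token_lists = [toy_tokenize(option) for option in options]
    prefix = []
    for column in zip(*token_lists):
        if len(set(column)) != 1:
            break
        prefix.append(column[0])
    return prefix


def next_token_candidates(options):
    # Return the shared prefix as text, plus the set of tokens that could
    # follow it. An option that ends exactly at the prefix contributes "<EOS>".
    prefix = shared_token_prefix(options)
    candidates = set()
    for option in options:
        tokens = toy_tokenize(option)
        if len(tokens) > len(prefix):
            candidates.add(tokens[len(prefix)])
        else:
            candidates.add("<EOS>")
    return "".join(prefix), candidates


# Non-empty prefix: would require partial assistant prompting.
print(next_token_candidates(["John", "John Doe"]))
# Empty prefix: the choice can be made from the first generated token.
print(next_token_candidates(["John", "Jane Doe"]))
# Non-empty prefix even though neither option is a prefix of the other.
print(next_token_candidates(["No I need support", "No I am not ok"]))
```

Under this toy tokenizer, the three examples from the report behave as observed: only John/Jane Doe has an empty shared prefix. The exact candidate tokens depend on the real tokenizer, so they may differ from what the sketch prints.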
The workaround that I found was to write the prompt in a way that avoids shared prefixes, but still gives the LLM enough context for the choices to be valid:
experts = guidance('''
{{#user~}}
What is your name:
1. John
2. John Doe
{{~/user}}
{{#assistant~}}
{{#select 'name'}}1. John{{or}}2. John Doe{{/select}}
{{~/assistant}}
''', llm=llm)
This way, the longest common prefix is empty, and one of the two options can be selected directly from the first token emitted by the LLM. In this case, we need to write the list of allowed values into the context to give some meaning to the `1` and `2` tokens, but that was something I was already doing.
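If the option list is built dynamically, the numbered template can be generated instead of written by hand. A minimal sketch, assuming the `{{#select}}`/`{{or}}` syntax shown above; `number_options` and `build_select_template` are hypothetical helpers, not part of guidance:

```python
def number_options(options):
    # Prefix each option with "1. ", "2. ", ... so that no two options
    # begin with the same text.
    return [f"{i}. {option}" for i, option in enumerate(options, start=1)]


def build_select_template(question, options, variable="name"):
    # Build a guidance template string that lists the numbered options in
    # the user turn and offers the same numbered strings to select.
    numbered = number_options(options)
    listing = "\n".join(numbered)
    choices = "{{or}}".join(numbered)
    return (
        "{{#user~}}\n"
        f"{question}\n"
        f"{listing}\n"
        "{{~/user}}\n"
        "{{#assistant~}}\n"
        "{{#select '" + variable + "'}}" + choices + "{{/select}}\n"
        "{{~/assistant}}\n"
    )


template = build_select_template("What is your name:", ["John", "John Doe"])
print(template)
```

The resulting string can then be passed to `guidance(template, llm=llm)` as in the examples above. Note that with ten or more options the numbers themselves start to share leading characters (`1.` and `10.`), so whether the prefix stays empty then depends on how the tokenizer splits them; a different labelling scheme may be needed in that case.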