Last Token Sometimes Missing from Output in Guidance 0.2.0
The Bug
When running a simple Guidance script with gen(), the last token generated by the LLM is sometimes missing from the output. The behavior is inconsistent: sometimes the full response appears, while at other times the final token is lost.
To Reproduce
from guidance import models, gen, system, user, assistant

llm = models.OpenAI('gpt-3.5-turbo')

with system():
    llm += "You are an LLM. Please follow all instructions."

with user():
    llm += "Please reply to this message with FINISHED and nothing else."

with assistant():
    llm += gen(name='reply')

print(llm['reply'])
Expected Behavior
The output should always be FINISHED.
Observed Behavior
Sometimes the output is truncated to FIN, with the final token (ISHED, which the tokenizer treats as a separate token) missing.
The bug also occurs with other prompts; I chose this one because the model is very unlikely to return just FIN, so it should be clear that something is indeed going missing (see the tokenization check below).
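As a sanity check, here is a minimal sketch (assuming tiktoken is installed) that prints how the encoding used by gpt-3.5-turbo splits FINISHED; the exact split may differ between models.

# Minimal sketch: inspect how gpt-3.5-turbo's tokenizer splits "FINISHED".
# Assumes tiktoken is installed; the split may vary with other tokenizers.
import tiktoken

enc = tiktoken.encoding_for_model("gpt-3.5-turbo")
for token_id in enc.encode("FINISHED"):
    print(token_id, enc.decode_single_token_bytes(token_id))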
Additional Notes
- This issue appears inconsistently: repeated runs of the same script may or may not trigger it (see the loop sketch after these notes).
- It occurs with guidance==0.2.0.
- The problem does not appear in guidance==0.1.16.
- The problem occurs not only with gpt-3.5-turbo but also with other GPT-4 variants I tried.
System info:
- OS: Fedora
- Guidance Version: 0.2.0
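To quantify the flakiness, a hypothetical stress loop along these lines (same prompt as the repro above; the 20-run count is arbitrary) can count how often the reply comes back truncated:

from guidance import models, gen, system, user, assistant

def run_once():
    # Fresh model state per run, same prompt as the repro above.
    lm = models.OpenAI('gpt-3.5-turbo')
    with system():
        lm += "You are an LLM. Please follow all instructions."
    with user():
        lm += "Please reply to this message with FINISHED and nothing else."
    with assistant():
        lm += gen(name='reply')
    return lm['reply']

# Count runs whose reply is anything other than FINISHED (e.g. just FIN).
truncated = sum(run_once().strip() != "FINISHED" for _ in range(20))
print(f"{truncated}/20 runs returned a truncated reply")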
@hudson-ai @mmoskal any ideas here?
I can reproduce this behavior with gpt-4o-mini too.
It has been a month since I opened this issue, and another user has also reported experiencing it. I imagine that this could impact a lot more users without them noticing. Please let me know if you need any more details to help investigate this. Thanks!
Is it possible this issue is limited to OpenAI models? I cannot reproduce this using Phi 3.5.
from guidance import models, gen, system, user, assistant
from guidance.chat import Phi3MiniChatTemplate

if __name__ == "__main__":
    llm = models.Transformers(
        "microsoft/Phi-3.5-mini-instruct",
        chat_template=Phi3MiniChatTemplate,
    )
    with system():
        llm += "You are an LLM. Please follow all instructions."
    with user():
        llm += "Please reply to this message with FINISHED and nothing else."
    with assistant():
        llm += gen(name='reply')
    print(llm['reply'])
I ran this 10 times in a row on Guidance commit 3918b36c05f76215c9b061c5ee7398e975d26f78 and always got FINISHED (with a leading space) as the response.
I just ran some tests with 0.2.1 (from PyPI) and the problem seems to be fixed. However, the latest tagged release in this repo (0.2.0) still has the problem.
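For anyone comparing results, a quick standard-library check confirms which guidance build is actually installed:

# Print the installed guidance version to be sure which build is under test.
from importlib.metadata import version
print(version("guidance"))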
Yes, the 0.2.0 release is broken in various ways, unfortunately, and we were promised a new release back in March. Perhaps the team is waiting to finish some ongoing work before pulling the trigger on that new release.