text-generation-webui
Extra space at the beginning of generation
Describe the bug
An extra space is generated at the beginning of the generated text.
It is done here: https://github.com/oobabooga/text-generation-webui/commit/15940e762e9f9a257fb8ce4f711b5e1ca7740616
This breaks a lot of use cases, like Python code formatting and sequence completions.
Why is this done?
Is there an existing issue for this?
- [X] I have searched the existing issues
Reproduction
In default mode, with max_tokens set to 2, try the following input (the next number should be 13):
0,1,1,2,3,5,8,
Instead, it produces:
0,1,1,2,3,5,8, 1
It should be:
0,1,1,2,3,5,8,13
Screenshot
No response
Logs
N/A
System Info
Ubuntu
I noticed this last night when clicking "Continue" after generation stopped at the token limit. I'm not sure whether the space insertion was always there; this may be a recent regression. It looks fine when the next token is the beginning of a new word, but not when generation ended in the middle of a word and is continued from there.
The diff I mentioned is only from a week ago, and it has some conditions on the last character that sound exactly like what you describe:
+ if type(shared.tokenizer) is transformers.LlamaTokenizer:
+     if len(original_question) > 0 and original_question[-1] not in [' ', '\n']:
+         reply = ' ' + reply
Based on the comment, this is intended to be something LLaMA-specific...
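For illustration, here is a minimal sketch of how that check produces the extra space in the Fibonacci example above. Only the inner condition comes from the linked commit; the postprocess_reply helper is hypothetical, and the LlamaTokenizer type check is omitted to keep the example self-contained:

def postprocess_reply(original_question: str, reply: str) -> str:
    # Same condition as in the commit: prepend a space whenever the
    # prompt does not end in a space or newline.
    if len(original_question) > 0 and original_question[-1] not in [' ', '\n']:
        reply = ' ' + reply
    return reply

prompt = "0,1,1,2,3,5,8,"
model_output = "13"  # what the model would ideally continue with
print(prompt + postprocess_reply(prompt, model_output))
# prints "0,1,1,2,3,5,8, 13" -- the space sneaks in because ',' is not in [' ', '\n']

The same thing happens when continuing mid-word or mid-identifier, which matches the "Continue" behaviour described above.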