Extra space at the beginning of generation

Open matatonic opened this issue 1 year ago • 2 comments

Describe the bug

An extra space is generated at the beginning of the generated text.

The space is added in this commit: https://github.com/oobabooga/text-generation-webui/commit/15940e762e9f9a257fb8ce4f711b5e1ca7740616

This breaks a lot of use cases, such as Python code formatting and sequence completions.

Why is this done?

Is there an existing issue for this?

  • [X] I have searched the existing issues

Reproduction

In default mode, with max_tokens set to 2, try the following input (the next number should be 13):

0,1,1,2,3,5,8,

Instead, it produces:

0,1,1,2,3,5,8, 1

It should be:

0,1,1,2,3,5,8,13

Screenshot

No response

Logs

N/A

System Info

Ubuntu

matatonic (May 04 '23 16:05)

I noticed this last night when clicking "Continue" after generation stopped at the token limit. I'm not sure the space insertion was always there; it may be a recent regression. The output looks fine when the next token begins a new word, but not when generation stopped in the middle of a word and is continued from there.

dblacknc (May 04 '23 18:05)

The diff I mentioned is only one week old, and it has some conditions on the last character, which sounds exactly like what you describe:

+        if type(shared.tokenizer) is transformers.LlamaTokenizer:
+            if len(original_question) > 0 and original_question[-1] not in [' ', '\n']:
+                reply = ' ' + reply

Based on the comment, this is intended to be something llama-specific...
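To make the effect of that condition concrete, here is a minimal standalone sketch of the same check. The function name `prepend_space_like_commit` and the `is_llama_tokenizer` flag are hypothetical stand-ins for illustration (the real code tests `type(shared.tokenizer) is transformers.LlamaTokenizer`), and the reply strings are assumed model outputs, not captured ones.

```python
def prepend_space_like_commit(original_question: str, reply: str,
                              is_llama_tokenizer: bool) -> str:
    """Mimic the conditional from the quoted diff.

    Hypothetical helper: `is_llama_tokenizer` replaces the real check
    `type(shared.tokenizer) is transformers.LlamaTokenizer`.
    """
    if is_llama_tokenizer:
        # A space is prepended whenever the prompt is non-empty and does not
        # already end in a space or newline, regardless of what the reply is.
        if len(original_question) > 0 and original_question[-1] not in [' ', '\n']:
            reply = ' ' + reply
    return reply


# Sequence completion: the prompt ends in ',' so a space is inserted.
prompt = "0,1,1,2,3,5,8,"
# Assume, for illustration only, that the model itself returned "13".
print(repr(prompt + prepend_space_like_commit(prompt, "13", True)))
# '0,1,1,2,3,5,8, 13'

# "Continue" in the middle of a word: the word gets split by the space.
partial = "The quick brown fo"
print(repr(partial + prepend_space_like_commit(partial, "x jumps", True)))
# 'The quick brown fo x jumps'
```

If the prompt already ends in a space or newline, or the tokenizer is not the llama one, the reply passes through unchanged, which would explain why the problem only shows up in cases like the two above.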

matatonic (May 04 '23 18:05)