text-generation-webui generation attempts (for longer replies) clears on every generation attempt

Open generic-username0718 opened this issue 2 years ago • 1 comments

Describe the bug

Each generation attempt clears the chat response produced by the previous attempt instead of extending it.

Is there an existing issue for this?

[X] I have searched the existing issues

Reproduction

python3 server.py --load-in-8bit --gpu-memory 16 25 --cai-chat --listen --model llama-30b --extensions llama_prompts character_bias

Cai-Chat mode. Set generation attempts to 2. Generate something.

The first generation is cleared when the second one starts.

Screenshot

No response

Logs

Output generated in 1.73 seconds (0.00 tokens/s, 0 tokens)
Output generated in 11.75 seconds (2.98 tokens/s, 35 tokens)
Output generated in 222.24 seconds (0.26 tokens/s, 57 tokens)
Output generated in 45.10 seconds (3.24 tokens/s, 146 tokens)
Output generated in 40.47 seconds (3.19 tokens/s, 129 tokens)
Traceback (most recent call last):
  File "/home/user/Documents/ooba/2/text-generation-webui/modules/text_generation.py", line 239, in generate_reply
    yield formatted_outputs(reply, shared.model_name)
UnboundLocalError: local variable 'reply' referenced before assignment
Output generated in 1.73 seconds (0.00 tokens/s, 0 tokens)
Output generated in 13.77 seconds (2.98 tokens/s, 41 tokens)
Traceback (most recent call last):
  File "/home/user/Documents/ooba/2/text-generation-webui/modules/text_generation.py", line 239, in generate_reply
    yield formatted_outputs(reply, shared.model_name)
UnboundLocalError: local variable 'reply' referenced before assignment
Output generated in 1.80 seconds (0.00 tokens/s, 0 tokens)
Output generated in 16.02 seconds (3.00 tokens/s, 48 tokens)
Traceback (most recent call last):
  File "/home/user/Documents/ooba/2/text-generation-webui/modules/text_generation.py", line 239, in generate_reply
    yield formatted_outputs(reply, shared.model_name)
UnboundLocalError: local variable 'reply' referenced before assignment
Output generated in 1.74 seconds (0.00 tokens/s, 0 tokens)
Output generated in 13.74 seconds (2.91 tokens/s, 40 tokens)
Output generated in 13.91 seconds (2.88 tokens/s, 40 tokens)
Output generated in 6.90 seconds (2.90 tokens/s, 20 tokens)
Output generated in 13.68 seconds (3.51 tokens/s, 48 tokens)
Output generated in 21.84 seconds (3.71 tokens/s, 81 tokens)
Output generated in 10.52 seconds (3.33 tokens/s, 35 tokens)
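
For anyone reading the traceback: an UnboundLocalError on a final `yield` is commonly seen when a generator only assigns the variable inside a loop and that loop runs zero times. The sketch below illustrates that general pattern only. It is not the code in modules/text_generation.py, and the function names and the `chunks` argument are hypothetical.

```python
def generate_reply_buggy(chunks):
    # Hypothetical stand-in for a streaming generator; not the project's code.
    for chunk in chunks:
        reply = chunk
        yield reply
    # If `chunks` was empty, `reply` was never assigned, so this line raises.
    yield reply


def generate_reply_safe(chunks):
    # Initialise before the loop so the final yield is always valid.
    reply = ""
    for chunk in chunks:
        reply = chunk
        yield reply
    yield reply


try:
    list(generate_reply_buggy([]))
except UnboundLocalError as exc:
    print("buggy:", type(exc).__name__)

print("safe:", list(generate_reply_safe([])))
```

Whether this is what actually happens at line 239 would need to be checked against the source.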

System Info

5600x
DDR4 64GB 3600
2x 3090 w/ NVLINK

Mar 24 '23 04:03 generic-username0718

See if this fixed it:

https://github.com/oobabooga/text-generation-webui/commit/4f5c2ce78560689dc8ed08a3cbb33ef15a3b4a95

Mar 24 '23 05:03 oobabooga

fixed

Mar 26 '23 06:03 generic-username0718
