text-generation-webui: OpenAssistant is replying to itself - error in prompt tags?

Describe the bug

At times, after answering a basic question, an OpenAssistant model will seemingly prompt itself and then reply to its own prompt. It's definitely in Open Assistant mode, and I see the correct tags on the Character tab of the UI.

Is there an existing issue for this?

[X] I have searched the existing issues

Reproduction

Prompt: What's the cube root of 7, to two decimal places?

Response:

(It goes on for a bit after this but the first paragraph above shows the behavior.)

Screenshot

No response

Logs

Nothing unusual

System Info

Ubuntu 22.04
RTX 3060 x 2
MetaIX/OpenAssistant-Llama-30b-4bit
GPTQ - oobabooga

May 03 '23 18:05 dblacknc

Are you using instruct mode or chat mode?

May 04 '23 15:05 ClayShoaf

Instruct, with the Open Assistant template selected. To be clear: it responds normally, then seemingly at random starts responding to itself for a round or two. The first time I saw it was when asking it to do a simple math problem. The model card says it works with this UI: https://huggingface.co/MetaIX/OpenAssistant-Llama-30b-4bit

I'm using the --true-sequential --groupsize 128 version. As best I know, the --act-order optimization doesn't work with the older, recommended CUDA oobabooga GPTQ I'm using.

I just loaded it up again to test. It generally seems to work correctly, but then I tried a prompt similar to the one in the initial report - "Compute the cube root of 7.2, to two decimal places." - and saw this response:

Two rounds there. I notice the following in the Character tab, as filled in by the template:

Your name: <|prompter|>
Character's name: <|assistant|>
Turn template: <|user|><|user-message|><|endoftext|><|bot|><|bot-message|><|endoftext|>

I don't yet understand how the turn template works and wonder if a change/update there is all that's needed.
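For what it's worth, here is a minimal Python sketch of how a turn template like this one could be expanded into a prompt. It assumes the UI substitutes the placeholders literally: <|user|> and <|bot|> become the two names from the Character tab, and <|user-message|> and <|bot-message|> become the message texts. The function name build_prompt is hypothetical and is not taken from the text-generation-webui source.

```python
# Turn template as shown on the Character tab.
TURN_TEMPLATE = (
    "<|user|><|user-message|><|endoftext|>"
    "<|bot|><|bot-message|><|endoftext|>"
)


def build_prompt(user_message,
                 user_name="<|prompter|>",
                 bot_name="<|assistant|>",
                 template=TURN_TEMPLATE):
    """Expand the template up to the point where the bot's reply begins.

    Hypothetical helper: assumes plain string substitution of the
    placeholders, which may differ from what the UI actually does.
    """
    # Keep only the part of the template before the bot's reply slot,
    # since the model is supposed to generate that part itself.
    prefix = template.split("<|bot-message|>")[0]
    return (
        prefix
        .replace("<|user|>", user_name)
        .replace("<|bot|>", bot_name)
        .replace("<|user-message|>", user_message)
    )


if __name__ == "__main__":
    print(build_prompt("What's the cube root of 7, to two decimal places?"))
```

If the expansion works roughly like this, the model is expected to emit <|endoftext|> on its own after its answer. Nothing in the template by itself ends generation unless that string is also treated as a stop condition.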

The behavior is the same in instruct mode with the Open Assistant template and in chat mode. Some subsequent chat that didn't ask for a math computation looked normal.

May 04 '23 15:05 dblacknc

In the parameters tab, what happens if you put "<|endoftext|>" in custom stopping strings?

May 04 '23 16:05 ClayShoaf

That seems to take care of it, thanks! Good thing too, because without it, the model also did this:

The cube root of 123, when rounded to two decimal places, is approximately 4.97.<|endoftext|><|prompter|>Find the cube root of 0.00000000000000000000000

It kept going with zeroes, and I guess it would only have stopped after reaching the set token limit. BTW, I'm not expecting this model to necessarily do good math - just playing with it a bit.
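To illustrate what the stopping string buys here, this is a small Python sketch that cuts generated text at the first occurrence of any stop string. It is only a post-hoc truncation under my own assumptions; the helper name truncate_at_stop is hypothetical, and the UI itself presumably halts generation rather than trimming afterwards.

```python
def truncate_at_stop(text, stop_strings):
    """Return text up to (not including) the earliest stop string found."""
    cut = len(text)
    for stop in stop_strings:
        if not stop:
            # Ignore empty entries, which would match at position 0.
            continue
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


# Output quoted above, shortened where the zeroes run on.
raw = (
    "The cube root of 123, when rounded to two decimal places, "
    "is approximately 4.97.<|endoftext|><|prompter|>Find the cube root of 0.000"
)

if __name__ == "__main__":
    print(truncate_at_stop(raw, ["<|endoftext|>"]))
```

With "<|endoftext|>" as the only stop string, everything from the first <|endoftext|> onward is dropped, including the self-generated <|prompter|> turn.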

I added custom_stopping_strings to config-user.yaml for this model, and it's picked up as expected. It looks like I have to do that manually, because the model save that writes/updates this file does not include that parameter; perhaps it only saves the settings on the Model tab.

May 04 '23 16:05 dblacknc

Sounds like there may be a bug with the way Open Assistant formatting is handled. Unfortunately, I won't have time to mess with it and submit a PR.

May 04 '23 17:05 ClayShoaf

I'm posting from my phone, and I don't understand it intuitively enough to say if this will solve your problem, but it looks related: https://github.com/oobabooga/text-generation-webui/commit/97a6a50d98c96554ef0ece3b18b68501a78aca01

May 04 '23 22:05 ClayShoaf

This issue has been closed due to inactivity for 30 days. If you believe it is still relevant, please leave a comment below.

Jun 03 '23 23:06 github-actions[bot]
