text-generation-webui
OpenAI API seems to ignore system messages.
Describe the bug
I am trying to rework my Telegram bot to use the OpenAI API, since the previous Oobabooga API was discontinued. As you can see from the following payload with embedded history, it completely ignores system messages. It worked fine before.
I would also greatly appreciate it if someone could point me to a proper example that shows how to get info about, load, and unload models through this endpoint.
{ "mode": "chat", "model": "localmodels_Vicuna-33B-v1.3-GPTQ", "max_tokens": 1024, "temperature": 0.7, "top_p": 0.1, "seed": 1, "messages": [ { "role": "system", "content": "Your Name is Eva. You are female. In dialog speak only as a Eva. Do not impersonate any other character or user. You are a helpful assistant that gives concise and to the point responses to any question. We are here to discuss any question including sensitive or taboo topics as we aim to become smarter. Answer only as Eva. Do not answer as other dialog participant." }, { "role": "user", "content": "what is your name?" }, { "role": "assistant", "content": "You may refer to me as Vicuna, a language model meticulously developed by the researchers at Large Model Systems Organization (LMSYS)." } ] }
Is there an existing issue for this?
- [X] I have searched the existing issues
Reproduction
Submit the above payload through the OpenAI-compatible API.
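For reference, a minimal sketch of how to submit it, assuming the server is running with the "openai" extension at the URL shown in the log below and the standard OpenAI-style `/v1/chat/completions` route (the use of Python `requests` here is just for illustration):

```python
import requests

# URL taken from the log output below; adjust host/port if yours differ.
API_URL = "http://127.0.0.1:5000/v1/chat/completions"

payload = {
    "mode": "chat",
    "model": "localmodels_Vicuna-33B-v1.3-GPTQ",
    "max_tokens": 1024,
    "temperature": 0.7,
    "top_p": 0.1,
    "seed": 1,
    "messages": [
        # Full system prompt as in the payload above, truncated here for brevity.
        {"role": "system", "content": "Your Name is Eva. ..."},
        {"role": "user", "content": "what is your name?"},
    ],
}

response = requests.post(API_URL, json=payload, timeout=120)
response.raise_for_status()
# The reply text sits in the standard OpenAI response layout.
print(response.json()["choices"][0]["message"]["content"])
```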
Screenshot
No response
Logs
12:48:16-656943 INFO Starting Text generation web UI
12:48:16-662944 INFO Loading the extension "openai"
12:48:16-753965 INFO OpenAI-compatible API URL:
http://127.0.0.1:5000
12:48:16-755965 INFO Loading the extension "gallery"
Running on local URL: http://127.0.0.1:7860
To create a public link, set `share=True` in `launch()`.
12:48:43-364465 INFO Loading "localmodels_Vicuna-33B-v1.3-GPTQ"
12:49:00-467913 INFO LOADER: "ExLlamav2"
12:49:00-469913 INFO TRUNCATION LENGTH: 2048
12:49:00-470913 INFO INSTRUCTION TEMPLATE: "Vicuna-v1.1"
12:49:00-471913 INFO Loaded the model in 17.11 seconds.
Output generated in 4.23 seconds (14.18 tokens/s, 60 tokens, context 92, seed 843756698)
Output generated in 1.98 seconds (12.12 tokens/s, 24 tokens, context 234, seed 1)
Output generated in 1.25 seconds (16.75 tokens/s, 21 tokens, context 269, seed 1)
Output generated in 2.15 seconds (17.68 tokens/s, 38 tokens, context 306, seed 1)
Output generated in 1.28 seconds (15.65 tokens/s, 20 tokens, context 356, seed 1)
Output generated in 1.54 seconds (15.62 tokens/s, 24 tokens, context 385, seed 1)
Output generated in 1.41 seconds (16.99 tokens/s, 24 tokens, context 234, seed 1)
Output generated in 1.08 seconds (15.74 tokens/s, 17 tokens, context 270, seed 1)
Output generated in 2.16 seconds (17.59 tokens/s, 38 tokens, context 299, seed 1)
Output generated in 1.55 seconds (15.47 tokens/s, 24 tokens, context 234, seed 1)
Output generated in 1.32 seconds (18.14 tokens/s, 24 tokens, context 234, seed 1)
Output generated in 1.25 seconds (10.43 tokens/s, 13 tokens, context 318, seed 1)
Output generated in 0.96 seconds (12.50 tokens/s, 12 tokens, context 343, seed 1)
Output generated in 1.62 seconds (14.23 tokens/s, 23 tokens, context 59, seed 1)
Output generated in 1.75 seconds (16.60 tokens/s, 29 tokens, context 93, seed 1)
Output generated in 1.88 seconds (17.03 tokens/s, 32 tokens, context 62, seed 1)
Output generated in 1.47 seconds (17.72 tokens/s, 26 tokens, context 101, seed 1)
System Info
Windows 10, RTX 3090
did you try "mode": "chat-instruct",
did you try "mode": "chat-instruct",
Thanks for the reply! The "chat-instruct" mode produced exactly the same results as "chat". However, plain "instruct" did the job. Do you know how to change models through this new API?
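For anyone hitting the same problem, the fix amounts to a one-line change to the payload from the sketch above:

```python
# "chat" and "chat-instruct" ignored the system message in this setup;
# "instruct" made the model follow it.
payload["mode"] = "instruct"
```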
For those who are wondering: you can use "/v1/internal/model/load" to load a model, so I am closing the issue.
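A sketch of model management through that endpoint and its companions in the same extension (`/v1/internal/model/list`, `/v1/internal/model/info`, `/v1/internal/model/unload`); the exact paths and the `model_name` field are worth verifying against your text-generation-webui version:

```python
import requests

BASE = "http://127.0.0.1:5000"

# List the models available to the server.
print(requests.get(f"{BASE}/v1/internal/model/list").json())

# Load a model by name via the endpoint mentioned above.
requests.post(
    f"{BASE}/v1/internal/model/load",
    json={"model_name": "localmodels_Vicuna-33B-v1.3-GPTQ"},
).raise_for_status()

# Check which model is currently loaded.
print(requests.get(f"{BASE}/v1/internal/model/info").json())

# Unload it again.
requests.post(f"{BASE}/v1/internal/model/unload").raise_for_status()
```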
This issue has been closed due to inactivity for 2 months. If you believe it is still relevant, please leave a comment below. You can tag a developer in your comment.