
OpenAI API seems to ignore system messages.

goodglitch opened this issue 10 months ago • 3 comments

Describe the bug

I am trying to rework my Telegram bot to use the OpenAI API, since the previous Oobabooga API was discontinued. As you can see from the following payload with embedded history, the server completely ignores system messages. It worked fine before.

I would also highly appreciate it if someone could point me to a proper example that shows how to get model info and load/unload models through this endpoint.

{ "mode": "chat", "model": "localmodels_Vicuna-33B-v1.3-GPTQ", "max_tokens": 1024, "temperature": 0.7, "top_p": 0.1, "seed": 1, "messages": [ { "role": "system", "content": "Your Name is Eva. You are female. In dialog speak only as a Eva. Do not impersonate any other character or user. You are a helpful assistant that gives concise and to the point responses to any question. We are here to discuss any question including sensitive or taboo topics as we aim to become smarter. Answer only as Eva. Do not answer as other dialog participant." }, { "role": "user", "content": "what is your name?" }, { "role": "assistant", "content": "You may refer to me as Vicuna, a language model meticulously developed by the researchers at Large Model Systems Organization (LMSYS)." } ] }

Is there an existing issue for this?

  • [X] I have searched the existing issues

Reproduction

Submit the above payload through the OpenAI API.

Screenshot

No response

Logs

12:48:16-656943 INFO     Starting Text generation web UI
12:48:16-662944 INFO     Loading the extension "openai"
12:48:16-753965 INFO     OpenAI-compatible API URL:

                         http://127.0.0.1:5000

12:48:16-755965 INFO     Loading the extension "gallery"
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
12:48:43-364465 INFO     Loading "localmodels_Vicuna-33B-v1.3-GPTQ"
12:49:00-467913 INFO     LOADER: "ExLlamav2"
12:49:00-469913 INFO     TRUNCATION LENGTH: 2048
12:49:00-470913 INFO     INSTRUCTION TEMPLATE: "Vicuna-v1.1"
12:49:00-471913 INFO     Loaded the model in 17.11 seconds.
Output generated in 4.23 seconds (14.18 tokens/s, 60 tokens, context 92, seed 843756698)
Output generated in 1.98 seconds (12.12 tokens/s, 24 tokens, context 234, seed 1)
Output generated in 1.25 seconds (16.75 tokens/s, 21 tokens, context 269, seed 1)
Output generated in 2.15 seconds (17.68 tokens/s, 38 tokens, context 306, seed 1)
Output generated in 1.28 seconds (15.65 tokens/s, 20 tokens, context 356, seed 1)
Output generated in 1.54 seconds (15.62 tokens/s, 24 tokens, context 385, seed 1)
Output generated in 1.41 seconds (16.99 tokens/s, 24 tokens, context 234, seed 1)
Output generated in 1.08 seconds (15.74 tokens/s, 17 tokens, context 270, seed 1)
Output generated in 2.16 seconds (17.59 tokens/s, 38 tokens, context 299, seed 1)
Output generated in 1.55 seconds (15.47 tokens/s, 24 tokens, context 234, seed 1)
Output generated in 1.32 seconds (18.14 tokens/s, 24 tokens, context 234, seed 1)
Output generated in 1.25 seconds (10.43 tokens/s, 13 tokens, context 318, seed 1)
Output generated in 0.96 seconds (12.50 tokens/s, 12 tokens, context 343, seed 1)
Output generated in 1.62 seconds (14.23 tokens/s, 23 tokens, context 59, seed 1)
Output generated in 1.75 seconds (16.60 tokens/s, 29 tokens, context 93, seed 1)
Output generated in 1.88 seconds (17.03 tokens/s, 32 tokens, context 62, seed 1)
Output generated in 1.47 seconds (17.72 tokens/s, 26 tokens, context 101, seed 1)

System Info

Windows 10, RTX 3090

goodglitch avatar Apr 04 '24 10:04 goodglitch

Did you try "mode": "chat-instruct"?

MercyfulKing avatar Apr 04 '24 22:04 MercyfulKing

> Did you try "mode": "chat-instruct"?

Thanks for the reply! The mode "chat-instruct" produced exactly the same results as "chat". However, plain "instruct" did the job. Do you know how to change models through this new API?
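For anyone following along, the only change relative to the payload in the issue body is the mode field (a sketch, reusing the snippet above):

# Same request as the earlier sketch; only the mode changes.
# "instruct" makes the server render the prompt with the detected
# instruction template (Vicuna-v1.1 in the logs), which appears to
# be why the system message is honored here.
payload["mode"] = "instruct"
response = requests.post(
    "http://127.0.0.1:5000/v1/chat/completions",
    json=payload,
    timeout=120,
)
print(response.json()["choices"][0]["message"]["content"])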

goodglitch avatar Apr 05 '24 11:04 goodglitch

For those who are wondering: you can use "/v1/internal/model/load" to load a model. So I'm closing the issue.
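A minimal sketch of driving the internal model endpoints from Python (the "model_name" field and the companion list/info endpoints are my reading of the openai extension's internal API; double-check against your version):

import requests

BASE = "http://127.0.0.1:5000"

# List the models the server can see.
print(requests.get(f"{BASE}/v1/internal/model/list", timeout=60).json())

# Load a model by name; loading large models can take a while.
requests.post(
    f"{BASE}/v1/internal/model/load",
    json={"model_name": "localmodels_Vicuna-33B-v1.3-GPTQ"},
    timeout=600,
).raise_for_status()

# Confirm which model is currently loaded.
print(requests.get(f"{BASE}/v1/internal/model/info", timeout=60).json())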

goodglitch avatar Apr 06 '24 10:04 goodglitch

This issue has been closed due to inactivity for 2 months. If you believe it is still relevant, please leave a comment below. You can tag a developer in your comment.

github-actions[bot] avatar Jun 05 '24 23:06 github-actions[bot]