
API is broken

Open Asais10 opened this issue 1 year ago • 11 comments

Describe the bug

API usage breaks every time a new parameter is added to the request body.

Is there an existing issue for this?

  • [X] I have searched the existing issues

Reproduction

Attempting to use --extensions api leads to no responses.

Screenshot

No response

Logs

Traceback (most recent call last):
  File "C:\Program Files\oobabooga-windows\installer_files\env\lib\site-packages\gradio\routes.py", line 393, in run_predict
    output = await app.get_blocks().process_api(
  File "C:\Program Files\oobabooga-windows\installer_files\env\lib\site-packages\gradio\blocks.py", line 1108, in process_api
    result = await self.call_function(
  File "C:\Program Files\oobabooga-windows\installer_files\env\lib\site-packages\gradio\blocks.py", line 929, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "C:\Program Files\oobabooga-windows\installer_files\env\lib\site-packages\anyio\to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "C:\Program Files\oobabooga-windows\installer_files\env\lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "C:\Program Files\oobabooga-windows\installer_files\env\lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "C:\Program Files\oobabooga-windows\installer_files\env\lib\site-packages\gradio\utils.py", line 490, in async_iteration
    return next(iterator)
  File "C:\Program Files\oobabooga-windows\text-generation-webui\modules\api.py", line 28, in generate_reply_wrapper
    for i in generate_reply(params[0], generate_params):
  File "C:\Program Files\oobabooga-windows\text-generation-webui\modules\text_generation.py", line 175, in generate_reply
    input_ids = encode(question, add_bos_token=state['add_bos_token'], truncation_length=get_max_prompt_length(state))
  File "C:\Program Files\oobabooga-windows\text-generation-webui\modules\text_generation.py", line 19, in get_max_prompt_length
    max_length = state['truncation_length'] - state['max_new_tokens']
KeyError: 'truncation_length'

System Info

Windows 10
Nvidia Geforce RTX 2080ti

Asais10 avatar Apr 12 '23 16:04 Asais10

Same for me. Here is the error:

Traceback (most recent call last):
  File "C:\Users\wutaw\OneDrive\Desktop\T\installer_files\env\lib\site-packages\gradio\routes.py", line 393, in run_predict
    output = await app.get_blocks().process_api(
  File "C:\Users\wutaw\OneDrive\Desktop\T\installer_files\env\lib\site-packages\gradio\blocks.py", line 1108, in process_api
    result = await self.call_function(
  File "C:\Users\wutaw\OneDrive\Desktop\T\installer_files\env\lib\site-packages\gradio\blocks.py", line 929, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "C:\Users\wutaw\OneDrive\Desktop\T\installer_files\env\lib\site-packages\anyio\to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "C:\Users\wutaw\OneDrive\Desktop\T\installer_files\env\lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "C:\Users\wutaw\OneDrive\Desktop\T\installer_files\env\lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "C:\Users\wutaw\OneDrive\Desktop\T\installer_files\env\lib\site-packages\gradio\utils.py", line 490, in async_iteration
    return next(iterator)
  File "C:\Users\wutaw\OneDrive\Desktop\T\text-generation-webui\modules\api.py", line 25, in generate_reply_wrapper
    params = json.loads(string)
  File "C:\Users\wutaw\OneDrive\Desktop\T\installer_files\env\lib\json\__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "C:\Users\wutaw\OneDrive\Desktop\T\installer_files\env\lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Users\wutaw\OneDrive\Desktop\T\installer_files\env\lib\json\decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

volvalder avatar Apr 12 '23 17:04 volvalder

I think the payload format just changed a little. Try updating the request to something like this:

    import json
    import requests

    def get_raw_reply(prompt, params):
        # The prompt and parameters are serialized together into a single JSON string
        payload = json.dumps([prompt, params])
        response = requests.post(
            f"http://{ai_server_ip}:{ai_server_port}/run/textgen",
            json={"data": [payload]},
        ).json()

        reply = ""
        if "data" in response:
            reply = response["data"][0]
        return reply

zencyon avatar Apr 12 '23 18:04 zencyon

Rolling back to yesterday's version and using SillyTavern 1.3.3 works.

Asais10 avatar Apr 12 '23 18:04 Asais10

The latest pull also gives me an error:

  File "/home//text-generation-webui/modules/api.py", line 28, in generate_reply_wrapper
    for i in generate_reply(params[0], generate_params):
  File "/home//text-generation-webui/modules/text_generation.py", line 175, in generate_reply
    input_ids = encode(question, add_bos_token=state['add_bos_token'], truncation_length=get_max_prompt_length(state))
KeyError: 'add_bos_token'

I guess these are the new params it needs:

    'add_bos_token': True, 'custom_stopping_strings': [], 'truncation_length': 2048, 'ban_eos_token': False,

zencyon avatar Apr 12 '23 20:04 zencyon
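Until the server supplies defaults for new parameters itself, a client-side workaround for breakages like the ones above is to merge a local set of defaults into every request, so newly required keys are always present. A minimal sketch (the parameter names and values are the ones mentioned in this thread and may change in later versions; `build_payload` is a hypothetical helper, not part of the webui):

```python
import json

# Defaults for the parameters mentioned in this thread; the values are the
# ones reported to work here and may differ in later versions.
DEFAULT_PARAMS = {
    "max_new_tokens": 200,
    "do_sample": True,
    "temperature": 0.72,
    "top_p": 0.73,
    "add_bos_token": True,
    "custom_stopping_strings": [],
    "truncation_length": 2048,
    "ban_eos_token": False,
}

def build_payload(prompt, overrides=None):
    """Merge caller overrides into the defaults and serialize the
    [prompt, params] pair the gradio endpoint expects."""
    params = {**DEFAULT_PARAMS, **(overrides or {})}
    return json.dumps([prompt, params])

payload = build_payload("Tell me a joke", {"temperature": 0.5})
```

Because unknown keys are filled from `DEFAULT_PARAMS`, a client built this way keeps working when the server starts requiring a parameter the client never set explicitly.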

Let me provide more context, as my project (SillyTavern) relies on the Textgen API. The core problem is that adding a new required parameter to the payload breaks backward compatibility, which is not good practice. Giving new parameters sensible defaults would keep existing clients working when something changes. Moving to a JSON request body was a huge leap forward compared to a positional array of parameters, but it didn't solve the compatibility problem on its own. Looking forward to some positive changes here going forward. Thanks a lot for all the work so far.

Cohee1207 avatar Apr 13 '23 18:04 Cohee1207

Just started using ooba today and couldn't figure out the API. After enabling --extensions api, the console shows me:

Loading the extension "api"... Ok. Starting KoboldAI compatible api at http://127.0.0.1:5000/api

Meanwhile the 'Use via API' link at the bottom of the page shows:

To expose an API endpoint of your app in this page, set the api_name parameter of the event listener.

^ I really don't know what that means. Where and what is the api_name / event listener?

Per api-example.py, I'm trying to call:

http://127.0.0.1:7860/run/textgen

with:

{
  "data": [
    "[\"Tell me a joke\", {\"max_new_tokens\": 200, \"do_sample\": true, \"temperature\": 0.72, \"top_p\": 0.73, \"typical_p\": 1, \"repetition_penalty\": 1.1, \"encoder_repetition_penalty\": 1.0, \"top_k\": 0, \"min_length\": 0, \"no_repeat_ngram_size\": 0, \"num_beams\": 1, \"penalty_alpha\": 0, \"length_penalty\": 1, \"early_stopping\": false, \"seed\": -1, \"add_bos_token\": true, \"custom_stopping_strings\": [], \"truncation_length\": 2048, \"ban_eos_token\": false}]"
  ]
}

response is:

{ "detail": "Not Found" }

the endpoint: http://127.0.0.1:7860/api/v1/generate

with the body: { "prompt": "tell me a joke" }

does actually get a response:

{ "results": [ { "text": "re's a joke for you: Why did the tomato turn red? Because it saw the salad dressing!" } ] }

...but as you can see, the first few characters appear cut off.

What am I missing or doing wrong? Hopefully it's not the model; the UI chat works just fine. Thanks!

varch1tect avatar Apr 14 '23 05:04 varch1tect

Running into the same issue as varch1tect over here as well.

MichaelSmithAI avatar Apr 14 '23 06:04 MichaelSmithAI

I think there is a lot of confusion between the built-in gradio api that listens on 7860 and the api extension that listen to 5000, they are mentioned alternatively at random, the examples use the built-in version, they have different endpoints and it's all very confusing

Reezlaw avatar Apr 16 '23 11:04 Reezlaw
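To untangle the two interfaces discussed above: the built-in gradio server (port 7860 by default) exposes /run/textgen and expects the [prompt, params] pair serialized into a single JSON string, while the api extension (port 5000 by default) exposes the KoboldAI-compatible /api/v1/generate, which takes a plain JSON body. A hedged sketch of the two request shapes (the helper names are mine, and the `max_length` field is an assumption based on the KoboldAI API; only `prompt` is shown working in this thread):

```python
import json

# Built-in gradio API (default port 7860): the body is {"data": [<string>]},
# where <string> is a JSON-encoded [prompt, params] pair.
def gradio_body(prompt, params):
    return {"data": [json.dumps([prompt, params])]}

# api extension (default port 5000): KoboldAI-compatible, plain JSON body.
# "max_length" is assumed from the KoboldAI API, not confirmed in this thread.
def kobold_body(prompt, max_length=200):
    return {"prompt": prompt, "max_length": max_length}

g = gradio_body("Tell me a joke", {"max_new_tokens": 200})
k = kobold_body("Tell me a joke")
```

With these, `requests.post("http://127.0.0.1:7860/run/textgen", json=g)` and `requests.post("http://127.0.0.1:5000/api/v1/generate", json=k)` would each target the matching interface, which may help avoid mixing the two.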

Here's what helped me: https://github.com/oobabooga/text-generation-webui/issues/1114#issuecomment-1506133353

I'm getting a similar problem using the built-in API and the API tester modal that opens after clicking "Use via API" in the webui footer. The API is enabled in the settings:

WebUI settings

Error message I'm getting from the server.py:

File "/home/administrator/anaconda3/envs/gptq/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/home/administrator/anaconda3/envs/gptq/lib/python3.9/site-packages/gradio/utils.py", line 491, in async_iteration
    return next(iterator)
  File "/home/administrator/git/text-generation-webui/modules/api.py", line 28, in generate_reply_wrapper
    params = json.loads(string)
  File "/home/administrator/anaconda3/envs/gptq/lib/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/home/administrator/anaconda3/envs/gptq/lib/python3.9/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/home/administrator/anaconda3/envs/gptq/lib/python3.9/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Gradio API error

The same happens if I send the request using requests from a Python script.

mikolodz avatar Apr 16 '23 13:04 mikolodz

Hey guys, I am getting the same error here:

https://user-images.githubusercontent.com/25864917/232461693-7d34d049-cf14-45d0-bf17-80acab5aa54e.mp4

Khyretos avatar Apr 17 '23 10:04 Khyretos

Never mind, I'm stupid. I just saw the API example and hadn't added the --listen --no-stream flags... my bad.

Khyretos avatar Apr 17 '23 11:04 Khyretos

I just started getting { error: 'This app has no endpoint /api/textgen/.' } yesterday if I plug into the 7*** port, and I'm flat-out unable to connect if I try 5000/api.

TheLustriVA avatar Apr 24 '23 11:04 TheLustriVA

/api/textgen has been deleted and merged into 5000/api. See: #990

Cohee1207 avatar Apr 24 '23 12:04 Cohee1207

Closing as this issue refers to an API that has been replaced

oobabooga avatar Apr 24 '23 17:04 oobabooga

I fail to understand how this issue can be considered resolved. My current ooba installation is unusable because SillyTavern, on all versions and updates, can no longer communicate with the API. The UI reports an unknown issue, and the CLI reports the mentioned AttributeError. How is this fixable?

For now I am FORCED to roll back to a previous build of the branch. Call me ignorant or stupid =) but I don't see how to resolve this seemingly trivial problem.

xpgx1 avatar Mar 22 '24 16:03 xpgx1