text-generation-webui
API crashes mid-response
Describe the bug
I am using ggml-vicuna-7b. This works fine in the webui. When I try to chat with it via the API, it crashes on the second response. There is no error message in the oobabooga terminal; it just says 'Press any key to continue...'
I added an additional function to the script to close the session when done.
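A minimal sketch of what such a session-closing helper might look like (the endpoint and payload shape are taken from the traceback below; the `session_factory` parameter is a hypothetical hook added here only so the helper can be exercised without a live server):

```python
import requests

def generate_text(server, payload, session_factory=requests.Session):
    # Open a fresh session per request and close it when done, so a
    # connection the server has already reset is never reused.
    with session_factory() as session:
        resp = session.post(f"http://{server}:7860/run/textgen",
                            json=payload, timeout=300)
        resp.raise_for_status()
        return resp.json()
```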
Is there an existing issue for this?
- [X] I have searched the existing issues
Reproduction
Install using the Windows auto-installer.
Run install.bat. Select B) None (I want to run in CPU mode).
Choose any model.
Modify start-webui.bat to change the server.py command as follows: python server.py --auto-devices --listen --no-stream
Run api-example.py
The first request will run and return. Sometimes even the second request will run, but the socket gets closed and oobabooga crashes after a few tries.
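Since the second or third request is what dies with a connection reset, a retry wrapper around the failing `requests.post` call (the URL comes from the traceback below; the `post` parameter is a hypothetical injection point for testing) can at least keep the client alive while the underlying bug is investigated:

```python
import time
import requests

def post_with_retry(url, json_payload, retries=3, delay=2.0,
                    post=requests.post):
    """POST to the textgen endpoint, retrying when the server drops the
    connection (the WinError 10054 symptom described in this issue)."""
    last_err = None
    for _ in range(retries):
        try:
            return post(url, json=json_payload, timeout=300)
        except requests.exceptions.ConnectionError as err:
            last_err = err
            time.sleep(delay)  # give the backend time to recover
    raise last_err

# Hypothetical usage against a local server started with --listen:
# resp = post_with_retry("http://127.0.0.1:7860/run/textgen",
#                        {"data": ["Hello"]})
```

This does not fix the server-side crash; it only stops one reset connection from killing the calling script.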
Logs
Traceback (most recent call last):
File "C:\Users\josmith\Anaconda3\lib\site-packages\urllib3\connectionpool.py", line 703, in urlopen
httplib_response = self._make_request(
File "C:\Users\josmith\Anaconda3\lib\site-packages\urllib3\connectionpool.py", line 449, in _make_request
six.raise_from(e, None)
File "<string>", line 3, in raise_from
File "C:\Users\josmith\Anaconda3\lib\site-packages\urllib3\connectionpool.py", line 444, in _make_request
httplib_response = conn.getresponse()
File "C:\Users\josmith\Anaconda3\lib\http\client.py", line 1377, in getresponse
response.begin()
File "C:\Users\josmith\Anaconda3\lib\http\client.py", line 320, in begin
version, status, reason = self._read_status()
File "C:\Users\josmith\Anaconda3\lib\http\client.py", line 281, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
File "C:\Users\josmith\Anaconda3\lib\socket.py", line 704, in readinto
return self._sock.recv_into(b)
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\josmith\Anaconda3\lib\site-packages\requests\adapters.py", line 439, in send
resp = conn.urlopen(
File "C:\Users\josmith\Anaconda3\lib\site-packages\urllib3\connectionpool.py", line 787, in urlopen
retries = retries.increment(
File "C:\Users\josmith\Anaconda3\lib\site-packages\urllib3\util\retry.py", line 550, in increment
raise six.reraise(type(error), error, _stacktrace)
File "C:\Users\josmith\Anaconda3\lib\site-packages\urllib3\packages\six.py", line 769, in reraise
raise value.with_traceback(tb)
File "C:\Users\josmith\Anaconda3\lib\site-packages\urllib3\connectionpool.py", line 703, in urlopen
httplib_response = self._make_request(
File "C:\Users\josmith\Anaconda3\lib\site-packages\urllib3\connectionpool.py", line 449, in _make_request
six.raise_from(e, None)
File "<string>", line 3, in raise_from
File "C:\Users\josmith\Anaconda3\lib\site-packages\urllib3\connectionpool.py", line 444, in _make_request
httplib_response = conn.getresponse()
File "C:\Users\josmith\Anaconda3\lib\http\client.py", line 1377, in getresponse
response.begin()
File "C:\Users\josmith\Anaconda3\lib\http\client.py", line 320, in begin
version, status, reason = self._read_status()
File "C:\Users\josmith\Anaconda3\lib\http\client.py", line 281, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
File "C:\Users\josmith\Anaconda3\lib\socket.py", line 704, in readinto
return self._sock.recv_into(b)
urllib3.exceptions.ProtocolError: ('Connection aborted.', ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "c:\Models\BabyBoogaAGI\main2.py", line 176, in <module>
new_tasks = task_creation_agent(
File "c:\Models\BabyBoogaAGI\main2.py", line 88, in task_creation_agent
new_tasks = generate_text(prompt, PARAMS).strip().split("\n")
File "c:\Models\BabyBoogaAGI\oobabooga_api.py", line 7, in generate_text
response = requests.post(f"http://{server}:7860/run/textgen", json={
File "C:\Users\josmith\Anaconda3\lib\site-packages\requests\api.py", line 119, in post
return request('post', url, data=data, json=json, **kwargs)
File "C:\Users\josmith\Anaconda3\lib\site-packages\requests\api.py", line 61, in request
return session.request(method=method, url=url, **kwargs)
File "C:\Users\josmith\Anaconda3\lib\site-packages\requests\sessions.py", line 542, in request
resp = self.send(prep, **send_kwargs)
File "C:\Users\josmith\Anaconda3\lib\site-packages\requests\sessions.py", line 655, in send
r = adapter.send(request, **kwargs)
File "C:\Users\josmith\Anaconda3\lib\site-packages\requests\adapters.py", line 498, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None))
This is an additional error message that came up once:
Loading facebook_galactica-1.3b...
Warning: torch.cuda.is_available() returned False.
This means that no GPU has been detected.
Falling back to CPU mode.
Loaded the model in 10.67 seconds.
C:\Models\oobabooga-windows\installer_files\env\lib\site-packages\gradio\deprecation.py:40: UserWarning: The 'type' parameter has been deprecated. Use the Number component instead.
warnings.warn(value)
Running on local URL: http://0.0.0.0:7860
To create a public link, set `share=True` in `launch()`.
C:\Models\oobabooga-windows\installer_files\env\lib\site-packages\gradio\utils.py:900: UserWarning: Starting a Matplotlib GUI outside of the main thread will likely fail.
fig = plt.figure(figsize=(0.01, 0.01))
Exception ignored in: <function Image.__del__ at 0x0000021B20300A60>
Traceback (most recent call last):
File "C:\Models\oobabooga-windows\installer_files\env\lib\tkinter\__init__.py", line 4056, in __del__
self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
Exception ignored in: <function Variable.__del__ at 0x0000021B202C52D0>
Traceback (most recent call last):
File "C:\Models\oobabooga-windows\installer_files\env\lib\tkinter\__init__.py", line 388, in __del__
if self._tk.getboolean(self._tk.call("info", "exists", self._name)):
RuntimeError: main thread is not in main loop
Exception ignored in: <function Variable.__del__ at 0x0000021B202C52D0>
Traceback (most recent call last):
File "C:\Models\oobabooga-windows\installer_files\env\lib\tkinter\__init__.py", line 388, in __del__
if self._tk.getboolean(self._tk.call("info", "exists", self._name)):
RuntimeError: main thread is not in main loop
Exception ignored in: <function Variable.__del__ at 0x0000021B202C52D0>
Traceback (most recent call last):
File "C:\Models\oobabooga-windows\installer_files\env\lib\tkinter\__init__.py", line 388, in __del__
if self._tk.getboolean(self._tk.call("info", "exists", self._name)):
RuntimeError: main thread is not in main loop
Exception ignored in: <function Variable.__del__ at 0x0000021B202C52D0>
Traceback (most recent call last):
File "C:\Models\oobabooga-windows\installer_files\env\lib\tkinter\__init__.py", line 388, in __del__
if self._tk.getboolean(self._tk.call("info", "exists", self._name)):
RuntimeError: main thread is not in main loop
Exception ignored in: <function Image.__del__ at 0x0000021B20300A60>
Traceback (most recent call last):
File "C:\Models\oobabooga-windows\installer_files\env\lib\tkinter\__init__.py", line 4056, in __del__
self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
Exception ignored in: <function Variable.__del__ at 0x0000021B202C52D0>
Traceback (most recent call last):
File "C:\Models\oobabooga-windows\installer_files\env\lib\tkinter\__init__.py", line 388, in __del__
if self._tk.getboolean(self._tk.call("info", "exists", self._name)):
RuntimeError: main thread is not in main loop
Exception ignored in: <function Variable.__del__ at 0x0000021B202C52D0>
Traceback (most recent call last):
File "C:\Models\oobabooga-windows\installer_files\env\lib\tkinter\__init__.py", line 388, in __del__
if self._tk.getboolean(self._tk.call("info", "exists", self._name)):
RuntimeError: main thread is not in main loop
Exception ignored in: <function Variable.__del__ at 0x0000021B202C52D0>
Traceback (most recent call last):
File "C:\Models\oobabooga-windows\installer_files\env\lib\tkinter\__init__.py", line 388, in __del__
if self._tk.getboolean(self._tk.call("info", "exists", self._name)):
RuntimeError: main thread is not in main loop
Exception ignored in: <function Variable.__del__ at 0x0000021B202C52D0>
Traceback (most recent call last):
File "C:\Models\oobabooga-windows\installer_files\env\lib\tkinter\__init__.py", line 388, in __del__
if self._tk.getboolean(self._tk.call("info", "exists", self._name)):
RuntimeError: main thread is not in main loop
Tcl_AsyncDelete: async handler deleted by the wrong thread
System Info
I'm using the non-GPU method on Windows. Processor is an i7-1185G7, with 16 GB of RAM.
Almost exactly the same. I am using ggml-vicuna-13b-4bit.bin and
python server.py --auto-devices --listen --no-stream --verbose
Also under Windows, but using a 3080. Similarly, the first request will run and return. It will accept the second request, but the socket gets closed, the server returns to the command line without comment, and the API client exits with ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host, just like DataBassGit.
I have the exact same problem :(
Saw the update to the API this morning. No fix.
I reran install.bat, which, from what I understand, should update to the most recent version.
The API should be a high priority for getting this project off the ground, tbh - if we can use our models with other programs via the API, a whole new world of possibilities opens up!
Same here on GPU with alpaca 13B on Linux. A workaround is to watch the GPU utilization until it drops to zero before clicking Generate or Regenerate again. Also, don't use the Stop function.
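That manual "wait until utilization drops" workaround can be automated by polling nvidia-smi before each request. A sketch under the assumption that nvidia-smi is on PATH (the function names here are illustrative, not part of any project API):

```python
import subprocess
import time

def gpu_is_idle(smi_output, threshold=5):
    """Parse the output of
    `nvidia-smi --query-gpu=utilization.gpu --format=csv,noheader,nounits`
    (one integer per GPU, one per line) and report whether every GPU is
    below `threshold` percent utilization."""
    values = [int(line) for line in smi_output.split() if line.strip()]
    return bool(values) and all(v < threshold for v in values)

def wait_for_gpu_idle(threshold=5, poll=1.0, timeout=120.0):
    """Block until the GPU is idle, mirroring the manual workaround of
    watching utilization before clicking Generate again.
    Returns False if the GPU never went idle within `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=utilization.gpu",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True,
        ).stdout
        if gpu_is_idle(out, threshold):
            return True
        time.sleep(poll)
    return False
```

Calling `wait_for_gpu_idle()` before each API request approximates the manual workaround, though it obviously does nothing for the CPU-only case in the original report.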
I think this issue has been lost in the weeds.