FastChat NETWORK ERROR DUE TO HIGH TRAFFIC. PLEASE REGENERATE OR REFRESH THIS PAGE.


Kubuntu 22.10, Nvidia driver Version: 530.30.02, CUDA Version: 12.1, RTX 3080 12G

When the Gradio web page is left open for a few seconds after a response, the bot stops responding until everything is restarted. It seems to be a worker bug. Here's the worker log.

full worker log:

python3 -m fastchat.serve.model_worker --model-path /home/kubuntu/Documenti/vicuna --load-8bit
2023-04-08 03:22:40 | INFO | model_worker | args: Namespace(host='localhost', port=21002, worker_address='http://localhost:21002', controller_address='http://localhost:21001', model_path='/home/kubuntu/Documenti/vicuna', model_name=None, device='cuda', num_gpus=1, load_8bit=True, limit_model_concurrency=5, stream_interval=2, no_register=False)
2023-04-08 03:22:40 | INFO | model_worker | Loading the model vicuna on worker 33aad6 ...
2023-04-08 03:22:40 | ERROR | stderr | /home/kubuntu/.local/lib/python3.10/site-packages/bitsandbytes/cuda_setup/paths.py:98: UserWarning: /home/kubuntu/anaconda3 did not contain libcudart.so as expected! Searching further paths...
2023-04-08 03:22:40 | ERROR | stderr |   warn(
2023-04-08 03:22:40 | ERROR | stderr | WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('local/kubuntu'), PosixPath('@/tmp/.ICE-unix/1524,unix/kubuntu')}
2023-04-08 03:22:40 | ERROR | stderr | WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/etc/xdg/xdg-plasma')}
2023-04-08 03:22:40 | ERROR | stderr | WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/org/freedesktop/DisplayManager/Session0')}
2023-04-08 03:22:40 | ERROR | stderr | WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/etc/gtk/gtkrc'), PosixPath('/home/kubuntu/.gtkrc')}
2023-04-08 03:22:40 | ERROR | stderr | WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('0'), PosixPath('1')}
2023-04-08 03:22:40 | ERROR | stderr | WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/etc/gtk-2.0/gtkrc')}
2023-04-08 03:22:40 | ERROR | stderr | WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/Sessions/1')}
2023-04-08 03:22:40 | ERROR | stderr | WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/org/freedesktop/DisplayManager/Seat0')}
2023-04-08 03:22:40 | ERROR | stderr | WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('//debuginfod.ubuntu.com'), PosixPath('https')}
2023-04-08 03:22:40 | INFO | stdout | CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64...
2023-04-08 03:22:40 | INFO | stdout | CUDA SETUP: CUDA path found: /usr/local/cuda/lib64/libcudart.so
2023-04-08 03:22:40 | INFO | stdout | CUDA_SETUP: Detected CUDA version 121
2023-04-08 03:22:40 | INFO | stdout | CUDA_SETUP: TODO: compile library for specific version: libbitsandbytes_cuda121.so
2023-04-08 03:22:40 | INFO | stdout | CUDA_SETUP: Defaulting to libbitsandbytes.so...
Loading checkpoint shards:   0%|                                                                                                      | 0/2 [00:00<?, ?it/s]
2023-04-08 03:22:44 | ERROR | stderr | /home/kubuntu/.local/lib/python3.10/site-packages/bitsandbytes/functional.py:227: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.  To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
2023-04-08 03:22:44 | ERROR | stderr |   return ct.c_void_p(A.data.storage().data_ptr())
Loading checkpoint shards:  50%|███████████████████████████████████████████████                                               | 1/2 [00:03<00:03,  3.79s/it]
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:05<00:00,  2.35s/it]
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:05<00:00,  2.56s/it]
2023-04-08 03:22:46 | ERROR | stderr | 
2023-04-08 03:22:47 | INFO | model_worker | Register to controller
2023-04-08 03:22:47 | ERROR | stderr | INFO:     Started server process [17350]
2023-04-08 03:22:47 | ERROR | stderr | INFO:     Waiting for application startup.
2023-04-08 03:22:47 | ERROR | stderr | INFO:     Application startup complete.
2023-04-08 03:22:47 | ERROR | stderr | INFO:     Uvicorn running on http://localhost:21002 (Press CTRL+C to quit)
2023-04-08 03:23:17 | INFO | model_worker | Send heart beat. Models: ['vicuna']. Semaphore: None. global_counter: 0
2023-04-08 03:23:47 | INFO | model_worker | Send heart beat. Models: ['vicuna']. Semaphore: None. global_counter: 0
2023-04-08 03:24:17 | INFO | model_worker | Send heart beat. Models: ['vicuna']. Semaphore: None. global_counter: 0
2023-04-08 03:24:47 | INFO | model_worker | Send heart beat. Models: ['vicuna']. Semaphore: None. global_counter: 0
2023-04-08 03:25:17 | INFO | model_worker | Send heart beat. Models: ['vicuna']. Semaphore: None. global_counter: 0
2023-04-08 03:25:36 | INFO | stdout | INFO:     127.0.0.1:33804 - "POST /worker_generate_stream HTTP/1.1" 200 OK
2023-04-08 03:25:47 | INFO | model_worker | Send heart beat. Models: ['vicuna']. Semaphore: Semaphore(value=5, locked=False). global_counter: 1
2023-04-08 03:25:47 | ERROR | stderr | Exception in thread Thread-2 (heart_beat_worker):
2023-04-08 03:25:47 | ERROR | stderr | Traceback (most recent call last):
2023-04-08 03:25:47 | ERROR | stderr |   File "/home/kubuntu/anaconda3/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
2023-04-08 03:25:47 | ERROR | stderr |     self.run()
2023-04-08 03:25:47 | ERROR | stderr |   File "/home/kubuntu/anaconda3/lib/python3.10/threading.py", line 953, in run
2023-04-08 03:25:47 | ERROR | stderr |     self._target(*self._args, **self._kwargs)
2023-04-08 03:25:47 | ERROR | stderr |   File "/home/kubuntu/.local/lib/python3.10/site-packages/fastchat/serve/model_worker.py", line 39, in heart_beat_worker
2023-04-08 03:25:47 | ERROR | stderr |     controller.send_heart_beat()
2023-04-08 03:25:47 | ERROR | stderr |   File "/home/kubuntu/.local/lib/python3.10/site-packages/fastchat/serve/model_worker.py", line 94, in send_heart_beat
2023-04-08 03:25:47 | ERROR | stderr |     "queue_length": self.get_queue_length()}, timeout=5)
2023-04-08 03:25:47 | ERROR | stderr |   File "/home/kubuntu/.local/lib/python3.10/site-packages/fastchat/serve/model_worker.py", line 108, in get_queue_length
2023-04-08 03:25:47 | ERROR | stderr |     return args.limit_model_concurrency - model_semaphore._value + len(
2023-04-08 03:25:47 | ERROR | stderr | TypeError: object of type 'NoneType' has no len()
2023-04-08 03:26:22 | INFO | stdout | INFO:     127.0.0.1:37374 - "POST /worker_generate_stream HTTP/1.1" 200 OK

thanks

Apr 08 '23 01:04 daddyparodz

Temporarily fixed by commenting out the two-line expression after return and returning 1 instead, in /home/kubuntu/.local/lib/python3.10/site-packages/fastchat/serve/model_worker.py

before:

    def get_queue_length(self):
        if model_semaphore is None:
            return 0
        else:
            return args.limit_model_concurrency - model_semaphore._value + len(
                 model_semaphore._waiters)

after:

    def get_queue_length(self):
        if model_semaphore is None:
            return 0
        else:
            # Workaround: return a constant instead of the original expression.
            # return args.limit_model_concurrency - model_semaphore._value + len(
            #     model_semaphore._waiters)
            return 1
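
The traceback in the log suggests that model_semaphore._waiters was None when len() was called on it, which can happen on Python builds where asyncio.Semaphore only creates its waiter queue lazily. A minimal sketch of a guard that keeps a real queue length instead of a constant, assuming that diagnosis is right: queue_length and its parameters are illustrative names rather than FastChat's API, and _value and _waiters are private attributes of asyncio.Semaphore that may change between Python versions.

```python
import asyncio
from typing import Optional


def queue_length(semaphore: Optional[asyncio.Semaphore], limit: int) -> int:
    """Running plus waiting requests, tolerating a missing waiter queue.

    `limit` stands in for args.limit_model_concurrency (hypothetical parameter).
    """
    if semaphore is None:
        # No request has been served yet, so nothing is queued.
        return 0
    # Private attribute: may be None until a task has actually had to wait.
    waiters = semaphore._waiters
    n_waiting = len(waiters) if waiters is not None else 0
    # (limit - _value) is the number of slots currently in use.
    return limit - semaphore._value + n_waiting
```

With this guard the heartbeat can keep reporting a queue length of 0 while nobody is waiting, rather than raising inside the heartbeat thread.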

Apr 08 '23 01:04 daddyparodz

This no longer works.

Apr 20 '23 05:04 linonetwo

However, if you delay the web UI startup by 10-20 s, the error may go away.
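
Instead of a fixed sleep, the delay could be expressed as a small polling loop that waits until the worker looks ready before the web UI is launched. This is only a sketch: wait_until_ready and the is_ready probe are hypothetical names, and what "ready" means (for example, the controller listing the model) is left to the caller.

```python
import time
from typing import Callable


def wait_until_ready(
    is_ready: Callable[[], bool],
    timeout_s: float = 20.0,
    interval_s: float = 1.0,
) -> bool:
    """Poll is_ready() until it returns True or timeout_s has elapsed."""
    deadline = time.monotonic() + timeout_s
    while True:
        if is_ready():
            return True
        if time.monotonic() >= deadline:
            # Give up; the caller decides whether to start the UI anyway.
            return False
        time.sleep(interval_s)
```

A caller would pass its own probe and start the web server only when the function returns True.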

Apr 20 '23 14:04 linonetwo

Are you serving your model to many users? Under heavy traffic, the semaphore lets only a limited number of requests run at once (limit_model_concurrency, which is 5 in the log above).

The rest of the users have to wait until previous requests are finished. If your GPU isn't very powerful, the waiting users will hit the timeout, which raises this error.

You can either raise the semaphore limit or set the timeout to a larger value here: https://github.com/lm-sys/FastChat/blob/main/fastchat/serve/gradio_web_server.py#L253
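
The interaction between the concurrency limit and the client timeout can be illustrated with a toy simulation. This is not FastChat's code: handle, client and run_simulation are made-up names, and asyncio.sleep stands in for model inference.

```python
import asyncio
from typing import List


async def handle(sem: asyncio.Semaphore, work_s: float) -> str:
    # Only `limit` requests may hold the semaphore at once; the rest wait here.
    async with sem:
        await asyncio.sleep(work_s)  # stand-in for generation time
        return "ok"


async def client(sem: asyncio.Semaphore, work_s: float, timeout_s: float) -> str:
    # The client gives up if waiting plus generation exceeds its timeout.
    try:
        return await asyncio.wait_for(handle(sem, work_s), timeout=timeout_s)
    except asyncio.TimeoutError:
        return "timeout"


async def _simulate(limit: int, n_clients: int, work_s: float, timeout_s: float) -> List[str]:
    sem = asyncio.Semaphore(limit)
    tasks = (client(sem, work_s, timeout_s) for _ in range(n_clients))
    return list(await asyncio.gather(*tasks))


def run_simulation(limit: int, n_clients: int, work_s: float, timeout_s: float) -> List[str]:
    return asyncio.run(_simulate(limit, n_clients, work_s, timeout_s))


if __name__ == "__main__":
    # Two slots, four clients: the last two spend most of their timeout queued.
    print(run_simulation(limit=2, n_clients=4, work_s=0.2, timeout_s=0.3))
    # A larger timeout lets the queued clients finish.
    print(run_simulation(limit=2, n_clients=4, work_s=0.2, timeout_s=1.0))
```

In the sketch, either a higher limit or a longer timeout_s lets the queued clients complete, which mirrors the two options suggested above.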

Please try and let us know if you have any problem.

Apr 21 '23 02:04 zhisbug

My friend is trying to access the chatbot website I deployed on our own server, but he gets a connection timeout error. I increased the timeout parameter to 50 and it did not help. My friend in the US can access it without issue. Does anyone have any idea what could cause this and how to debug it?

May 05 '23 04:05 zxzhijia

@zxzhijia your chatbot is behind GFW?

May 08 '23 07:05 zhisbug

@zxzhijia your chatbot is behind GFW?

Probably, that's true.

May 20 '23 22:05 zxzhijia

stale issue. Closing.

Jul 05 '23 20:07 zhisbug
