
run-example.sh fails with urllib3.exceptions.ProtocolError: Response ended prematurely

Open · awan-10 opened this issue 2 months ago · 8 comments

When I modified "run_example.sh" and changed the backend to vllm, I got the error message below. I will do some checking to see whether the error comes from the server side or the client side.
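
For that check, here is roughly what I plan to try: replay a single streaming request by hand and see whether the chunked response gets cut off mid-stream. This is only a sketch; the port, endpoint, and request fields below are my assumptions and should be adjusted to whatever run_vllm.sh actually launches:

```python
# Sketch only: assumes the vLLM api_server started by run_vllm.sh listens on
# localhost:26500 and exposes /generate with "prompt"/"stream"/"max_tokens"
# fields -- adjust the URL and payload to the actual server configuration.
import requests

url = "http://localhost:26500/generate"
payload = {"prompt": "Hello, my name is", "max_tokens": 128, "stream": True}

try:
    with requests.post(url, json=payload, stream=True, timeout=300) as resp:
        resp.raise_for_status()
        for line in resp.iter_lines():
            if line:
                print(line.decode("utf-8", errors="replace"))
    print("Stream finished cleanly -> problem is more likely load/client side.")
except requests.exceptions.ChunkedEncodingError as err:
    # requests wraps urllib3's ProtocolError in ChunkedEncodingError; hitting it
    # on a single request suggests the server dropped the connection
    # (e.g. the vLLM process crashed or was killed), i.e. a server-side failure.
    print(f"Stream ended prematurely: {err}")
```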

I notice this benchmark has three modes: "mii", "vllm", and "aml". The mii and vllm modes target serving frameworks, while the aml mode benchmarks an API server on Azure. Is it possible to run this script against a local API server? I am thinking of starting vLLM serving with a separate command and using this benchmark to test the API server that vLLM brings up. That way I would have better control over how the vLLM server is started, and I could see all of the error messages from the vLLM server if it fails.
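
To make the question more concrete, this is roughly what I have in mind, just a sketch on my side; the module path, model name, and port are my assumptions and may not match what run_vllm.sh does:

```python
# Sketch only: start the vLLM demo API server as a separate process so that
# all of its stdout/stderr (including crash messages) stays visible in this
# terminal, then run the benchmark client against the same host/port.
# The "--model" and "--port" values are placeholders, not taken from run_vllm.sh.
import subprocess

server = subprocess.Popen(
    [
        "python", "-m", "vllm.entrypoints.api_server",
        "--model", "meta-llama/Llama-2-7b-hf",  # placeholder model
        "--port", "26500",                      # must match the benchmark client config
    ]
)

try:
    # The benchmark (or the manual streaming check above) would be run from
    # another shell while this process keeps printing vLLM's own logs.
    server.wait()
finally:
    server.terminate()
```

Running it this way would also make it obvious whether the ProtocolError coincides with the vLLM server process dying.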

```
(vllm) [gma@spr02 mii]$ bash ./run_vllm.sh
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Token indices sequence length is longer than the specified maximum sequence length for this model (5883 > 4096). Running this sequence through the model will result in indexing errors
warmup queue size: 37 (1070543)
Process Process-1:
Traceback (most recent call last):
  File "/home/gma/anaconda3/envs/vllm/lib/python3.11/site-packages/requests/models.py", line 816, in generate
    yield from self.raw.stream(chunk_size, decode_content=True)
  File "/home/gma/anaconda3/envs/vllm/lib/python3.11/site-packages/urllib3/response.py", line 1040, in stream
    yield from self.read_chunked(amt, decode_content=decode_content)
  File "/home/gma/anaconda3/envs/vllm/lib/python3.11/site-packages/urllib3/response.py", line 1184, in read_chunked
    self._update_chunk_length()
  File "/home/gma/anaconda3/envs/vllm/lib/python3.11/site-packages/urllib3/response.py", line 1119, in _update_chunk_length
    raise ProtocolError("Response ended prematurely") from None
urllib3.exceptions.ProtocolError: Response ended prematurely
```

awan-10 · Apr 29 '24 16:04