FastChat
deepseek-coder-33b-instruct model with openai got "InvalidChunkLength" error
I use FastChat to serve the deepseek-coder-33b-instruct model. Sending a streaming request returns an error; with stream=False the response prints fine, and other models work with streaming. Start commands:
python3 -m fastchat.serve.controller
python3 -m fastchat.serve.openai_api_server --host 0.0.0.0 --port 8000
python3 -m fastchat.serve.model_worker --model-path /home/ubuntu/models/deepseek-coder-33b-instruct --num-gpus=2 --gpus=0,1 --max-gpu-memory=46GB --model-names=deepseek-coder-33b-instruct
Python test code:
import openai

if __name__ == "__main__":
    openai.api_base = "http://model_addr:8000/v1"
    openai.api_key = "none"
    response = openai.ChatCompletion.create(
        model="deepseek-coder-33b-instruct",
        temperature=0.6,
        top_p=1,
        messages=[{"role": "user", "content": "Hello"}],
        stream=True,
    )
    print(response)
    for chunk in response:
        print(chunk, end="", flush=True)
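Since the connection sometimes dies mid-stream, one way I can at least keep the partial output is to wrap the iteration in a try/except. A minimal sketch of that pattern below, with a generator standing in for the real response (the real exception type is requests.exceptions.ChunkedEncodingError; RuntimeError is a stand-in here):

```python
def consume_stream(chunks):
    """Accumulate chunk text, preserving partial output if the stream dies."""
    collected = []
    try:
        for text in chunks:
            collected.append(text)
    except RuntimeError:
        pass  # connection broke mid-stream; keep what arrived so far
    return "".join(collected)


def broken_stream():
    # Simulates a server that sends two chunks, then drops the connection.
    yield "Hel"
    yield "lo"
    raise RuntimeError("Connection broken: InvalidChunkLength")


print(consume_stream(broken_stream()))  # → Hello
```

This only hides the symptom on the client side, of course; the stream still terminates early.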
Error log:
<generator object EngineAPIResource.create.<locals>.<genexpr> at 0x1030ae900>
Traceback (most recent call last):
  File "/Users/yangglei/miniconda3/envs/my-env/lib/python3.9/site-packages/urllib3/response.py", line 712, in _error_catcher
    yield
  File "/Users/yangglei/miniconda3/envs/my-env/lib/python3.9/site-packages/urllib3/response.py", line 1071, in read_chunked
    self._update_chunk_length()
  File "/Users/yangglei/miniconda3/envs/my-env/lib/python3.9/site-packages/urllib3/response.py", line 1006, in _update_chunk_length
    raise InvalidChunkLength(self, line) from None
urllib3.exceptions.InvalidChunkLength: InvalidChunkLength(got length b'', 0 bytes read)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/yangglei/miniconda3/envs/my-env/lib/python3.9/site-packages/requests/models.py", line 816, in generate
    yield from self.raw.stream(chunk_size, decode_content=True)
  File "/Users/yangglei/miniconda3/envs/my-env/lib/python3.9/site-packages/urllib3/response.py", line 931, in stream
    yield from self.read_chunked(amt, decode_content=decode_content)
  File "/Users/yangglei/miniconda3/envs/my-env/lib/python3.9/site-packages/urllib3/response.py", line 1100, in read_chunked
    self._original_response.close()
  File "/Users/yangglei/miniconda3/envs/my-env/lib/python3.9/contextlib.py", line 137, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/Users/yangglei/miniconda3/envs/my-env/lib/python3.9/site-packages/urllib3/response.py", line 729, in _error_catcher
    raise ProtocolError(f"Connection broken: {e!r}", e) from e
urllib3.exceptions.ProtocolError: ("Connection broken: InvalidChunkLength(got length b'', 0 bytes read)", InvalidChunkLength(got length b'', 0 bytes read))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/yangglei/work/projects/go/src/python/test-continue-2.py", line 34, in <module>
    for chunk in response:
  File "/Users/yangglei/miniconda3/envs/my-env/lib/python3.9/site-packages/openai/api_resources/abstract/engine_api_resource.py", line 166, in <genexpr>
    return (
  File "/Users/yangglei/miniconda3/envs/my-env/lib/python3.9/site-packages/openai/api_requestor.py", line 692, in <genexpr>
    return (
  File "/Users/yangglei/miniconda3/envs/my-env/lib/python3.9/site-packages/openai/api_requestor.py", line 115, in parse_stream
    for line in rbody:
  File "/Users/yangglei/miniconda3/envs/my-env/lib/python3.9/site-packages/requests/models.py", line 865, in iter_lines
    for chunk in self.iter_content(
  File "/Users/yangglei/miniconda3/envs/my-env/lib/python3.9/site-packages/requests/models.py", line 818, in generate
    raise ChunkedEncodingError(e)
requests.exceptions.ChunkedEncodingError: ("Connection broken: InvalidChunkLength(got length b'', 0 bytes read)", InvalidChunkLength(got length b'', 0 bytes read))
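For context on the error itself: in chunked transfer encoding, each chunk is prefixed by its length in hex, so when the server drops the connection mid-stream the client reads an empty length line and urllib3 raises InvalidChunkLength(got length b''). A minimal sketch of that parsing (not urllib3's actual implementation, just an illustration of the condition):

```python
def parse_chunked(raw: bytes) -> bytes:
    """Decode a chunked HTTP body; raise ValueError on a bad length line."""
    body = b""
    lines = raw.split(b"\r\n")
    i = 0
    while i < len(lines):
        # The length line may carry chunk extensions after a semicolon.
        length_line = lines[i].split(b";")[0].strip()
        if not length_line:
            # The condition behind InvalidChunkLength(got length b'', 0 bytes read)
            raise ValueError("invalid chunk length: got b''")
        size = int(length_line, 16)
        if size == 0:
            break  # terminating zero-length chunk: body is complete
        body += lines[i + 1][:size]
        i += 2
    return body


# A well-formed body decodes cleanly...
print(parse_chunked(b"5\r\nhello\r\n0\r\n\r\n"))  # → b'hello'
# ...but a body that ends without the terminating zero-length chunk
# (server closed the connection early) hits the empty-length-line error.
```

This suggests the FastChat API server closes the streaming connection before sending the final chunk, which matches stream=False working fine.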