[Bug]: stream chat answer / httpx.RemoteProtocolError: peer closed connection without sending complete message body (incomplete chunked read)
Steps to reproduce
dstack run . -f examples/llms/llama3/ollama-8b.dstack.yml -b aws
I have not tested any other chat models.
After spinning up the server, I am able to test it with:
```python
from openai import OpenAI

client = OpenAI(base_url="https://gateway.xxx.com", api_key="xxx")

completion = client.chat.completions.create(
    model="llama3",
    messages=[
        {
            "role": "user",
            "content": "Compose a poem that explains the concept of recursion in programming.",
        }
    ],
    stream=True,
)

# print(completion)
for chunk in completion:
    print(chunk.choices[0].delta.content, end="")
print()
```
Actual behaviour
If stream is set to False, everything works as expected. If streaming is used, I get an error:
```
Traceback (most recent call last):
  File "/home/spirit/repos/dstack/dstack/test.py", line 18, in <module>
    for chunk in completion:
  File "/home/spirit/repos/dstack/dstack/venv/lib/python3.12/site-packages/openai/_streaming.py", line 46, in __iter__
    for item in self._iterator:
  File "/home/spirit/repos/dstack/dstack/venv/lib/python3.12/site-packages/openai/_streaming.py", line 58, in __stream__
    for sse in iterator:
  File "/home/spirit/repos/dstack/dstack/venv/lib/python3.12/site-packages/openai/_streaming.py", line 50, in _iter_events
    yield from self._decoder.iter_bytes(self.response.iter_bytes())
  File "/home/spirit/repos/dstack/dstack/venv/lib/python3.12/site-packages/openai/_streaming.py", line 280, in iter_bytes
    for chunk in self._iter_chunks(iterator):
  File "/home/spirit/repos/dstack/dstack/venv/lib/python3.12/site-packages/openai/_streaming.py", line 291, in _iter_chunks
    for chunk in iterator:
  File "/home/spirit/repos/dstack/dstack/venv/lib/python3.12/site-packages/httpx/_models.py", line 829, in iter_bytes
    for raw_bytes in self.iter_raw():
  File "/home/spirit/repos/dstack/dstack/venv/lib/python3.12/site-packages/httpx/_models.py", line 883, in iter_raw
    for raw_stream_bytes in self.stream:
  File "/home/spirit/repos/dstack/dstack/venv/lib/python3.12/site-packages/httpx/_client.py", line 126, in __iter__
    for chunk in self._stream:
  File "/home/spirit/repos/dstack/dstack/venv/lib/python3.12/site-packages/httpx/_transports/default.py", line 112, in __iter__
    with map_httpcore_exceptions():
  File "/usr/lib/python3.12/contextlib.py", line 158, in __exit__
    self.gen.throw(value)
  File "/home/spirit/repos/dstack/dstack/venv/lib/python3.12/site-packages/httpx/_transports/default.py", line 86, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.RemoteProtocolError: peer closed connection without sending complete message body (incomplete chunked read)
```
Expected behaviour
No response
dstack version
0.18.1
Server logs
No response
Additional information
So it seems there is a problem with chunked encoding. I have not checked whether the encoding can be adjusted in the Python client, but I was able to edit the gateway NGINX config to work around the problem for now.
If I set `chunked_transfer_encoding off;` in `nginx.conf` on my gateway server, the error is gone.
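A client-side mitigation, independent of any gateway config change, is to catch the mid-stream disconnect and re-issue the request. The sketch below is hypothetical (the helper name, the `retry_on` parameter, and the retry policy are my own, not part of dstack or the OpenAI SDK); with the OpenAI client you would pass `retry_on=httpx.RemoteProtocolError` and a callable that opens a fresh stream:

```python
def stream_with_retry(make_stream, retry_on, max_retries=2):
    """Consume a streaming response, restarting from scratch when the
    peer drops the connection mid-stream.

    make_stream: zero-argument callable that opens a fresh stream, e.g.
                 lambda: client.chat.completions.create(..., stream=True)
    retry_on:    exception type (or tuple of types) that triggers a retry,
                 e.g. httpx.RemoteProtocolError
    """
    last_exc = None
    for attempt in range(max_retries + 1):
        chunks = []
        try:
            for chunk in make_stream():
                chunks.append(chunk)
            return chunks  # stream completed cleanly
        except retry_on as exc:
            last_exc = exc  # stream broke mid-way; retry with a fresh request
    raise last_exc
```

Note that for simplicity this buffers all chunks before returning, and a retry discards the partial output of the failed attempt and replays the whole request, so the model may produce a different completion on the second try.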
~~Interesting, I tried to reproduce it on my side and couldn't.~~
~~I wonder if it can somehow be related to the client environment?~~ cc @tnissen375
See below.
Ah wait, actually I see the issue. I just needed to wait:
```
python src/main.py
In code, where logic reigns supreme,
A secret lies, both clever and extreme,
'Tis recursive thinking, a skill to refine,
Where functions call upon themselves in line.
Like Russian dolls, one inside the other deep,
Functions nest, each echoing the same sleep,
Each iteration builds upon the past,
As functions repeat, like echoes that last.
The base case breaks, the recursion unwind,
Back up the chain, till the roots entwine,
The original caller, now free to roam,
While recursive magic starts its mystic home.
Think of it as a never-ending staircase grand,
Where each step calls forth the next, hand in hand,
Until the base is reached, and the dance ceases,
As recursive code finds its peaceful release.
Yet, beware, for infinite loops can ensue,
And crashes lurk, if not carefully imbued,
With checks to terminate, before it's too late,
Lest recursive calls consume all available fate.
Thus recursion's power, in coding we adore,
A technique sublime, yet challenging to explore,
To harness its might, and craft elegant solutions rare,
Requires a grasp of this circular, recursive flair.

Traceback (most recent call last):
  File "/home/codespace/.local/lib/python3.10/site-packages/httpx/_transports/default.py", line 69, in map_httpcore_exceptions
    yield
  File "/home/codespace/.local/lib/python3.10/site-packages/httpx/_transports/default.py", line 113, in __iter__
    for part in self._httpcore_stream:
  File "/home/codespace/.local/lib/python3.10/site-packages/httpcore/_sync/connection_pool.py", line 367, in __iter__
    raise exc from None
  File "/home/codespace/.local/lib/python3.10/site-packages/httpcore/_sync/connection_pool.py", line 363, in __iter__
    for part in self._stream:
  File "/home/codespace/.local/lib/python3.10/site-packages/httpcore/_sync/http11.py", line 349, in __iter__
    raise exc
  File "/home/codespace/.local/lib/python3.10/site-packages/httpcore/_sync/http11.py", line 341, in __iter__
    for chunk in self._connection._receive_response_body(**kwargs):
  File "/home/codespace/.local/lib/python3.10/site-packages/httpcore/_sync/http11.py", line 210, in _receive_response_body
    event = self._receive_event(timeout=timeout)
  File "/home/codespace/.local/lib/python3.10/site-packages/httpcore/_sync/http11.py", line 220, in _receive_event
    with map_exceptions({h11.RemoteProtocolError: RemoteProtocolError}):
  File "/usr/local/python/3.10.13/lib/python3.10/contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/home/codespace/.local/lib/python3.10/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
    raise to_exc(exc) from exc
httpcore.RemoteProtocolError: peer closed connection without sending complete message body (incomplete chunked read)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/workspaces/dstack/src/main.py", line 16, in <module>
    for chunk in completion:
  File "/usr/local/python/3.10.13/lib/python3.10/site-packages/openai/_streaming.py", line 46, in __iter__
    for item in self._iterator:
  File "/usr/local/python/3.10.13/lib/python3.10/site-packages/openai/_streaming.py", line 58, in __stream__
    for sse in iterator:
  File "/usr/local/python/3.10.13/lib/python3.10/site-packages/openai/_streaming.py", line 50, in _iter_events
    yield from self._decoder.iter_bytes(self.response.iter_bytes())
  File "/usr/local/python/3.10.13/lib/python3.10/site-packages/openai/_streaming.py", line 280, in iter_bytes
    for chunk in self._iter_chunks(iterator):
  File "/usr/local/python/3.10.13/lib/python3.10/site-packages/openai/_streaming.py", line 291, in _iter_chunks
    for chunk in iterator:
  File "/home/codespace/.local/lib/python3.10/site-packages/httpx/_models.py", line 829, in iter_bytes
    for raw_bytes in self.iter_raw():
  File "/home/codespace/.local/lib/python3.10/site-packages/httpx/_models.py", line 883, in iter_raw
    for raw_stream_bytes in self.stream:
  File "/home/codespace/.local/lib/python3.10/site-packages/httpx/_client.py", line 126, in __iter__
    for chunk in self._stream:
  File "/home/codespace/.local/lib/python3.10/site-packages/httpx/_transports/default.py", line 112, in __iter__
    with map_httpcore_exceptions():
  File "/usr/local/python/3.10.13/lib/python3.10/contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/home/codespace/.local/lib/python3.10/site-packages/httpx/_transports/default.py", line 86, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.RemoteProtocolError: peer closed connection without sending complete message body (incomplete chunked read)
```
cc @jvstme
@tnissen375 BTW, could you please attach the updated NGINX config, if possible! 🙏
I confirm that if I add `chunked_transfer_encoding off;` to `/etc/nginx/sites-enabled/443-gateway.my-gateway-domain.dstack.ai.conf`, the problem goes away.
Here's the final `/etc/nginx/sites-enabled/443-gateway.my-gateway-domain.dstack.ai.conf` file:
```nginx
server {
    server_name gateway.my-gateway-domain.dstack.ai;

    location / {
        proxy_pass http://localhost:8000/api/openai/main/;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header Host $host;
        proxy_read_timeout 300s;
        chunked_transfer_encoding off;
    }

    listen 80;
    listen 443 ssl;
    ssl_certificate /etc/letsencrypt/live/gateway.my-gateway-domain.dstack.ai/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/gateway.my-gateway-domain.dstack.ai/privkey.pem;
    include /etc/letsencrypt/options-ssl-nginx.conf;
    ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem;

    set $force_https 1;
    if ($scheme = "https") {
        set $force_https 0;
    }
    if ($remote_addr = 127.0.0.1) {
        set $force_https 0;
    }
    if ($force_https) {
        return 301 https://$host$request_uri;
    }
}
```
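For reference, another common way to make SSE streaming survive an NGINX reverse proxy is to disable response buffering and force HTTP/1.1 to the upstream, rather than turning off chunked transfer encoding. This is an untested sketch of such an alternative `location` block (the directive choices are my assumption, not something verified against this gateway):

```nginx
location / {
    proxy_pass http://localhost:8000/api/openai/main/;
    proxy_http_version 1.1;        # default upstream protocol is HTTP/1.0
    proxy_buffering off;           # pass each SSE chunk through immediately
    proxy_set_header Connection "";
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_read_timeout 300s;
}
```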
This issue is stale because it has been open for 30 days with no activity.
This issue was closed because it has been inactive for 14 days since being marked as stale. Please reopen the issue if it is still relevant.