
[Bug]: stream chat answer / httpx.RemoteProtocolError: peer closed connection without sending complete message body (incomplete chunked read)

Open tnissen375 opened this issue 1 year ago • 4 comments

Steps to reproduce

dstack run . -f examples/llms/llama3/ollama-8b.dstack.yml -b aws

I have not tested any other chat models. After spinning up the server, I am able to test it with:

from openai import OpenAI

client = OpenAI(base_url="https://gateway.xxx.com", api_key="xxx")

completion = client.chat.completions.create(
    model="llama3",
    messages=[
        {
            "role": "user",
            "content": "Compose a poem that explains the concept of recursion in programming.",
        }
    ],
    stream=True,
)

#print(completion)

# Iterating over the streamed chunks is where the error below is raised
for chunk in completion:
    print(chunk.choices[0].delta.content, end="")
print()

Actual behaviour

If stream is set to False, everything works as expected. If streaming is used, I get an error:

Traceback (most recent call last):
  File "/home/spirit/repos/dstack/dstack/test.py", line 18, in <module>
    for chunk in completion:
  File "/home/spirit/repos/dstack/dstack/venv/lib/python3.12/site-packages/openai/_streaming.py", line 46, in __iter__
    for item in self._iterator:
  File "/home/spirit/repos/dstack/dstack/venv/lib/python3.12/site-packages/openai/_streaming.py", line 58, in __stream__
    for sse in iterator:
  File "/home/spirit/repos/dstack/dstack/venv/lib/python3.12/site-packages/openai/_streaming.py", line 50, in _iter_events
    yield from self._decoder.iter_bytes(self.response.iter_bytes())
  File "/home/spirit/repos/dstack/dstack/venv/lib/python3.12/site-packages/openai/_streaming.py", line 280, in iter_bytes
    for chunk in self._iter_chunks(iterator):
  File "/home/spirit/repos/dstack/dstack/venv/lib/python3.12/site-packages/openai/_streaming.py", line 291, in _iter_chunks
    for chunk in iterator:
  File "/home/spirit/repos/dstack/dstack/venv/lib/python3.12/site-packages/httpx/_models.py", line 829, in iter_bytes
    for raw_bytes in self.iter_raw():
  File "/home/spirit/repos/dstack/dstack/venv/lib/python3.12/site-packages/httpx/_models.py", line 883, in iter_raw
    for raw_stream_bytes in self.stream:
  File "/home/spirit/repos/dstack/dstack/venv/lib/python3.12/site-packages/httpx/_client.py", line 126, in __iter__
    for chunk in self._stream:
  File "/home/spirit/repos/dstack/dstack/venv/lib/python3.12/site-packages/httpx/_transports/default.py", line 112, in __iter__
    with map_httpcore_exceptions():
  File "/usr/lib/python3.12/contextlib.py", line 158, in __exit__
    self.gen.throw(value)
  File "/home/spirit/repos/dstack/dstack/venv/lib/python3.12/site-packages/httpx/_transports/default.py", line 86, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.RemoteProtocolError: peer closed connection without sending complete message body (incomplete chunked read)
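
For reference, a minimal sketch of the non-streaming variant that works as expected (same client setup; the gateway URL and API key are the same placeholders as above):

from openai import OpenAI

client = OpenAI(base_url="https://gateway.xxx.com", api_key="xxx")

# Non-streaming request; per the report above, this completes without errors.
completion = client.chat.completions.create(
    model="llama3",
    messages=[
        {
            "role": "user",
            "content": "Compose a poem that explains the concept of recursion in programming.",
        }
    ],
    stream=False,
)

print(completion.choices[0].message.content)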

Expected behaviour

No response

dstack version

0.18.1

Server logs

No response

Additional information

So it seems there is a problem with chunked encoding. I have not checked whether the encoding can be adjusted in the Python client, but I was able to edit the gateway NGINX config to work around the problem for now.

If I set chunked_transfer_encoding off; in the nginx.conf on my gateway server, the error is gone.

tnissen375 avatar May 12 '24 12:05 tnissen375

~~Interesting, I tried to reproduce it on my side and couldn't.~~

~~I wonder if it can be somehow related to the client environment?~~ cc @tnissen375

See below.

peterschmidt85 avatar May 13 '24 10:05 peterschmidt85

Ah wait, actually I see the issue. I just needed to wait:

python src/main.py
In code, where logic reigns supreme,
A secret lies, both clever and extreme,
'Tis recursive thinking, a skill to refine,
Where functions call upon themselves in line.

Like Russian dolls, one inside the other deep,
Functions nest, each echoing the same sleep,
Each iteration builds upon the past,
As functions repeat, like echoes that last.

The base case breaks, the recursion unwind,
Back up the chain, till the roots entwine,
The original caller, now free to roam,
While recursive magic starts its mystic home.

Think of it as a never-ending staircase grand,
Where each step calls forth the next, hand in hand,
Until the base is reached, and the dance ceases,
As recursive code finds its peaceful release.

Yet, beware, for infinite loops can ensue,
And crashes lurk, if not carefully imbued,
With checks to terminate, before it's too late,
Lest recursive calls consume all available fate.

Thus recursion's power, in coding we adore,
A technique sublime, yet challenging to explore,
To harness its might, and craft elegant solutions rare,
Requires a grasp of this circular, recursive flair.

Traceback (most recent call last):
  File "/home/codespace/.local/lib/python3.10/site-packages/httpx/_transports/default.py", line 69, in map_httpcore_exceptions
    yield
  File "/home/codespace/.local/lib/python3.10/site-packages/httpx/_transports/default.py", line 113, in __iter__
    for part in self._httpcore_stream:
  File "/home/codespace/.local/lib/python3.10/site-packages/httpcore/_sync/connection_pool.py", line 367, in __iter__
    raise exc from None
  File "/home/codespace/.local/lib/python3.10/site-packages/httpcore/_sync/connection_pool.py", line 363, in __iter__
    for part in self._stream:
  File "/home/codespace/.local/lib/python3.10/site-packages/httpcore/_sync/http11.py", line 349, in __iter__
    raise exc
  File "/home/codespace/.local/lib/python3.10/site-packages/httpcore/_sync/http11.py", line 341, in __iter__
    for chunk in self._connection._receive_response_body(**kwargs):
  File "/home/codespace/.local/lib/python3.10/site-packages/httpcore/_sync/http11.py", line 210, in _receive_response_body
    event = self._receive_event(timeout=timeout)
  File "/home/codespace/.local/lib/python3.10/site-packages/httpcore/_sync/http11.py", line 220, in _receive_event
    with map_exceptions({h11.RemoteProtocolError: RemoteProtocolError}):
  File "/usr/local/python/3.10.13/lib/python3.10/contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/home/codespace/.local/lib/python3.10/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
    raise to_exc(exc) from exc
httpcore.RemoteProtocolError: peer closed connection without sending complete message body (incomplete chunked read)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/workspaces/dstack/src/main.py", line 16, in <module>
    for chunk in completion:
  File "/usr/local/python/3.10.13/lib/python3.10/site-packages/openai/_streaming.py", line 46, in __iter__
    for item in self._iterator:
  File "/usr/local/python/3.10.13/lib/python3.10/site-packages/openai/_streaming.py", line 58, in __stream__
    for sse in iterator:
  File "/usr/local/python/3.10.13/lib/python3.10/site-packages/openai/_streaming.py", line 50, in _iter_events
    yield from self._decoder.iter_bytes(self.response.iter_bytes())
  File "/usr/local/python/3.10.13/lib/python3.10/site-packages/openai/_streaming.py", line 280, in iter_bytes
    for chunk in self._iter_chunks(iterator):
  File "/usr/local/python/3.10.13/lib/python3.10/site-packages/openai/_streaming.py", line 291, in _iter_chunks
    for chunk in iterator:
  File "/home/codespace/.local/lib/python3.10/site-packages/httpx/_models.py", line 829, in iter_bytes
    for raw_bytes in self.iter_raw():
  File "/home/codespace/.local/lib/python3.10/site-packages/httpx/_models.py", line 883, in iter_raw
    for raw_stream_bytes in self.stream:
  File "/home/codespace/.local/lib/python3.10/site-packages/httpx/_client.py", line 126, in __iter__
    for chunk in self._stream:
  File "/home/codespace/.local/lib/python3.10/site-packages/httpx/_transports/default.py", line 112, in __iter__
    with map_httpcore_exceptions():
  File "/usr/local/python/3.10.13/lib/python3.10/contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/home/codespace/.local/lib/python3.10/site-packages/httpx/_transports/default.py", line 86, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.RemoteProtocolError: peer closed connection without sending complete message body (incomplete chunked read)

cc @jvstme

peterschmidt85 avatar May 13 '24 10:05 peterschmidt85

@tnissen375 BTW, could you please attach the updated NGINX config, if possible? 🙏

peterschmidt85 avatar May 13 '24 10:05 peterschmidt85

I confirm that if I add chunked_transfer_encoding off; to /etc/nginx/sites-enabled/443-gateway.my-gateway-domain.dstack.ai.conf, the problem goes away.

Here's the final /etc/nginx/sites-enabled/443-gateway.my-gateway-domain.dstack.ai.conf file:

server {
    server_name gateway.my-gateway-domain.dstack.ai;
    location / {
        proxy_pass http://localhost:8000/api/openai/main/;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header Host $host;
        proxy_read_timeout 300s;
        chunked_transfer_encoding off;
    }
    listen 80;
    listen 443 ssl;
    ssl_certificate /etc/letsencrypt/live/gateway.my-gateway-domain.dstack.ai/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/gateway.my-gateway-domain.dstack.ai/privkey.pem;
    include /etc/letsencrypt/options-ssl-nginx.conf;
    ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem;
    set $force_https 1;
    if ($scheme = "https") {
        set $force_https 0;
    }
    if ($remote_addr = 127.0.0.1) {
        set $force_https 0;
    }
    if ($force_https) {
        return 301 https://$host$request_uri;
    }
}
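
For anyone hitting this, a quick way to inspect what the gateway actually sends is to stream the endpoint directly with httpx. This is only a sketch: the request path, domain, and token below are assumptions based on the client snippet and the proxy_pass target above, not taken from the issue.

import httpx

# Assumed OpenAI-compatible chat endpoint exposed by the gateway
# (domain, path, and token are placeholders).
url = "https://gateway.my-gateway-domain.dstack.ai/chat/completions"
payload = {
    "model": "llama3",
    "messages": [{"role": "user", "content": "Say hi"}],
    "stream": True,
}

with httpx.Client(timeout=300.0) as client:
    with client.stream(
        "POST", url, json=payload, headers={"Authorization": "Bearer xxx"}
    ) as resp:
        # Shows whether the proxy is still using chunked transfer encoding
        print(resp.status_code, resp.headers.get("transfer-encoding"))
        # Print the raw SSE lines as they arrive
        for line in resp.iter_lines():
            if line:
                print(line)

With chunked_transfer_encoding off; applied, the Transfer-Encoding: chunked header should no longer appear on the proxied response.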

peterschmidt85 avatar May 13 '24 10:05 peterschmidt85

This issue is stale because it has been open for 30 days with no activity.

peterschmidt85 avatar Jun 14 '24 01:06 peterschmidt85

This issue was closed because it has been inactive for 14 days since being marked as stale. Please reopen the issue if it is still relevant.

peterschmidt85 avatar Jun 28 '24 01:06 peterschmidt85