httpcore

Stream at chunk boundaries with HTTP/1.1 chunked transfer encoding

Open tomchristie opened this issue 2 years ago • 2 comments

Closes https://github.com/encode/httpx/issues/1279

  • [x] Failing test cases.
  • [ ] Improved behaviour.

tomchristie, Oct 13 '22 15:10

Hrrrmmm...

Interesting. It's possible that this has exposed a bug in h11's chunk_start and chunk_end handling.

This fix works in the case shown in https://github.com/encode/httpx/issues/1279, but it fails with this test case. I'd normally start by assuming that meant the test case was faulty, but taking a look at the events generated is... surprising.

tests/_sync/test_http11.py:233: AssertionError
------------------------------------------------------------------------ Captured stdout call ------------------------------------------------------------------------
Data(data=bytearray(b'Hello, '), chunk_start=False, chunk_end=True)
Data(data=bytearray(b'wor'), chunk_start=False, chunk_end=False)
Data(data=bytearray(b'ld!'), chunk_start=False, chunk_end=True)
EndOfMessage(headers=<Headers([])>)

We can see h11 returning the chunks without having set chunk_start=True.

Curious.

(Link to relevant part of h11 parsing code... https://github.com/python-hyper/h11/blob/a7bdffcb7c6f869390dc1a361202417e9b7ecc6d/h11/_readers.py)

tomchristie, Oct 13 '22 15:10

Yup.

Here's h11 handling the start/end correctly...

>>> from h11._receivebuffer import ReceiveBuffer
>>> from h11._readers import ChunkedReader
>>>
>>> b = ReceiveBuffer()
>>> c = ChunkedReader()
>>>
>>> b += b"7\r\nHello, \r\n6\r\nworld!\r\n0\r\n"
>>> c(b)
Data(data=bytearray(b'Hello, '), chunk_start=True, chunk_end=True)
>>> c(b)
Data(data=bytearray(b'world!'), chunk_start=True, chunk_end=True)
>>> c(b)
>>>

Now let's break it...

>>> from h11._receivebuffer import ReceiveBuffer
>>> from h11._readers import ChunkedReader
>>>
>>> b = ReceiveBuffer()
>>> c = ChunkedReader()
>>>
>>> b += b"7\r\n"
>>> c(b)
>>> b += b"Hello, \r\n"
>>> c(b)
Data(data=bytearray(b'Hello, '), chunk_start=False, chunk_end=True)
>>> b += b"6\r\n"
>>> c(b)
>>> b += b"wor"
>>> c(b)
Data(data=bytearray(b'wor'), chunk_start=False, chunk_end=False)
>>> b += b"ld!\r\n"
>>> c(b)
Data(data=bytearray(b'ld!'), chunk_start=False, chunk_end=True)
>>> b += b"0\r\n"
>>> c(b)
>>>
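
Comparing the two sessions, chunk_start=True only shows up when the chunk-size line and the chunk data are already in the buffer during the same call. One way to avoid that is to keep an explicit "first data of this chunk not yet emitted" flag that survives across calls. The following is a standalone toy reader written for illustration, not h11's code and not a patch to it: `SketchChunkedReader`, `feed`, `next_event` and the `Data` stand-in are all hypothetical names, and it skips validation, chunk extensions and trailers.

```python
from typing import NamedTuple, Optional


class Data(NamedTuple):
    # Stand-in with the same fields as the Data events shown above.
    data: bytes
    chunk_start: bool
    chunk_end: bool


class SketchChunkedReader:
    """Toy incremental chunked-body reader (illustration only)."""

    def __init__(self) -> None:
        self._buf = bytearray()
        self._remaining = 0           # data bytes left in the current chunk
        self._discard = 0             # bytes of trailing CRLF still to skip
        self._at_chunk_start = False  # set on size line, cleared on first data
        self._done = False

    def feed(self, data: bytes) -> None:
        self._buf += data

    def next_event(self) -> Optional[Data]:
        if self._done:
            return None
        # Skip the CRLF that follows the previous chunk's data.
        if self._discard:
            n = min(self._discard, len(self._buf))
            del self._buf[:n]
            self._discard -= n
            if self._discard:
                return None
        if self._remaining == 0:
            end = self._buf.find(b"\r\n")
            if end == -1:
                return None  # size line not complete yet
            size = int(bytes(self._buf[:end]), 16)
            del self._buf[: end + 2]
            if size == 0:
                self._done = True  # last chunk; trailers are ignored here
                return None
            self._remaining = size
            # Remember the chunk boundary even if no data has arrived yet.
            self._at_chunk_start = True
        if not self._buf:
            return None
        take = min(self._remaining, len(self._buf))
        payload = bytes(self._buf[:take])
        del self._buf[:take]
        self._remaining -= take
        chunk_start = self._at_chunk_start
        self._at_chunk_start = False
        chunk_end = self._remaining == 0
        if chunk_end:
            self._discard = 2
        return Data(payload, chunk_start, chunk_end)
```

Fed the same fragments as the broken session, this sketch reports chunk_start=True on b'Hello, ' and on b'wor', and chunk_start=False only on b'ld!'.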

tomchristie, Oct 13 '22 15:10

Closing as per https://github.com/encode/httpx/issues/1279

If anyone genuinely has a need for this, then here are the tools and info to get things started.
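
As one possible starting point, a consumer could ignore chunk_start entirely and regroup events using chunk_end alone, since chunk_end is set as expected in both sessions above. This is a minimal sketch under that assumption: `iter_whole_chunks` is a hypothetical helper, not part of httpcore or h11, and `Data` is a stand-in NamedTuple with the same fields as the events printed earlier.

```python
from typing import Iterable, Iterator, NamedTuple


class Data(NamedTuple):
    # Stand-in with the same fields as the Data events printed earlier.
    data: bytes
    chunk_start: bool
    chunk_end: bool


def iter_whole_chunks(events: Iterable[Data]) -> Iterator[bytes]:
    """Yield one bytes object per chunk, relying only on chunk_end."""
    pending = bytearray()
    for event in events:
        pending += event.data
        if event.chunk_end:
            yield bytes(pending)
            pending.clear()
    # Flush whatever is left so that no bytes are silently dropped.
    if pending:
        yield bytes(pending)


# The events captured in the failing test case above.
captured = [
    Data(b"Hello, ", chunk_start=False, chunk_end=True),
    Data(b"wor", chunk_start=False, chunk_end=False),
    Data(b"ld!", chunk_start=False, chunk_end=True),
]
chunks = list(iter_whole_chunks(captured))
```

The trade-off is that each chunk is buffered in full before it is yielded, so memory use grows with the largest chunk the peer sends.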

tomchristie, Oct 25 '22 09:10