daphne icon indicating copy to clipboard operation
daphne copied to clipboard

HTTP 1.x chunked transfer encoding + content length header conflict

Open eelkevdbos opened this issue 3 years ago • 8 comments

My application serves (large) binary files. It serves those files with the following headers:

Content-Length: 163840
Content-Type: application/octet-stream
Transfer-Encoding: chunked

When the request is handled over HTTP 2.x, this does not result in any issues. However, upon fallback to HTTP 1.x, this results in an error. The issue can be replicated by executing curl -X GET http://host/file.db, forcing the fallback by using HTTP 1.x.

The following error is returned:

curl: (56) Illegal or missing hexadecimal sequence in chunked-encoding

Some web servers, like Uvicorn for example, solve this by ignoring the content-length header when any transfer-encoding is set. Would it be desirable to replicate this behavior in Daphne?


Daphne version: 3.0.2 Python 3.9.5

eelkevdbos avatar Jun 14 '21 01:06 eelkevdbos

Hi @eelkevdbos — thanks for the report, yes:

chunked Data is sent in a series of chunks. The Content-Length header is omitted in this case and at the beginning of each chunk you need to add the length of the current chunk in hexadecimal format, followed by '\r\n' and then the chunk itself, followed by another '\r\n'. The terminating chunk is a regular chunk, with the exception that its length is zero. It is followed by the trailer, which consists of a (possibly empty) sequence of header fields.

https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Transfer-Encoding#directives

carltongibson avatar Jun 14 '21 09:06 carltongibson

@carltongibson, I was thinking about writing a patch that checks for the above condition in http_protocol.py#L151 removing the content-length header if a chunked transfer encoding is found as well. Do you think the above mentioned location is suitable for this kind of operation?

eelkevdbos avatar Jun 14 '21 18:06 eelkevdbos

Hey @eelkevdbos — Grrr. 😬

Looking at the way gunicorn handles this if the client sets Content-Length then we shouldn't use chunked encoding at all. ...

guicorn takes it as its responsibility to set the Transfer-Encoding header.

🤔

carltongibson avatar Jun 16 '21 09:06 carltongibson

But yes 😀 that would be an appropriate place to add handling, once we're clear on the best way forward.

carltongibson avatar Jun 16 '21 09:06 carltongibson

RFC 9112, section 6.2 says this:

A sender MUST NOT send a Content-Length header field in any message that contains a Transfer-Encoding header field.

Therefore, Daphne is obligated to either remove the Content-Length header or buffer the message body and re-emit it de-chunked, with the Content-Length intact.

kenballus avatar Feb 04 '24 19:02 kenballus

@kenballus The latter option there is what gunicorn is doing right?

Would you fancy picking this one up?

carltongibson avatar Feb 05 '24 06:02 carltongibson

Hello @carltongibson @kenballus! I arrived to this issue while trying to triage Django ticket 35289, which is about "Chunked transfer encoding is not handled correctly by MultiPartParser". In that ticket, the reporter says that the obsoleted (RFC 2616 section 4.4) says the same thing (Content-Length MUST be ignored when Transfer-Encoding is sent). But then the latest RFC 9112 also referenced here says:

Early implementations of Transfer-Encoding would occasionally send both a chunked transfer coding for message framing and an estimated Content-Length header field for use by progress bars. This is why Transfer-Encoding is defined as overriding Content-Length, as opposed to them being mutually incompatible.

Would you understand how the paragraph above plays with what was said in the previous comment?

nessita avatar Mar 18 '24 12:03 nessita

@nessita Nice find thank you.

Without digging too much, I think we still need to adjust the behaviour here to mirror that of other servers (gunicorn/uvicorn/&co)

I think @eelkevdbos's:

I was thinking about writing a patch that checks for the above condition in http_protocol.py#L151 removing the content-length header if a chunked transfer encoding is found as well.

... is likely right. (Assuming the application already processed the response, which is why it set the header.)

carltongibson avatar Mar 18 '24 13:03 carltongibson