daphne
daphne copied to clipboard
HTTP 1.x chunked transfer encoding + content length header conflict
My application serves (large) binary files. It serves those files with the following headers:
Content-Length: 163840
Content-Type: application/octet-stream
Transfer-Encoding: chunked
When the request is handled over HTTP 2.x, this does not result in any issues. However, upon fallback to HTTP 1.x, this results in an error. The issue can be replicated by executing curl -X GET http://host/file.db
, forcing the fallback by using HTTP 1.x.
The following error is returned:
curl: (56) Illegal or missing hexadecimal sequence in chunked-encoding
Some web servers, like Uvicorn for example, solve this by ignoring the content-length header when any transfer-encoding is set. Would it be desirable to replicate this behavior in Daphne?
Daphne version: 3.0.2 Python 3.9.5
Hi @eelkevdbos — thanks for the report, yes:
chunked Data is sent in a series of chunks. The Content-Length header is omitted in this case and at the beginning of each chunk you need to add the length of the current chunk in hexadecimal format, followed by '\r\n' and then the chunk itself, followed by another '\r\n'. The terminating chunk is a regular chunk, with the exception that its length is zero. It is followed by the trailer, which consists of a (possibly empty) sequence of header fields.
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Transfer-Encoding#directives
@carltongibson, I was thinking about writing a patch that checks for the above condition in http_protocol.py#L151 removing the content-length header if a chunked transfer encoding is found as well. Do you think the above mentioned location is suitable for this kind of operation?
Hey @eelkevdbos — Grrr. 😬
Looking at the way gunicorn handles this if the client sets Content-Length
then we shouldn't use chunked encoding at all. ...
guicorn takes it as its responsibility to set the Transfer-Encoding
header.
🤔
But yes 😀 that would be an appropriate place to add handling, once we're clear on the best way forward.
RFC 9112, section 6.2 says this:
A sender MUST NOT send a Content-Length header field in any message that contains a Transfer-Encoding header field.
Therefore, Daphne is obligated to either remove the Content-Length
header or buffer the message body and re-emit it de-chunked, with the Content-Length
intact.
@kenballus The latter option there is what gunicorn is doing right?
Would you fancy picking this one up?
Hello @carltongibson @kenballus! I arrived to this issue while trying to triage Django ticket 35289, which is about "Chunked transfer encoding is not handled correctly by MultiPartParser". In that ticket, the reporter says that the obsoleted (RFC 2616 section 4.4) says the same thing (Content-Length
MUST be ignored when Transfer-Encoding
is sent). But then the latest RFC 9112 also referenced here says:
Early implementations of Transfer-Encoding would occasionally send both a chunked transfer coding for message framing and an estimated Content-Length header field for use by progress bars. This is why Transfer-Encoding is defined as overriding Content-Length, as opposed to them being mutually incompatible.
Would you understand how the paragraph above plays with what was said in the previous comment?
@nessita Nice find thank you.
Without digging too much, I think we still need to adjust the behaviour here to mirror that of other servers (gunicorn/uvicorn/&co)
I think @eelkevdbos's:
I was thinking about writing a patch that checks for the above condition in http_protocol.py#L151 removing the content-length header if a chunked transfer encoding is found as well.
... is likely right. (Assuming the application already processed the response, which is why it set the header.)