gunicorn
Discrepancies in parsing HTTP requests
Current Behavior
- Invalid HTTP versions are accepted: any version string matching the regex [\d.\d]+ passes, e.g. 0.2.1.0.0.3.0.0.91
- A plus sign before the Content-Length value is accepted: +33 is converted to 33 and the body is read as 33 bytes long
- The Transfer-Encoding header is accepted in HTTP/0.9 and HTTP/1.0 requests, although the RFC specs for those versions do not define it
- \x0b \x0c \x1c \x1d \x1e \x1f are accepted between the colon separator and the field value, but according to the RFC only optional whitespace (space or horizontal tab) is allowed there
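The sketch below shows one plausible way this kind of leniency arises in Python code. It is an assumption for illustration, not taken from gunicorn's parser: `int()` accepts a leading sign, and `str.strip()` treats the control characters listed above as whitespace. The helper names are hypothetical.

```python
# Illustration only: behavior of permissive Python built-ins.
# These helpers are hypothetical and are NOT gunicorn's code.

# The control characters reported in this issue.
ODD_WHITESPACE = "\x0b\x0c\x1c\x1d\x1e\x1f"


def lenient_content_length(raw: str) -> int:
    """Parse a Content-Length value with plain strip() + int()."""
    # int() accepts an optional leading sign, so "+33" becomes 33.
    return int(raw.strip())


def lenient_field_value(raw: str) -> str:
    """Trim a header field value with plain str.strip()."""
    # str.strip() removes every character for which str.isspace() is true,
    # which includes \x0b, \x0c and \x1c through \x1f.
    return raw.strip()


if __name__ == "__main__":
    print(lenient_content_length("+33"))
    print(lenient_content_length("\x0c3"))
    print(repr(lenient_field_value("\x1cchunked")))
```

A parser built on these defaults would treat `Transfer-Encoding:\x1cchunked` exactly like `Transfer-Encoding: chunked`, which matches the behavior described above.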
Expected Behavior
- According to the RFC section on protocol versioning, the ABNF grammar for the version is DIGIT "." DIGIT; request parsing should follow it
- According to the RFC section on Content-Length, the ABNF grammar for the value is 1*DIGIT
- For HTTP/0.9 and 1.0, chunked encoding should not be interpreted, as it is undefined in the RFC specs of those versions, but gunicorn uses it in preference to Content-Length
- According to the RFC section on header fields, the ABNF grammar is header-field = field-name ":" OWS field-value OWS, but if \x0b \x0c \x1c \x1d \x1e \x1f are given instead of whitespace, the header is still treated normally. So if a proxy ignores [\x1c]chunked and forwards the request downstream framed by Content-Length, while gunicorn incorrectly parses [\x1c]chunked as a chunked request, this may lead to HTTP request smuggling
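The grammar rules listed above can be sketched as strict validators. This is a minimal illustration under stated assumptions, not gunicorn's implementation or a proposed patch: the function names are hypothetical, field-name validation is omitted, and rejecting control characters in the field value is one possible policy.

```python
import re
from typing import Tuple

# DIGIT "." DIGIT after the "HTTP/" prefix: exactly one ASCII digit each side.
_VERSION_RE = re.compile(rb"HTTP/([0-9])\.([0-9])")
# Content-Length = 1*DIGIT: ASCII digits only, no sign, no whitespace.
_CONTENT_LENGTH_RE = re.compile(rb"[0-9]+")
# OWS = *( SP / HTAB )
_OWS = b" \t"
# Control characters other than HTAB are not valid inside a field value.
_CTL_RE = re.compile(rb"[\x00-\x08\x0a-\x1f\x7f]")


def parse_version(token: bytes) -> Tuple[int, int]:
    """Return (major, minor) or raise ValueError (hypothetical helper)."""
    match = _VERSION_RE.fullmatch(token)
    if match is None:
        raise ValueError("invalid HTTP version: %r" % token)
    return int(match.group(1)), int(match.group(2))


def parse_content_length(value: bytes) -> int:
    """Return the length or raise ValueError (hypothetical helper)."""
    if _CONTENT_LENGTH_RE.fullmatch(value) is None:
        raise ValueError("invalid Content-Length: %r" % value)
    return int(value.decode("ascii"))


def split_header(line: bytes) -> Tuple[bytes, bytes]:
    """Split 'name: value', trimming only SP/HTAB around the value."""
    name, sep, value = line.partition(b":")
    if not sep or not name:
        raise ValueError("malformed header line: %r" % line)
    value = value.strip(_OWS)
    if _CTL_RE.search(value):
        raise ValueError("control character in field value: %r" % value)
    return name, value
```

With these checks, `HTTP/0.0.0.0.3.0.0.91`, `Content-Length: +3` and `Transfer-Encoding:\x1cchunked` are all rejected instead of being normalized.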
Steps to Reproduce (for bugs)
- In HTTP-Version, any string matching the regex [\d.\d]+ is accepted
echo -ne "GET / HTTP/0.0.0.0.3.0.0.91\r\nContent-Length: 3\r\n\r\naaa" | nc localhost 8002
- Plus sign (+) accepted in the Content-Length field value
echo -ne "GET / HTTP/1.1\r\nContent-Length: +3\r\n\r\naaa" | nc localhost 8002
- For HTTP/0.9 and 1.0, chunked encoding should not be interpreted, but it is used in preference to Content-Length
echo -ne "GET / HTTP/0.9\r\nHost: localhost\r\nTransfer-Encoding: chunked\r\nContent-Length: 13\r\n\r\n2\r\naa\r\n0\r\n\r\nX" | nc localhost 8002
- \x0b \x0c \x1c \x1d \x1e \x1f are accepted between the colon separator and the field value
- In Content-Length
echo -ne "GET / HTTP/1.1\r\nContent-Length:\x0c3\r\n\r\naaa" | nc localhost 8002
- In Transfer-Encoding
echo -ne "GET / HTTP/1.1\r\nHost: localhost\r\nTransfer-Encoding:\x1cchunked\r\n\r\n3\r\naaa\r\n0\r\n\r\n" | nc localhost 8002
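The nc one-liners above can also be run from a single script. The following is a sketch that sends the same byte strings over a raw socket. The host and port come from the commands above; the `send_raw` helper, the payload labels and the two-second timeout are assumptions for illustration.

```python
import socket

# The exact request bytes from the reproduction steps; labels are illustrative.
PAYLOADS = {
    "multi-dot version": (
        b"GET / HTTP/0.0.0.0.3.0.0.91\r\nContent-Length: 3\r\n\r\naaa"
    ),
    "plus sign in Content-Length": (
        b"GET / HTTP/1.1\r\nContent-Length: +3\r\n\r\naaa"
    ),
    "chunked on HTTP/0.9": (
        b"GET / HTTP/0.9\r\nHost: localhost\r\nTransfer-Encoding: chunked\r\n"
        b"Content-Length: 13\r\n\r\n2\r\naa\r\n0\r\n\r\nX"
    ),
    "form feed before Content-Length value": (
        b"GET / HTTP/1.1\r\nContent-Length:\x0c3\r\n\r\naaa"
    ),
    "\\x1c before chunked": (
        b"GET / HTTP/1.1\r\nHost: localhost\r\nTransfer-Encoding:\x1cchunked\r\n"
        b"\r\n3\r\naaa\r\n0\r\n\r\n"
    ),
}


def send_raw(payload, host="localhost", port=8002, timeout=2.0):
    """Send raw bytes and return whatever the server answers (hypothetical helper)."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)
        try:
            while True:
                data = sock.recv(4096)
                if not data:
                    break
                chunks.append(data)
        except socket.timeout:
            # Server kept the connection open; return what arrived so far.
            pass
    return b"".join(chunks)


if __name__ == "__main__":
    for label, payload in PAYLOADS.items():
        try:
            print(label, "->", send_raw(payload))
        except OSError as exc:
            print(label, "-> connection failed:", exc)
```

A strictly conforming server would be expected to reject each of these requests, so any response that echoes a parsed body indicates the lenient behavior reported here.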
Environment
Gunicorn version : gunicorn 20.1.0
- Docker Environment used
FROM python
RUN pip install gunicorn
WORKDIR /app
COPY app.py .
CMD ["gunicorn", "-b", "0.0.0.0:80", "app:app"]
- Simple python server
def app(environ, start_response):
    body = environ["wsgi.input"].read()
    data = b"Body length: " + str(len(body)).encode() + b" Body: " + repr(body).encode()
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(data)))
    ])
    return iter([data])
I will check, but according to the code it shouldn't be; the order looks correct. As for "For 0.9 and 1.0 chunked encoding should not be interpreted, as it's undefined in the RFC spec of HTTP 0.9 and 1.0, but it's used over Content-Length in gunicorn": some clients were expecting that differently.