gunicorn Discrepancies in parsing HTTP requests

Open blessingcharles opened this issue 3 years ago • 1 comments

Current Behavior

Invalid HTTP versions are accepted: anything matching the regex [\d.\d]+ passes, e.g. 0.2.1.0.0.3.0.0.91
A plus sign before the Content-Length value is accepted: +33 is converted to 33 and the body is read with a length of 33
The Transfer-Encoding header is accepted in versions 0.9 and 1.0, where it is not defined according to the RFC spec
\x0b \x0c \x1c \x1d \x1e \x1f are accepted between the colon separator and the field-value, but according to the RFC only whitespace is allowed there.

Expected Behavior

According to RFC Protocol versioning, the ABNF grammar is DIGIT "." DIGIT; it should be followed when parsing requests
According to RFC Content Length, the ABNF grammar is 1*DIGIT
For 0.9 and 1.0, chunked encoding should not be interpreted, as it is undefined in the RFC spec of HTTP 0.9 and 1.0, but gunicorn uses it over Content-Length.
According to RFC Header fields, the ABNF grammar is header-field = field-name ":" OWS field-value OWS, but if \x0b \x0c \x1c \x1d \x1e \x1f are given instead of whitespace, the header is still treated normally. So if a proxy ignores [\x1c]chunked and forwards the request downstream based on Content-Length, while gunicorn incorrectly parses [\x1c]chunked as a chunked request, this may lead to HTTP request smuggling.
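As an illustration of the first two points, here is a minimal sketch of validators that follow the quoted ABNF literally. The function names are hypothetical and this is not gunicorn's code; it only shows how strict matching differs from Python's lenient int(), which accepts "+3".

```python
import re

# Strict patterns taken from the ABNF quoted above. [0-9] is used instead of
# \d because \d also matches non-ASCII digits in Python str patterns.
HTTP_VERSION_RE = re.compile(r"HTTP/([0-9])\.([0-9])")
CONTENT_LENGTH_RE = re.compile(r"[0-9]+")


def parse_http_version(token: str) -> tuple[int, int]:
    """Return (major, minor), or raise ValueError unless token is HTTP/DIGIT.DIGIT."""
    match = HTTP_VERSION_RE.fullmatch(token)
    if match is None:
        raise ValueError(f"invalid HTTP version: {token!r}")
    return int(match.group(1)), int(match.group(2))


def parse_content_length(value: str) -> int:
    """Return the length, or raise ValueError unless value is 1*DIGIT."""
    if CONTENT_LENGTH_RE.fullmatch(value) is None:
        raise ValueError(f"invalid Content-Length: {value!r}")
    return int(value)
```

With these, "HTTP/0.0.0.0.3.0.0.91" and "+3" both raise ValueError, whereas a bare int("+3") returns 3.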

Steps to Reproduce (for bugs)

In HTTP-Version, anything matching the regex [\d.\d]+ is accepted

echo -ne "GET / HTTP/0.0.0.0.3.0.0.91\r\nContent-Length: 3\r\n\r\naaa" | nc localhost 8002

+ sign accepted in the Content-Length field value

echo -ne "GET / HTTP/1.1\r\nContent-Length: +3\r\n\r\naaa" | nc localhost 8002

For 0.9 and 1.0, chunked encoding should not be interpreted, but it is used over Content-Length

echo -ne "GET / HTTP/0.9\r\nHost: localhost\r\nTransfer-Encoding: chunked\r\nContent-Length: 13\r\n\r\n2\r\naa\r\n0\r\n\r\nX" | nc localhost 8002
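One way to express the expected behavior for this case is to make the framing decision depend on the parsed version. The sketch below is a hypothetical helper, not gunicorn's implementation; it assumes the version is already a (major, minor) tuple and that header names have been lower-cased. Rejecting such a request outright would be another reasonable policy.

```python
def select_body_framing(version: tuple[int, int], headers: dict[str, str]) -> str:
    """Return "chunked", "content-length" or "none" for a parsed request.

    Assumption: `headers` maps lower-cased field names to raw field values.
    """
    transfer_encoding = headers.get("transfer-encoding", "").strip(" \t").lower()
    # Honor chunked framing only for HTTP/1.1 and later, where it is defined.
    if transfer_encoding == "chunked" and version >= (1, 1):
        return "chunked"
    if "content-length" in headers:
        return "content-length"
    return "none"


# Headers from the request above, sent as HTTP/0.9.
headers = {"host": "localhost", "transfer-encoding": "chunked", "content-length": "13"}
print(select_body_framing((0, 9), headers))  # content-length
print(select_body_framing((1, 1), headers))  # chunked
```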

\x0b \x0c \x1c \x1d \x1e \x1f are accepted between the colon separator and the field-value

In Content-Length

echo -ne "GET / HTTP/1.1\r\nContent-Length:\x0c3\r\n\r\naaa" | nc localhost 8002

In Transfer-Encoding

echo -ne "GET / HTTP/1.1\r\nHost: localhost\r\nTransfer-Encoding:\x1cchunked\r\n\r\n3\r\naaa\r\n0\r\n\r\n" | nc localhost 8002
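A plausible mechanism for this behavior, not verified against gunicorn's source, is that the header value is trimmed with Python's argument-less str.strip(), which removes every character that str.isspace() considers whitespace. The sketch below compares that with trimming only SP and HTAB, the two characters OWS allows; the function names are made up for the comparison.

```python
CONTROL_CHARS = "\x0b\x0c\x1c\x1d\x1e\x1f"


def lenient_value(raw: str) -> str:
    # str.strip() with no argument removes every character for which
    # str.isspace() is true, which includes the six characters above.
    return raw.strip()


def strict_value(raw: str) -> str:
    # OWS in the RFC grammar is only SP and HTAB.
    return raw.strip(" \t")


raw = "\x1cchunked"
print(repr(lenient_value(raw)))  # 'chunked'
print(repr(strict_value(raw)))   # '\x1cchunked'
```

A front end that behaves like strict_value sees an unknown transfer coding, while a back end that behaves like lenient_value sees chunked. That disagreement is what makes the request smuggling scenario possible.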

Environment

Gunicorn version : gunicorn 20.1.0

Docker Environment used

FROM python
RUN pip install gunicorn
WORKDIR /app
COPY app.py .
CMD ["gunicorn", "-b", "0.0.0.0:80", "app:app"]

Simple python server

def app(environ, start_response):
    body = environ["wsgi.input"].read()
    data = b"Body length: " + str(len(body)).encode() + b" Body: " + repr(body).encode()
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(data)))
    ])
    return iter([data])
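
For anyone without nc at hand, the same requests can be sent from Python. This is a rough sketch: the payloads are copied from the steps above, the host and port default to the localhost:8002 used in the nc commands, and send_raw and the dictionary keys are names invented here.

```python
import socket

# Raw payloads copied from the reproduction steps.
PAYLOADS = {
    "long-version": b"GET / HTTP/0.0.0.0.3.0.0.91\r\nContent-Length: 3\r\n\r\naaa",
    "plus-content-length": b"GET / HTTP/1.1\r\nContent-Length: +3\r\n\r\naaa",
    "chunked-on-0.9": (
        b"GET / HTTP/0.9\r\nHost: localhost\r\nTransfer-Encoding: chunked\r\n"
        b"Content-Length: 13\r\n\r\n2\r\naa\r\n0\r\n\r\nX"
    ),
    "ff-before-content-length": b"GET / HTTP/1.1\r\nContent-Length:\x0c3\r\n\r\naaa",
    "fs-before-chunked": (
        b"GET / HTTP/1.1\r\nHost: localhost\r\nTransfer-Encoding:\x1cchunked\r\n\r\n"
        b"3\r\naaa\r\n0\r\n\r\n"
    ),
}


def send_raw(payload: bytes, host: str = "localhost", port: int = 8002,
             timeout: float = 3.0) -> bytes:
    """Send the bytes unchanged and return whatever the server answers."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)
        try:
            while True:
                data = sock.recv(4096)
                if not data:
                    break
                chunks.append(data)
        except socket.timeout:
            pass  # server kept the connection open; return what arrived
    return b"".join(chunks)
```

Calling send_raw(PAYLOADS["plus-content-length"]) against the container should return the raw response, in which the app above echoes the body length it was given.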

May 30 '22 06:05 blessingcharles

I will check, but according to the code it shouldn't be; the order looks correct. As for "For 0.9 and 1.0, chunked encoding should not be interpreted, as it is undefined in the RFC spec of HTTP 0.9 and 1.0, but gunicorn uses it over Content-Length": some clients were expecting that differently.

Aug 06 '22 16:08 benoitc
