
RemoteProtocolError: multiple Transfer-Encoding headers

Open cancan101 opened this issue 4 years ago • 11 comments

Running:

httpx.get('https://adobe.wd5.myworkdayjobs.com/en-US/external_experienced/job/San-Jose/Director--Corporate-Strategy_R109552-2')

produces RemoteProtocolError: multiple Transfer-Encoding headers.

Using requests, works fine:

requests.get('https://adobe.wd5.myworkdayjobs.com/en-US/external_experienced/job/San-Jose/Director--Corporate-Strategy_R109552-2').headers
{'transfer-encoding': 'chunked, chunked', 'Date': 'Tue, 14 Sep 2021 17:10:47 GMT', 'Vary': 'origin,access-control-request-method,access-control-request-headers,accept-encoding', 'Server': 'Workday User Interface Service', 'Set-Cookie': 'wd-browser-id=042c7ebd-81b7-429a-965d-05552e9dd259; Path=/; Secure; HTTPOnly, PLAY_SESSION=4920b9e5c981a4b74626cbea94ca58a36187f668-adobe_pSessionId=b44gglc1l9ehb4jtu4lr4q7blu&instance=wd5prvps0004d; SameSite=Lax; Path=/; Secure; HTTPOnly, wday_vps_cookie=2938515978.61490.0000; path=/; Httponly; Secure; SameSite=none', 'Content-Type': 'text/html;charset=UTF-8', 'X-Frame-Options': 'DENY', 'X-WD-REQUEST-ID': 'VPS|aecdf787-0e93-4a4f-ba57-6e1020d1eb00', 'Content-Encoding': 'gzip', 'Content-Language': 'en-US', 'Strict-Transport-Security': 'max-age=31536000; includeSubDomains'}
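
The 'chunked, chunked' value above suggests the server sent the Transfer-Encoding header twice and the client joined the two values with a comma. A minimal sketch of that behaviour using only the standard library; the raw header block is a made-up stand-in modelled on this response, and transfer_encoding_values is a hypothetical helper, not part of httpx, requests, or h11:

```python
import io
from http.client import parse_headers

# Hypothetical raw header block, modelled on the response reported above.
RAW_HEADERS = (
    b"Transfer-Encoding: chunked\r\n"
    b"Transfer-Encoding: chunked\r\n"
    b"Content-Type: text/html;charset=UTF-8\r\n"
    b"\r\n"
)


def transfer_encoding_values(raw: bytes) -> list:
    """Return every Transfer-Encoding value found in a raw header block."""
    message = parse_headers(io.BytesIO(raw))
    # get_all() returns None when the header is absent, so fall back to [].
    return message.get_all("Transfer-Encoding") or []


values = transfer_encoding_values(RAW_HEADERS)
# Joining repeated values with ", " reproduces the single combined string.
joined = ", ".join(values)
print(values)
print(joined)
```

If the list holds more than one entry, the header was repeated on the wire, which is the condition h11 rejects.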

cancan101 avatar Sep 14 '21 17:09 cancan101

Thanks @cancan101 - looks like a case of h11 being (reasonably enough) stricter about erroring on a non-compliant response. However, really we would like to be able to be robust to this kind of thing.

lovelydinosaur avatar Dec 16 '21 12:12 lovelydinosaur

Multiple Transfer-Encoding headers are explicitly forbidden in specs. But browsers, and many libraries, accept them; so this is a case that we may have to deal with at some point.

To ease this restriction you can monkey-patch h11._headers.normalize_and_validate.

def _patch(headers, _parsed=False):
    ...

import httpx
# monkey-patch AsyncClient
httpx._transports.default.httpcore._async.http11.h11._headers.normalize_and_validate = _patch
# monkey-patch Client? (not tested)
httpx._transports.default.httpcore._sync.http11.h11._headers.normalize_and_validate = _patch

Here's mine. It only accepts multiple {'transfer-encoding': 'chunked'} headers.

import re

from h11._abnf import field_name, field_value
from h11._util import bytesify, LocalProtocolError, validate
from h11._headers import Headers

_content_length_re = re.compile(br"[0-9]+")
_field_name_re = re.compile(field_name.encode("ascii"))
_field_value_re = re.compile(field_value.encode("ascii"))


def _patch(headers, _parsed: bool=False):
    new_headers = []
    seen_content_length = None
    saw_transfer_encoding = False
    for name, value in headers:
        # For headers coming out of the parser, we can safely skip some steps,
        # because it always returns bytes and has already run these regexes
        # over the data:
        if not _parsed:
            name = bytesify(name)
            value = bytesify(value)
            validate(_field_name_re, name, "Illegal header name {!r}", name)
            validate(_field_value_re, value, "Illegal header value {!r}", value)
        assert isinstance(name, bytes)
        assert isinstance(value, bytes)

        raw_name = name
        name = name.lower()
        if name == b"content-length":
            lengths = {length.strip() for length in value.split(b",")}
            if len(lengths) != 1:
                raise LocalProtocolError("conflicting Content-Length headers")
            value = lengths.pop()
            validate(_content_length_re, value, "bad Content-Length")
            if seen_content_length is None:
                seen_content_length = value
                new_headers.append((raw_name, name, value))
            elif seen_content_length != value:
                raise LocalProtocolError("conflicting Content-Length headers")
        elif name == b"transfer-encoding":
            # "A server that receives a request message with a transfer coding
            # it does not understand SHOULD respond with 501 (Not
            # Implemented)."
            # https://tools.ietf.org/html/rfc7230#section-3.3.1
            # "All transfer-coding names are case-insensitive"
            # -- https://tools.ietf.org/html/rfc7230#section-4
            # Lowercase first so a repeated header is compared case-insensitively.
            value = value.lower()
            if saw_transfer_encoding:
                # Tolerate a repeated header only if it matches the first one.
                if saw_transfer_encoding == value:
                    continue
                raise LocalProtocolError(
                    "multiple Transfer-Encoding headers", error_status_hint=501
                )
            if value != b"chunked":
                raise LocalProtocolError(
                    "Only Transfer-Encoding: chunked is supported",
                    error_status_hint=501,
                )
            saw_transfer_encoding = value
            new_headers.append((raw_name, name, value))
        else:
            new_headers.append((raw_name, name, value))
    return Headers(new_headers)
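
The rule the patch is aiming for (keep the first Transfer-Encoding: chunked, drop identical repeats, reject anything else) can be tried in isolation without h11. This is only a sketch: collapse_transfer_encoding and UnsupportedTransferEncoding are hypothetical names, and it assumes the comparison is meant to be case-insensitive.

```python
from typing import List, Tuple

Header = Tuple[bytes, bytes]


class UnsupportedTransferEncoding(ValueError):
    """Hypothetical error for this sketch; h11 raises LocalProtocolError."""


def collapse_transfer_encoding(headers: List[Header]) -> List[Header]:
    """Keep one Transfer-Encoding: chunked header and drop identical repeats."""
    result: List[Header] = []
    seen_chunked = False
    for name, value in headers:
        if name.lower() != b"transfer-encoding":
            result.append((name, value))
            continue
        # Transfer-coding names are case-insensitive, so normalise first.
        coding = value.strip().lower()
        if coding != b"chunked":
            raise UnsupportedTransferEncoding(
                "Only Transfer-Encoding: chunked is supported"
            )
        if not seen_chunked:
            seen_chunked = True
            result.append((name, coding))
        # A later identical header is silently dropped.
    return result


example = [
    (b"Transfer-Encoding", b"chunked"),
    (b"Content-Type", b"text/html"),
    (b"transfer-encoding", b"Chunked"),
]
print(collapse_transfer_encoding(example))
```

Anything other than chunked still raises, so the sketch stays as strict as h11 for every case except the exact duplicate.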

nuno-andre avatar Jan 26 '22 21:01 nuno-andre

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Feb 26 '22 00:02 stale[bot]

Boop

lovelydinosaur avatar Mar 01 '22 09:03 lovelydinosaur

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Sep 09 '22 01:09 stale[bot]

any update on a fix?

logan-vitelity avatar Feb 24 '23 19:02 logan-vitelity

@logan-vitelity This is actually an h11 issue. It seems that they are open to loosening the restriction, but for the time being there are no changes.

At the moment the only workaround is monkey-patching.

nuno-andre avatar Feb 24 '23 19:02 nuno-andre

Just a suggestion for anyone who faces the same problem:

  1. install h11
  2. manually change the code of h11; there are two snippets to change:
## _headers.py line 89
            if saw_transfer_encoding:
                pass
                # raise LocalProtocolError("multiple Transfer-Encoding headers",
                #                         error_status_hint=501)
## _connection.py line 88
    if transfer_encodings:
        # The equality check was redundant: it already implies membership.
        assert b"chunked" in transfer_encodings
        return ("chunked", ())

Fnck avatar Jun 08 '23 07:06 Fnck