h2
h2 copied to clipboard
hyper should validate header fields on a header block with the CONNECT method
In HTTP/2, the CONNECT method is used to establish a tunnel over a single HTTP/2 stream to a remote host for similar purposes. The HTTP header field mapping works as defined in Section 8.1.2.3 ("Request Pseudo-Header Fields"), with a few differences. Specifically:
- The ":method" pseudo-header field is set to "CONNECT".
- The ":scheme" and ":path" pseudo-header fields MUST be omitted.
- The ":authority" pseudo-header field contains the host and port to connect to (equivalent to the authority-form of the request-target of CONNECT requests (see [RFC7230], Section 5.3)).
A CONNECT request that does not conform to these restrictions is malformed (Section 8.1.2.6).
Currently hyper doesn’t do anything to check this. We should add a check in utilities.py (probably somewhere in _reject_pseudo_header_fields()) that we conform to these restrictions.
Agreed, though I suspect that _reject_pseudo_header_fields is getting sufficiently complex that we may want to hoist this specific check out into a separate function.
Though that function may be called by _reject_pseudo_header_fields.
I favour putting it somewhere in _reject_pseudo_header_fields because it already has a list of all the pseudo-header fields in the header block, so restriction 2 falls out easily.
You then check if you see the :method header on the way through with the value CONNECT, and do a lookup in the set of pseudo-headers at the end.
We already have an extract_method_header function, but that requires another iteration over the header blocks, which I’m keen to avoid.
So I'm ok with taking a look at doing that, but it should be noted that we need to aggressively avoid complexity in these validation functions. We also do not do an extra loop over the headers: the reason the validation functions are written as generators is to ensure that we loop exactly once over the headers. Essentially, the validation pipeline could be recomposed as a series of function calls, once per header, along with some persisted state. That means that the real cost of doing this elsewhere is that we duplicate the checks for specific pseudo-headers.
That's not ideal, which is why I'm open to seeing the patch implemented in _reject_pseudo_header_fields, but the performance hit of doing it elsewhere is not so bad if we have to do it.
I had a quick stab at doing that in the awlc/connect-header branch, defining it as an extra validation function and ignoring _reject_pseudo_header_fields entirely. I tried it with the extra function, but you end up making the original more complicated and passing around lots of state.
Feel free to try doing it inline. =)
@Lukasa : I wanted to work on this item. Few questions on the original bug:
- My understanding from the section quoted is that ":authority" pseudo-header is mandatory in CONNECT method. Am I right?
- Is there a method in the library that validates "authority-form" in the value of ":authority" header? Also, do we even need to do this check?
@Lukasa Is there a chance to have the work by @optimusprime01 reviewed? I encountered this issue and I'd like to know it this could be fixed in h2 upstream or we need to maintain a patched version ourselves.