tornado
tornado copied to clipboard
http headers are capitalized
Is there any probability do not capitalize the customized http header. for example: {'authtoken': '1234'} will be capitalized through tornado.httputil.HttpHeaders {'Authtoken': '1234'}. This can bring troubles for some servers.
According to this RFC, header field names are case-insensitive. So, authtoken
and Authtoken
are same and should be treated as such by the client and the server.
But what "troubles for some servers" are you talking about?
@bhch The server I visited dose not treat the header Authtoken
and authtoken
as same way. So I mean I need to keep the original header.
The server I visited dose not treat the header
Authtoken
andauthtoken
as same
That's the fault in their implementation.
But I understand that you need to keep it lower case. You can monkey patch httputil._NormalizedHeaderCache.__missing__
method which is responsible for capitalizing the header names. See source code.
To monkey patch, put this code on top of your main app:
from tornado import httputil
def new_missing(self, key):
if key == 'authtoken':
normalized = 'authtoken'
else:
normalized = "-".join([w.capitalize() for w in key.split("-")])
self[key] = normalized
self.queue.append(key)
if len(self.queue) > self.size:
old_key = self.queue.popleft()
del self[old_key]
return normalized
httputil._NormalizedHeaderCache.__missing__ = new_missing
Hope this helps you. Ultimately, it's up to Ben Darnell (@bdarnell) to decide if Tornado should have this feature built-in or not.
@bhch Thank you for your reply. I have the same idea with you and I have done with overwriting tornado.httputil._normalized_headers
and monkey patch like this.
with mock.patch('tornado.httputil._normalized_headers', _normalized_headers):
the specifc function
In HTTP/1, the spec is clear that headers are case-insensitive, and mixed-case headers are the norm. We can't change this to satisfy applications that mistakenly treat headers as case-sensitive (I've seen other servers that only accept Mixed-Case, so we can't satisfy everyone at once).
HTTP/2 requires all headers to be sent as lowercase, so there is some merit to the idea of switching to all-lowercase even in HTTP/1. Doing so would simplify things a bit, although there is some potential for surprises. I doubt it's worth the disruption that it would cause to make the change now, but if I were starting from scratch I'd go with all-lowercase for headers.
I'm working with an HTTP/2 server now and will need to do some work on the HTTPHeaders class in order for it to work.
So, any preference on how this should be implemented?
I'm thinking on having headers set to lowercase by default but have the option to specify a custom method so users can pass a lambda to handle their own exceptions? global settings? extra and optional param on the class init?
I'm still not sure whether a change like this would be worth the disruption, but I'm sure it's not worth an option. Either leave it as it is and lowercase things when you're passing things from HTTPHeaders to an HTTP/2 implementation (that's what I did for github.com/bdarnell/tornado_http2) or it should be changed to be unconditionally lowercase.
Let me get this straight. You know that your HTTP clients is unusable with HTTP/2 servers and you choose to blame the users rather than accept a configuration flag to control this behavior? How is this acceptable in any way? I don't understand why any reasonable client would even think it should be modifying user input in an uncontrollable way, that seems like a recipe for horrible bugs and usability issues all around.
There's no interoperability issue, and the just-released new HTTP RFCs help explain why. The HTTPHeaders
class is a data structure representing "Fields" as defined in section 5 of RFC 9110, where they are defined as case-insensitive. The HTTP/1.1 serialization defined in RFC 9112 serialization reiterates that fields are case-insensitive. The lowercase requirement comes from HTTP/2 in RFC 9113: "Field names MUST be converted to lowercase when constructing an HTTP/2 message". Lowercasing the field names is the responsibility of the HTTP/2 implementation, just like encoding them with the HPACK compression algorithm (in my implementation of HTTP/2, lowercasing is done as a part of the HPACK code). HTTP/3 is the same as HTTP/2 here.
Now since HTTP/2 requires lowercase and everything else is case-insensitive, it's pretty clear that the right choice in a blank slate implementation is to lowercase everything. But the mixed-case format has been conventional for a long time. The official registry of HTTP field names and many of the examples in RFC 9110 use mixed-case for everything. Changing the internal representation of HTTPHeaders
to lowercase instead of mixed-case might break applications that were (incorrectly) relying on the earlier implementation details. Since the benefit of such a change would be small (it would make life slightly easier for HTTP/2 implementors), I don't think it's worth the churn, nor is it worth the complexity that an option would bring.
You can quite standards all day, it doesn't change the fact that your client can't communicate with some servers, and you're offering no good solution for your users. The reality is that AWS API gateways using the x-api-key
option reject all requests from your client right now. Now I'm happy to agree that Amazon is wrong here, but frankly I can't do anything about that and right now my only options are a monkey patch or using any number of other libraries that don't unconditionally alter the request content before they send it to the server.
Alright, let's start over. Tornado does not include any HTTP/2 support, and I think the talk of HTTP/2 is a red herring. You've got an HTTP/1.1 server that is handling headers in a case-sensitive way, which is rare but a known potential interoperability issue that has come up a handful of times over the last decade. (I was initially skeptical that amazon would make this mistake, but it's right there in the docs).
So what can we do about it? Simply changing the normalization function to lowercase instead of normalized mixed case is a trivial change, but it has difficult-to-evaluate backwards-compatibility implications (as i noted in a previous comment, there are also servers that only accept mixed-case headers, so we'd just trade one incompatibility for another). I have no interest in putting the tornado community through that transition for compatibility with non-compliant web servers unless such servers were much more common than they seem to be (I might make this change for HTTP/2 compatibility when/if we merge HTTP/2 support into mainline). It sounds straightforward to say "just don't modify the request", but the internals of Tornado's HTTP client were built with the assumptions of this implementation; making it case-preserving (without performance penalties) would, I think, be non-trivial. I'd accept a PR for that, but someone would have to do the work. (and then there'd need to be a new release of tornado, which are few and far between these days)
So there's no quick fix here except modifying Tornado (via private fork or monkey patch) or using a different library (and as far as I'm concerned there's nothing wrong with any of those options). Longer term I think it'd be nice if Tornado's HTTP/1.1 client were case-preserving here, but as you can tell from the fact that this issue is 5 years old it's not a priority from my perspective so someone's going to need to scratch their own itch to make this happen.