Valid response header names and values
Currently, no validation is required when setting an HTTP response header name or value. Should the specification require that invalid names and values are rejected? Should the specification provide a mechanism for escaping header names and values? What about values that cannot be escaped, such as UTF-8 values?
- Issue Imported From: https://github.com/javaee/servlet-spec/issues/26
- Original Issue Raised By: @glassfishrobot
- Original Issue Assigned To: @shingwaichan
@glassfishrobot Commented Reported by markt_asf
@glassfishrobot Commented @shingwaichan said: Adding it to the bucket of FUTURE_RELEASE
@glassfishrobot Commented markt_asf said: My own view is that invalid header names and/or values should be rejected with an IllegalArgumentException.
@glassfishrobot Commented gregwilkins said: I'm fine with ISE being thrown, but I think the spec cannot define what is or is not a valid header name. That will be determined by the underlying transport and level of RFC implementation. So I think these methods MAY throw rather than MUST throw.
@glassfishrobot Commented @shingwaichan said: I agree that the Servlet spec cannot define what valid header names are, as they are defined by RFCs, etc. And there may be new RFCs in the future. So wording along the lines of "the method may throw IllegalArgumentException" seems to be better.
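To make the "may throw IllegalArgumentException" option concrete, here is a minimal sketch of the kind of check a container could choose to apply. It is not proposed spec text: the class `HeaderValidator` and its methods are hypothetical names, and the rules are an assumption based on the RFC 7230 `token` grammar for names and a simplified reading of `field-content` for values (HTAB, SP, visible ASCII and obs-text allowed; CR, LF, NUL and other control characters rejected). A container on a different transport or RFC level might reasonably apply different rules.

```java
// Hypothetical helper, not part of the Servlet API.
final class HeaderValidator {
    private HeaderValidator() {}

    // Non-alphanumeric characters allowed in an RFC 7230 token (tchar).
    private static final String TCHAR_SYMBOLS = "!#$%&'*+-.^_`|~";

    /** Returns true if the name is a non-empty RFC 7230 token. */
    static boolean isValidName(String name) {
        if (name == null || name.isEmpty()) {
            return false;
        }
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            boolean alnum = (c >= 'a' && c <= 'z')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9');
            if (!alnum && TCHAR_SYMBOLS.indexOf(c) < 0) {
                return false;
            }
        }
        return true;
    }

    /**
     * Simplified value check (assumption): allows HTAB, SP and visible ASCII
     * (0x20-0x7E), and obs-text (0x80-0xFF). Rejects CR, LF, NUL, DEL, other
     * control characters, and anything above 0xFF.
     */
    static boolean isValidValue(String value) {
        if (value == null) {
            return false;
        }
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            boolean ok = c == '\t'
                    || (c >= 0x20 && c <= 0x7E)
                    || (c >= 0x80 && c <= 0xFF);
            if (!ok) {
                return false;
            }
        }
        return true;
    }

    /** What a setHeader/addHeader implementation MAY do before writing. */
    static void check(String name, String value) {
        if (!isValidName(name)) {
            throw new IllegalArgumentException("Invalid header name");
        }
        if (!isValidValue(value)) {
            throw new IllegalArgumentException("Invalid header value");
        }
    }
}
```

With these rules, `HeaderValidator.check("Content-Type", "text/html")` passes, while a value containing CR LF, the usual response-splitting vector, is rejected with an IllegalArgumentException.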
@glassfishrobot Commented This issue was imported from java.net JIRA SERVLET_SPEC-26
Thinking specifically about HTTP header values:
From the HTTP spec:
Historically, HTTP has allowed field content with text in the ISO-8859-1 charset [ISO-8859-1], supporting other charsets only through use of [RFC2047] encoding. In practice, most HTTP header field values use only a subset of the US-ASCII charset [USASCII]. Newly defined header fields SHOULD limit their field values to US-ASCII octets. A recipient SHOULD treat other octets in field content (obs-text) as opaque data.
How do we expose that opaque data to applications? getHeaderAsBytes(String headerName)? Are different encodings used often enough to justify a getHeader(String headerName, Charset encoding) method?
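For discussion, a rough sketch of what those two hypothetical methods could do, written as a helper layered on the existing String-returning getHeader. It assumes the container decoded the raw header octets as ISO-8859-1, which maps each octet to exactly one char; that is common behaviour but not something the spec guarantees. `OpaqueHeaders`, `asBytes` and `decode` are illustrative names only.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the Servlet API.
final class OpaqueHeaders {
    private OpaqueHeaders() {}

    /**
     * Recovers the original octets of a header value, assuming the container
     * decoded them as ISO-8859-1 (one octet per char). Chars above 0xFF
     * cannot have come from such a decode and are replaced with '?'.
     */
    static byte[] asBytes(String rawValue) {
        return rawValue.getBytes(StandardCharsets.ISO_8859_1);
    }

    /** Re-interprets the recovered octets in the charset the application expects. */
    static String decode(String rawValue, Charset charset) {
        return new String(asBytes(rawValue), charset);
    }
}
```

For example, if a client sent the UTF-8 octets for "café" and the container decoded them as ISO-8859-1, `OpaqueHeaders.decode(request.getHeader("X-Name"), StandardCharsets.UTF_8)` would recover the intended text (`X-Name` being a made-up header for illustration).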
I'm very cautious about going outside of the recommendation to limit header fields to US-ASCII. Allowing arbitrary bytes via arbitrary charset encodings is going to challenge containers to keep responses legal and avoid smuggling attacks. We don't want servlet engines to become attack vectors against poorly written clients.
I have seen zero demand for arbitrary opaque bytes, only the occasional usage of ISO-8859-1 or a question about UTF-8.