
org.eclipse.jetty.http converts incoming content type "application/json; charset=utf-8" to uppercase charset=UTF-8

Open gjoshi86 opened this issue 1 year ago • 8 comments

Jetty 9.4.50.v20221201

OpenJDK 8u292 (1.8.0_292-b10)

When a client sends a POST request with Content-Type "application/json; charset=utf-8", the request reaches our application, which uses Jetty 9.4.50.v20221201, and the header value is converted to "application/json; charset=UTF-8" with an uppercase charset.

I debugged the jetty-http project and found that the org.eclipse.jetty.http.HttpParser class has a CACHE field. While parsing Content-Type, it uses the getBest() method to find the best match, which returns charset=UTF-8 in uppercase.

I know I am using an older version of Jetty that has reached end of support. I just need your input on the following questions.

  1. Why does it return uppercase UTF-8 even though the client sent lowercase utf-8?
  2. What are the implications of setting org.eclipse.jetty.http.HttpParser.STRICT to true, which selects compliance mode LEGACY?
  3. Is there any other way to get the charset in the same case the client sent it in the request?

gjoshi86 avatar Sep 12 '24 22:09 gjoshi86

Hi, just to let you know, Jetty 9.x is EOL; see https://github.com/jetty/jetty.project/issues/7958. Jetty 10/11 is EOL as well; see https://github.com/jetty/jetty.project/issues/10485.

Can you try to reproduce your issue with Jetty 12?

For commercial support of Jetty, see above listed issues.

olamy avatar Sep 12 '24 22:09 olamy

I think you have answered your own question. It is a case-insensitive cache of common header values. There are compliance modes that you can use to bypass the cache and keep the case. But you should not need to, as charsets should be case insensitive.
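To illustrate the point about charset names being case insensitive, here is a minimal sketch using only the JDK. The `charsetParam` helper is hypothetical and simplified (it does not handle quoted parameter values); it is not part of Jetty. It shows that both spellings of the header resolve to the same `Charset`, so application code that resolves the charset rather than comparing strings should not be affected by the case change.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetCaseDemo {
    // Hypothetical helper: returns the charset parameter of a Content-Type
    // value, or null if there is none. Quoted values are not handled.
    static String charsetParam(String contentType) {
        for (String part : contentType.split(";")) {
            String p = part.trim();
            // Case-insensitive match on the parameter name "charset=".
            if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                return p.substring(8).trim();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Charset.forName treats charset names as case insensitive.
        Charset fromClient = Charset.forName(charsetParam("application/json; charset=utf-8"));
        Charset fromParser = Charset.forName(charsetParam("application/json; charset=UTF-8"));

        System.out.println(fromClient.equals(fromParser));
        System.out.println(fromClient.equals(StandardCharsets.UTF_8));
    }
}
```

Comparing the resolved `Charset` objects (or using `equalsIgnoreCase` on the name) avoids depending on the exact case the server hands back.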

Note there are fine-grained compliance mode controls, so you don't need to go all the way to full LEGACY mode.

That's about all we can say for an end-of-life release.

gregw avatar Sep 13 '24 06:09 gregw

Also note, that the mime-type application/json has no charset, and using a charset on it has no meaning. It is always UTF-8, 100% of the time, in all cases.

joakime avatar Sep 13 '24 16:09 joakime

@gregw @joakime Thank you for your response! This is helpful.

I have a couple of questions before I close this ticket.

  1. I just need confirmation that the CACHE implementation in org.eclipse.jetty.http.HttpParser is a performance optimization. Is that right?
  2. Regarding "Note there are fine grained compliance mode controls, so you don't need to go all the way to fill Legacy mode": I tried different compliance modes such as RFC7230 and RFC2616, but the original case is preserved only in LEGACY compliance mode. I think the property (org.eclipse.jetty.http.HttpParser.STRICT = true) takes effect only in LEGACY compliance mode.
  3. Is it possible to set LEGACY mode only for a specific header such as Content-Type?

gjoshi86 avatar Sep 20 '24 15:09 gjoshi86

@gjoshi86 The cache is indeed an optimization to avoid many copies of the same string being created and also to allow fast lookup of the actual semantics.
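As a rough illustration of why such a cache changes the case of a value, here is a minimal sketch. It is not Jetty's implementation: the class, the `lookup` method, and the single cached entry are all hypothetical, and a `TreeMap` with a case-insensitive comparator stands in for whatever structure the parser actually uses. The idea is that a case-insensitive lookup returns the one canonical string that was stored, whatever case the client sent.

```java
import java.util.Map;
import java.util.TreeMap;

public class HeaderValueCacheSketch {
    // Hypothetical stand-in for a pre-populated cache of common header values.
    // Keys compare case-insensitively; the stored value is the canonical string.
    private static final Map<String, String> CACHE =
            new TreeMap<>(String.CASE_INSENSITIVE_ORDER);

    static {
        // Assumed entry, chosen to mirror the behaviour reported in this issue.
        CACHE.put("application/json; charset=UTF-8", "application/json; charset=UTF-8");
    }

    // Returns the shared cached instance on a hit, so no new string is kept
    // per request. When caseInsensitive is false, a hit that differs in case
    // is ignored and the received value is returned unchanged.
    static String lookup(String received, boolean caseInsensitive) {
        String cached = CACHE.get(received);
        if (cached == null) {
            return received;
        }
        if (!caseInsensitive && !cached.equals(received)) {
            return received;
        }
        return cached;
    }

    public static void main(String[] args) {
        String sent = "application/json; charset=utf-8";
        System.out.println(lookup(sent, true));   // canonical cached form
        System.out.println(lookup(sent, false));  // exactly what was sent
    }
}
```

The `caseInsensitive` flag here plays the role that a compliance setting would play in a real parser: turning it off trades the shared instance for fidelity to the bytes on the wire.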

For fine-grained compliance in jetty-9, you will need to use one of the CUSTOM modes configured with a system property. See the HttpCompliance class for more detail.
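A sketch of what that system property might look like follows. Everything Jetty-specific in it is an assumption recalled from the Jetty 9.4 HttpCompliance Javadoc and should be verified there before use: the property name `org.eclipse.jetty.http.HttpCompliance.CUSTOM0`, and the value syntax of a base mode followed by comma-separated section names, each prefixed with `-` to exclude it. The `specExcluding` helper is hypothetical. The code only builds and sets the property; it does not touch Jetty classes, and the connector would still need to be configured to use the CUSTOM0 mode.

```java
public class CustomComplianceSpec {
    // Assumed property name (from the Jetty 9.4 HttpCompliance Javadoc); verify before use.
    static final String CUSTOM0_PROPERTY = "org.eclipse.jetty.http.HttpCompliance.CUSTOM0";

    // Hypothetical helper: builds "<base>,-<SECTION>,..." meaning
    // "start from the base mode and exclude the listed sections".
    static String specExcluding(String base, String... sections) {
        StringBuilder spec = new StringBuilder(base);
        for (String section : sections) {
            spec.append(",-").append(section);
        }
        return spec.toString();
    }

    public static void main(String[] args) {
        // Section name taken from the question in this thread.
        String spec = specExcluding("RFC7230", "CASE_INSENSITIVE_FIELD_VALUE_CACHE");

        // Assumption: the property must be set before Jetty's HttpCompliance
        // class is first loaded, so passing it as a -D JVM argument is the
        // safer route than setting it in code.
        System.setProperty(CUSTOM0_PROPERTY, spec);
        System.out.println("-D" + CUSTOM0_PROPERTY + "=" + spec);
    }
}
```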

gregw avatar Sep 23 '24 02:09 gregw

@gregw We have a similar situation where we need LEGACY behavior only for HttpComplianceSection.CASE_INSENSITIVE_FIELD_VALUE_CACHE. How can we use a CUSTOM mode to support this? If possible, please share an example.

Also, for my understanding, could you please share the reason for storing content types in the cache in upper case instead of lower case? I have observed that most older APIs use lower case for content types, so I am looking for the reason, if any.

seemasjoshi avatar Oct 08 '24 19:10 seemasjoshi

@seemasjoshi See https://jetty.org/docs/jetty/12/programming-guide/server/compliance.html for an example.

See https://javadoc.jetty.org/jetty-12/org/eclipse/jetty/http/UriCompliance.html#from(java.lang.String) for details about the String syntax.

joakime avatar Oct 09 '24 20:10 joakime

Thank you! I will try these examples.

It would be helpful if you could also share the reasoning behind the design choice of storing upper-case values in the cache instead of lower-case ones. This will help us communicate the change to our customers and make sure we align with best practices.

seemasjoshi avatar Oct 16 '24 16:10 seemasjoshi

Closing as answered.

joakime avatar Feb 20 '25 17:02 joakime