
AkkaHttp backend modifies content-type header "application/json" with encoding

Open slavaschmidt opened this issue 2 years ago • 5 comments

It looks like the AkkaHttp backend removes the charset part of the content-type header if the content type is "application/json".

Please compare:

basicRequest.contentType(MediaType.ApplicationJson.charset("UTF-8"))

produces

User-Agent: akka-http/10.2.9
Content-Type: application/json

A custom content type, basicRequest.contentType("application/test", "utf-8"), produces the expected result:

User-Agent: akka-http/10.2.9
Content-Type: application/test; charset=UTF-8

I can imagine that the charset is removed in the first case because UTF-8 is the default, but in some cases it is crucial to be as specific as possible, for example because of specific server-side requirements.

slavaschmidt avatar May 25 '22 13:05 slavaschmidt

Hi, from what I see this seems to be an akka-http issue in general, as it treats application/json as a MediaType with a fixed charset and drops any charset provided by the user. Here you can see it clearly: [image]. It is a snippet from akka.http.scaladsl.model.ContentType, and we are using this method in sttp.

I am not sure it is even a bug, since at https://www.iana.org/assignments/media-types/application/json you can read that no charset parameter is defined for application/json at all.
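
A minimal, self-contained sketch of the behaviour described above. The names `FixedCharset`, `OpenCharset` and `ContentTypeModel.render` are hypothetical stand-ins and not akka-http's actual classes; they only model the assumption that a media type with a fixed charset ignores whatever charset the user asks for when the header is rendered.

```scala
// Simplified model, NOT akka-http's real classes: all names here are hypothetical.
sealed trait MediaTypeModel { def value: String }

// A media type whose charset is fixed by its definition (like application/json).
final case class FixedCharset(value: String, charset: String) extends MediaTypeModel

// A media type for which the charset is a free parameter (like application/test).
final case class OpenCharset(value: String) extends MediaTypeModel

object ContentTypeModel {
  // Renders the Content-Type header value for a media type and a user-requested charset.
  def render(mt: MediaTypeModel, requested: Option[String]): String = mt match {
    // The charset is implied by the media type, so the requested one is dropped.
    case FixedCharset(v, _) => v
    // The charset is a real parameter, so it is appended when present.
    case OpenCharset(v) => requested.fold(v)(cs => s"$v; charset=$cs")
  }
}
```

Under this model, rendering `FixedCharset("application/json", "UTF-8")` with a requested charset yields a header without the charset, while `OpenCharset("application/test")` keeps it, which matches the two outputs in the original report.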

Pask423 avatar May 26 '22 18:05 Pask423

Hi!

Yes, you're absolutely right, this is something where akka-http "knows better" and yes, I do agree that the standard you mentioned does define "application/json" without an encoding parameter.

It would be a nice world if standards were obeyed :D

There are other RFCs and standards, more general ones, which state that the content-type header can contain an encoding part, such as https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Type and https://datatracker.ietf.org/doc/html/rfc7231#section-3.1.1.5. I do understand that they are less specific and that the one you mentioned should be preferred.

Unfortunately, some libraries and frameworks implement the less specific approach, and in this case it would be good to satisfy their requirements even when these are not 100% standard-conformant. In other words, the library should not "know better".

For example, I currently have to communicate with an API that supports a few content types, and a lot of its functionality is linked to the encoding parameter. For simplicity on their side, they require the encoding to always be defined. With the current implementation there is literally no way to do so, except through the customizeRequest defined for the whole akka-http backend.

With akka-http directly it is not straightforward, but it is possible, like this:

val contentType: ContentType.WithFixedCharset = MediaTypes.`application/json`.withParams(Map("charset" -> charset)).toContentType
request.withEntity(request.entity.withContentType(contentType))

So maybe it would make sense to be explicit and apply withParams in the sttp BodyToAkka#ctWithCharset when a charset is defined in the request?

slavaschmidt avatar May 27 '22 06:05 slavaschmidt

@slavaschmidt unfortunately, in the current design we are not able to reliably distinguish between an explicitly set content type and a default one coming from a body, as the encoding in StringBody is non-optional. Hence changing this would mean that all JSON requests would have the charset appended.

I agree this is a shortcoming of sttp, but I think it can only be fixed by breaking binary compatibility (so it will have to wait until sttp-client4).
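
To illustrate the limitation, here is a simplified sketch. `StringBodyModel`, `StringBodyWithOptionalEncoding` and `Limitation` are hypothetical names and not sttp's real types; the only assumption carried over from the discussion is that the encoding on a string body is mandatory, so a defaulted "utf-8" and an explicitly requested "utf-8" look the same to a backend.

```scala
// Simplified model, NOT sttp's real types: every name below is hypothetical.
// The encoding is mandatory, mirroring the non-optional encoding in StringBody.
final case class StringBodyModel(content: String, encoding: String = "utf-8")

// A variant where the user's choice stays visible: None means "not set explicitly".
final case class StringBodyWithOptionalEncoding(content: String, encoding: Option[String] = None)

object Limitation {
  // Body built with the default encoding.
  val defaulted: StringBodyModel = StringBodyModel("{}")
  // Body where the user explicitly asked for the same encoding.
  val explicit: StringBodyModel = StringBodyModel("{}", "utf-8")

  // With a mandatory encoding the two values are equal, so they cannot be told apart.
  val indistinguishable: Boolean = defaulted == explicit

  // With an optional encoding the difference survives.
  val distinguishable: Boolean =
    StringBodyWithOptionalEncoding("{}") != StringBodyWithOptionalEncoding("{}", Some("utf-8"))
}
```

Switching an existing field from a plain value to an `Option` changes the shape of the public type, which is the kind of change that would not be binary compatible.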

adamw avatar May 30 '22 07:05 adamw

Related discussion: https://github.com/akka/akka-http/issues/1482

akka-http follows the "be generous in what you consume but strict in what you produce" principle. Therefore it does not produce HTTP messages with standard-violating Content-Type headers. The discussion around application/json encodings has come up many times in the last 10 years (that's how old akka-http now is). I've come to regard application/json as a kind of binary Content-Type, which just happens to be human-readable. It's not really a text-based Content-Type because there is no flexibility with regard to the encoding. UTF-8 is simply set in stone.

sirthias avatar Jul 18 '22 11:07 sirthias

Hey @sirthias, thank you for jumping into the discussion. As I mentioned above, the akka-http approach is strict, and it is understandable that framework developers strive for purity with regard to the standard.

Still, there are a few things worth mentioning:

  1. The standard does not forbid defining the encoding for the application/json content type, it merely states that it is meaningless (the link you provided puts it nicely). This is important in my case because I have an intermediary that is blind with regard to the content type but requires the encoding to be explicitly defined. For sure, in the case of application/json it will always be UTF-8; the point is, I need to have this possibility, and in this case it is standard-compliant.
  2. akka-http still allows defining any encoding for application/json, it just needs to be done in a quirky way, and this "workaround" is (understandably) not something sttp accounts for.
  3. On a broader scope, it is sad if a library does not allow solving a problem because its developers decided that the problem is just not the right problem to solve (and of course library developers have the absolute right to decide whatever they want). I'm now using async-http-client instead of akka-http-client because my problem "is not right". Huge thanks to the flexibility of sttp; still, that is one user fewer for akka-http-client.

slavaschmidt avatar Jul 18 '22 13:07 slavaschmidt