Content-Type: application/json; charset=utf-8 can cause issues and may be incorrect

Open jbrechtel opened this issue 11 months ago • 2 comments

setRequestBodyJSON appends "; charset=utf-8" to the end of the Content-Type header, and this causes some libraries, like https://hc.apache.org/httpcomponents-core-4.4.x/current/httpcore/apidocs/org/apache/http/entity/ContentType.html, to fail to parse such HTTP requests.

It seems like the charset parameter is at best redundant (since JSON must be UTF-8) and at worst incorrect entirely according to the RFC - see the conversation here https://stackoverflow.com/questions/9254891/what-does-content-type-application-json-charset-utf-8-really-mean

I suggest removing it entirely from setRequestBodyJSON and setting Content-Type to just application/json. If that's OK, I'm happy to submit a PR.

jbrechtel, Feb 29 '24 14:02

"since JSON must be UTF-8"

That's not historically true, per that SO discussion.

I haven't seen anything indicating that ; charset=utf-8 is in violation of any spec here. I'd be worried about breaking someone else's workflow, though that seems unlikely. Nonetheless, changing the default behavior because one implementation is having a problem isn't something I'd be happy about without seeing some clear RFCs specifying that ; charset=utf-8 isn't allowed here.

snoyberg, Mar 04 '24 08:03

I read "Unicode" as "UTF-8" in this exchange and thought it was more clear-cut even historically -- oops.

Re: breaking others, that's fair. I'd still suggest that a parameter that is redundant at best, and problematic for a specific (and widely used) HTTP implementation, is best left off.

At any rate, we can (and did) work around it by replacing the Content-Type header with one lacking the problematic charset parameter.
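For anyone who lands here, a minimal sketch of that workaround, assuming the Network.HTTP.Simple API from http-conduit. The helper name setRequestBodyJSONNoCharset and the localhost URL are made up for illustration; they are not part of the library or of our actual code.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Data.Aeson (ToJSON, object, (.=))
import Network.HTTP.Simple
  ( Request
  , getRequestHeader
  , parseRequest_
  , setRequestBodyJSON
  , setRequestHeader
  , setRequestMethod
  )
import Network.HTTP.Types.Header (hContentType)

-- | Hypothetical helper (not part of http-conduit): set a JSON body as
-- setRequestBodyJSON does, then replace the Content-Type header with a
-- bare "application/json". setRequestHeader drops any existing values
-- for the header, so the charset parameter does not survive.
setRequestBodyJSONNoCharset :: ToJSON a => a -> Request -> Request
setRequestBodyJSONNoCharset body =
  setRequestHeader hContentType ["application/json"] . setRequestBodyJSON body

main :: IO ()
main = do
  -- Placeholder URL; no request is actually sent here.
  let req =
        setRequestBodyJSONNoCharset (object ["name" .= ("example" :: String)])
          $ setRequestMethod "POST"
          $ parseRequest_ "http://localhost:8080/items"
  -- Inspect the Content-Type values the request would carry.
  print (getRequestHeader hContentType req)
```

The override has to run after setRequestBodyJSON, since that function sets the header itself; composing in the other order would put the charset parameter back.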

Hopefully at least anyone else running into this problem can find this issue.

jbrechtel, Mar 07 '24 19:03