
Why is utf-8 a special case

Open kierans opened this issue 6 years ago • 2 comments

If the charset is "utf-8", then "utf8" is returned. If the charset is anything else (e.g. "utf-16"), it is returned with the dash intact.

Inconsistent formatting of results is not helpful, as it's hard to know what to test against. For example, I'm writing https://github.com/chaijs/chai-http/pull/253 and now have to "despecial" utf8.

In my opinion the charset should just be returned as-is.
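To make the problem concrete, here is a minimal sketch of the behaviour described above, plus the kind of caller-side workaround it forces. Both `describedCharset` and `sameCharset` are hypothetical names written for illustration; this is not the library's actual code, and the regular expression is only an assumption about how the charset might be pulled out of a Content-Type value.

```javascript
// Hypothetical stand-in for the behaviour reported in this issue.
// It is NOT the library's implementation, only a sketch of the observed results.
function describedCharset(contentType) {
  // Assumed parsing: grab the value of the charset parameter, if any.
  const match = /charset\s*=\s*"?([^\s;"]+)/i.exec(contentType);
  if (!match) return null;
  const name = match[1];
  // The special case: only "utf-8" loses its dash.
  return name.toLowerCase() === "utf-8" ? "utf8" : name;
}

// Hypothetical caller-side workaround: compare names with dashes removed,
// so "utf8" and "utf-8" are treated as the same charset.
function sameCharset(a, b) {
  const strip = (s) => s.toLowerCase().replace(/-/g, "");
  return strip(a) === strip(b);
}

console.log(describedCharset("text/html; charset=utf-8"));  // dash removed
console.log(describedCharset("text/html; charset=utf-16")); // dash kept
console.log(sameCharset("utf8", "UTF-8"));
```

The workaround works, but every caller that tests against the returned value has to know about it, which is the cost being complained about here.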

kierans avatar Jun 29 '19 03:06 kierans

I don't remember why I forced utf-8 to become utf8, but I think they are the same in Node.js. Maybe I was following the fs module, whose default encoding format is utf8. https://npm.taobao.org/mirrors/node/latest/docs/api/fs.html#fs_fs_readfile_path_options_callback

fengmk2 avatar Jun 29 '19 03:06 fengmk2

Thanks @fengmk2. It's annoying that Node specifies encodings differently from HTTP. However, in HTTP, charsets have dashes: https://en.wikipedia.org/wiki/List_of_HTTP_header_fields

So you either need to convert all UTF responses to the non-dashed form (i.e. "utf8", "utf16", etc.), or return the dashed version that is in the HTTP response and let the caller do the matching.

It's the inconsistency that's the real problem.
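For illustration, the two consistent options could look roughly like this. Both function names are hypothetical and neither is taken from the library; the sketch only shows that either rule gives the caller one predictable format to test against.

```javascript
// Option A (hypothetical): strip the dash from every UTF-* name, not just utf-8.
// Note that the replacement also lowercases the "utf" prefix.
function normalizeUtf(name) {
  return name.replace(/^utf-(\d+)/i, "utf$1");
}

// Option B (hypothetical): return exactly what the HTTP response said
// and leave any matching to the caller.
function passThrough(name) {
  return name;
}

for (const name of ["utf-8", "utf-16", "iso-8859-1"]) {
  console.log(name, "->", normalizeUtf(name), "|", passThrough(name));
}
```

With option A the caller always tests against the Node-style name for UTF charsets; with option B the caller always tests against the HTTP-style name.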

kierans avatar Jun 29 '19 03:06 kierans